CN117278428A - Metric set for software defined network architecture - Google Patents

Metric set for software defined network architecture

Info

Publication number
CN117278428A
Authority
CN
China
Prior art keywords
metrics
telemetry
subset
virtual
network
Prior art date
Legal status
Pending
Application number
CN202211526327.3A
Other languages
Chinese (zh)
Inventor
C. Liu
P. Miriyala
J. S. Marshall
Current Assignee
Juniper Networks Inc
Original Assignee
Juniper Networks Inc
Priority date
Filing date
Publication date
Priority claimed from US17/933,566 external-priority patent/US20230409369A1/en
Application filed by Juniper Networks Inc filed Critical Juniper Networks Inc
Publication of CN117278428A publication Critical patent/CN117278428A/en

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 - Configuration management of networks or network elements
    • H04L41/0803 - Configuration setting
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 - Data switching networks
    • H04L12/28 - Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/46 - Interconnection of networks
    • H04L12/4641 - Virtual LANs, VLANs, e.g. virtual private networks [VPN]
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 - Routing or path finding of packets in data switching networks
    • H04L45/58 - Association of routers
    • H04L45/586 - Association of routers of virtual routers


Abstract

Embodiments of the present disclosure relate to metric sets for a software defined network architecture. In general, techniques are described for efficiently exporting metric data within a Software Defined Network (SDN) architecture. A network controller for a Software Defined Networking (SDN) architecture system that includes processing circuitry may implement the techniques. A telemetry node configured to be executed by the processing circuitry may process a request by which to enable a set of metrics, defining a subset of metrics from a plurality of metrics, to be exported from a computing node. The telemetry node may also convert the subset of one or more metrics into telemetry exporter configuration data based on the request to enable the set of metrics, the telemetry exporter configuration data configuring a telemetry exporter deployed at the computing node to export the subset of metrics. The telemetry node may also interface with the telemetry exporter to configure the telemetry exporter to export the subset of metrics based on the telemetry exporter configuration data.

Description

Metric set for software defined network architecture
The present application claims the benefit of U.S. patent application Ser. No. 17/933,566, filed in September 2022, which claims the benefit of U.S. provisional patent application Ser. No. 63/366,671, filed in June 2022, each of which is incorporated herein by reference in its entirety.
Technical Field
The present disclosure relates to virtualized computing infrastructure, and more particularly, to cloud native (cloud native) networking.
Background
In a typical cloud data center environment, there is a large collection of interconnected servers that provide computing and/or storage capacity to run various applications. For example, a data center may comprise a facility that hosts applications and services for subscribers, i.e., the customers of the data center. The data center may, for example, host infrastructure equipment, such as networking and storage systems, redundant power supplies, and environmental controls. In a typical data center, clusters of storage systems and application servers are interconnected via a high-speed switch fabric provided by one or more tiers of physical network switches and routers. More sophisticated data centers provide infrastructure spread throughout the world, with subscriber support equipment located in various physical hosting facilities.
Virtualized data centers are becoming the core foundation of modern Information Technology (IT) infrastructure. In particular, modern data centers have widely utilized virtualized environments in which virtual hosts, also referred to herein as virtual execution elements, such as virtual machines or containers, are deployed and executed on the underlying computing platform of a physical computing device.
Virtualization within a data center or any environment that includes one or more servers may provide several advantages. One advantage is that virtualization may provide significant improvements in efficiency. With the advent of multi-core microprocessor architectures with a large number of cores per physical CPU, the underlying physical computing devices (i.e., servers) have become increasingly powerful, and virtualization has become easier and more efficient. A second advantage is that virtualization provides significant control over the computing infrastructure. As physical computing resources become fungible resources, such as in a cloud-based computing environment, provisioning and management of the computing infrastructure become easier. Thus, in addition to the efficiency and increased Return On Investment (ROI) that virtualization provides, enterprise IT staff often prefer virtualized computing clusters in data centers for their management advantages.
Containerization is a virtualization scheme based on operating-system-level virtualization. Containers are lightweight, portable execution elements for applications that are isolated from one another and from the host. Because containers are not tightly coupled to the host hardware computing environment, an application can be tied to a container image and executed as a single lightweight package on any host or virtual host that supports the underlying container architecture. As such, containers address the problem of how to make software work in different computing environments. Containers offer the promise of running consistently from one computing environment to another, virtual or physical.
Because of the inherently lightweight nature of containers, a single host can often support many more container instances than traditional Virtual Machines (VMs). Containers are often short-lived (compared to most VMs), can be created and moved more efficiently than VMs, and can also be managed as groups of logically related elements (sometimes referred to as "pods" in some orchestration platforms, e.g., Kubernetes). These container characteristics impact the requirements for container networking solutions: the network should be flexible and scalable. VMs, containers, and bare metal servers may need to coexist in the same computing environment, with communication enabled among the various deployments of applications. The container network should also be agnostic so as to work with the multiple types of orchestration platforms that are used to deploy containerized applications.
A computing infrastructure that manages deployment and infrastructure for application execution may involve two main roles: (1) orchestration - automating the deployment, scaling, and operations of applications across clusters of hosts and providing computing infrastructure, which may include container-centric computing infrastructure; and (2) network management - creating virtual networks in the network infrastructure to enable packetized communication among applications running on virtual execution environments, such as containers or VMs, as well as among applications running on legacy (e.g., physical) environments. Software-defined networking contributes to network management.
In terms of network management, a large amount of metric data may be made available to facilitate a better understanding of how the network operates. In some instances, such metric data may enable a network operator (or, in other words, a network administrator) to understand how the network is operating. While such metric data is valuable for addressing failures in network operation, collecting and transmitting (or, in other words, exporting) such metric data may consume significant network resources.
Disclosure of Invention
In general, techniques are described for enabling efficient collection of metric data in a Software Defined Network (SDN) architecture. The network controller may implement a telemetry node configured to provide an abstraction, referred to as a metric set, that facilitates both coarse-grained and fine-grained control over allowing only a subset of the metric data to be collected. Rather than indiscriminately collecting all metric data and exporting all possible metric data, the telemetry node may define metric sets, each of which may define a subset of all possible metric data (where subset, in this case, refers to a non-zero subset rather than the mathematical abstraction in which a subset may include zero or more, up to and including all, metrics).
The telemetry node may provide an Application Programming Interface (API) server through which requests to define metric sets are received, where each metric set may be independently enabled or disabled. In other words, a metric set enables or disables a subset of the metric data at a coarse level of granularity. Within each of the metric sets, the API server may also receive requests to enable or disable separate, individual metric data within the subset of metric data defined by the metric set. The network operator may then interface with the telemetry node, e.g., via a user interface, to select one or more metric sets to enable or disable the corresponding subsets of metric data defined by those metric sets, where such metric sets may be arranged (potentially hierarchically) according to various topics (e.g., Border Gateway Protocol (BGP), Internet Protocol version 4 (IPv4), IPv6, virtual router traffic, multicast virtual private network (MVPN), etc.).
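As a rough sketch of what such per-set and per-metric enablement might look like, consider the following Python structure; all group names, metric names, and fields here are assumptions for illustration, not the actual API or schema used by the telemetry node.

```python
# Hypothetical metric-set definitions, grouped by topic (names are illustrative only).
metric_sets = [
    {
        "name": "bgp",                      # topic-oriented metric set
        "enabled": True,                    # enable/disable the whole set (coarse granularity)
        "metrics": [                        # individual metrics within the set (finer granularity)
            {"name": "bgp_peer_count", "enabled": True},
            {"name": "bgp_prefixes_received", "enabled": True},
            {"name": "bgp_flap_count", "enabled": False},
        ],
    },
    {
        "name": "vrouter-traffic",
        "enabled": False,                   # disabling the set suppresses export of all its metrics
        "metrics": [
            {"name": "vrouter_rx_bytes", "enabled": True},
            {"name": "vrouter_tx_bytes", "enabled": True},
        ],
    },
]
```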
The telemetry node may define each metric set as a custom resource within the container orchestration platform used to implement the network controller, and may convert one or more enabled metric sets into a configuration map that defines (e.g., as an array) the enabled metrics (while possibly also removing overlapping metrics to prevent redundant collection of metric data). The telemetry node may then interface with the identified telemetry exporter to configure the telemetry exporter, based on the resulting telemetry exporter configuration data, to collect and export only the metrics that are enabled for collection.
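A minimal sketch of this conversion step follows, reusing the hypothetical metric_sets structure from the previous sketch; the flat configuration-map format is likewise an assumption rather than the actual telemetry exporter configuration data.

```python
def to_exporter_config(metric_sets):
    """Convert enabled metric sets into a flat, de-duplicated list of metric names
    that a telemetry exporter could be configured to collect and export."""
    enabled = []
    seen = set()
    for metric_set in metric_sets:
        if not metric_set.get("enabled", False):
            continue  # the whole set is disabled; skip its metrics
        for metric in metric_set.get("metrics", []):
            name = metric["name"]
            # Drop overlapping metrics that appear in more than one enabled set.
            if metric.get("enabled", False) and name not in seen:
                seen.add(name)
                enabled.append(name)
    # Telemetry exporter configuration data, e.g., rendered into a configuration map.
    return {"enabled_metrics": enabled}

exporter_config = to_exporter_config(metric_sets)
# {'enabled_metrics': ['bgp_peer_count', 'bgp_prefixes_received']}
```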
The techniques may provide one or more technical advantages. For example, the techniques may improve operation of the SDN architecture by reducing resource consumption when collecting and exporting metric data. Given that not all metric data is collected and exported, but only a selected subset, the telemetry exporter may consume fewer processor cycles, less memory bandwidth, and less associated power to collect the metric data (fewer than all metrics) associated with the metric subset. Furthermore, the telemetry exporter may export only the subset of metrics, which results in less consumption of network bandwidth within the SDN architecture, along with less consumption of the processing resources, memory bandwidth, and associated power used to process telemetry data within the SDN architecture. Moreover, the telemetry node receiving the exported metric data may process that metric data using fewer computing resources (again, processor cycles, memory bandwidth, and associated power), given that the metric data corresponds only to the enabled metric sets.
As another example, by defining metric sets using custom resources that facilitate abstraction of the underlying configuration data to define the subset of metrics for each categorized and/or topically arranged metric set, a network administrator may more easily interface with the telemetry node to customize the collection of metrics. Because such network administrators may not have extensive experience with the container orchestration platform, the abstraction provided by metric sets may facilitate a more intuitive user interface with which to interact to customize the export of metric data, which may result in fewer network administrator errors that would otherwise consume computing resources.
In one example, aspects of the technology relate to a network controller for a Software Defined Networking (SDN) architecture system, the network controller comprising: processing circuitry; and a telemetry node configured to be executed by the processing circuitry, the telemetry node configured to: process a request by which to enable a set of metrics to be exported from computing nodes of a cluster managed by the network controller, the set of metrics defining a subset of one or more metrics from a plurality of metrics; convert the subset of the one or more metrics into telemetry exporter configuration data based on the request to enable the set of metrics, the telemetry exporter configuration data configuring a telemetry exporter deployed at the computing node to export the subset of the one or more metrics; and interface with the telemetry exporter to configure the telemetry exporter to export the subset of the one or more metrics based on the telemetry exporter configuration data.
In another example, aspects of the technology relate to a computing node in a Software Defined Networking (SDN) architecture system, the computing node comprising: processing circuitry configured to execute the computing node forming part of the SDN architecture system, wherein the computing node is configured to support a virtual network router and to execute a telemetry exporter, wherein the telemetry exporter is configured to: receive telemetry exporter configuration data defining a subset of one or more metrics of a plurality of metrics to be exported to a telemetry node executed by a network controller; collect, based on the telemetry exporter configuration data, metric data corresponding to only the subset of the one or more metrics of the plurality of metrics; and output, to the telemetry node, the metric data corresponding to only the subset of the one or more metrics.
In another example, aspects of the technology relate to a method for a Software Defined Networking (SDN) architecture system, the method comprising: processing a request by which to enable a set of metrics, defining a subset of one or more metrics from a plurality of metrics, to be exported from one or more computing nodes forming a cluster; converting the subset of the one or more metrics into telemetry exporter configuration data based on the request to enable the set of metrics, the telemetry exporter configuration data configuring a telemetry exporter deployed at the one or more computing nodes to export the subset of the one or more metrics; and interfacing with the telemetry exporter to configure the telemetry exporter to export the subset of the one or more metrics based on the telemetry exporter configuration data.
In another example, aspects of the present technology relate to a method for a Software Defined Networking (SDN) architecture system, comprising: receiving telemetry exporter configuration data defining a subset of one or more metrics of a plurality of metrics to be exported to a telemetry node executed by a network controller; collecting, based on the telemetry exporter configuration data, metric data corresponding to only the subset of the one or more metrics of the plurality of metrics; and outputting, to the telemetry node, the metric data corresponding to only the subset of the one or more metrics of the plurality of metrics.
In another example, aspects of the technology relate to a Software Defined Networking (SDN) architecture system comprising: a network controller configured to execute a telemetry node, the telemetry node configured to: process a request by which to enable a set of metrics to be exported from one or more logically related elements, the set of metrics defining a subset of one or more metrics from a plurality of metrics; convert the subset of the one or more metrics into telemetry exporter configuration data based on the request to enable the set of metrics, the telemetry exporter configuration data configuring a telemetry exporter deployed at the one or more logically related elements to export the subset of the one or more metrics; and interface with the telemetry exporter to configure the telemetry exporter to export the subset of the one or more metrics based on the telemetry exporter configuration data; and logic configured to support a virtual network router and to execute the telemetry exporter, wherein the telemetry exporter is configured to: receive the telemetry exporter configuration data; collect, based on the telemetry exporter configuration data, metric data corresponding to only the subset of the one or more metrics of the plurality of metrics; and output, to the telemetry node, the metric data corresponding to only the subset of the one or more metrics of the plurality of metrics.
In another example, aspects of the technology are directed to a non-transitory computer-readable storage medium having instructions stored thereon that, when executed, cause one or more processors to: process a request by which to enable a set of metrics to be exported from one or more computing nodes forming a cluster, the set of metrics defining a subset of one or more metrics from a plurality of metrics; convert the subset of the one or more metrics into telemetry exporter configuration data based on the request to enable the set of metrics, the telemetry exporter configuration data configuring a telemetry exporter deployed at the one or more computing nodes to export the subset of the one or more metrics; and interface with the telemetry exporter to configure the telemetry exporter to export the subset of the one or more metrics based on the telemetry exporter configuration data.
In another example, aspects of the technology are directed to a non-transitory computer-readable storage medium having instructions stored thereon that, when executed, cause one or more processors to: receive telemetry exporter configuration data defining a subset of one or more metrics of a plurality of metrics to be exported to a telemetry node executed by a network controller; collect, based on the telemetry exporter configuration data, metric data corresponding to only the subset of the one or more metrics of the plurality of metrics; and output, to the telemetry node, the metric data corresponding to only the subset of the one or more metrics of the plurality of metrics.
The details of one or more examples of the disclosure are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description and drawings, and from the claims.
Drawings
FIG. 1 is a block diagram illustrating an example computing infrastructure in which examples of the techniques described herein may be implemented.
Fig. 2 is another view and a more detailed block diagram illustrating components of an SDN architecture in accordance with the techniques of this disclosure.
Fig. 3 is a block diagram illustrating example components of an SDN architecture in accordance with the techniques of this disclosure.
Fig. 4 is a block diagram illustrating example components of an SDN architecture in accordance with the techniques of this disclosure.
Fig. 5A is a block diagram illustrating a control/routing plane for underlay network and overlay network configuration using an SDN architecture, in accordance with the techniques of this disclosure.
Fig. 5B is a block diagram illustrating a configured virtual network for connecting pods using tunnels configured in the underlay network, in accordance with the techniques of this disclosure.
Fig. 6 is a block diagram illustrating an example of a custom controller for custom resources for SDN architecture configuration, in accordance with the techniques of this disclosure.
Fig. 7 is a block diagram illustrating the telemetry node and telemetry exporter of fig. 1-5A in more detail.
FIG. 8 is a flow chart illustrating operation of the computing infrastructure shown in the example of FIG. 1 in performing various aspects of the techniques described herein.
Like reference numerals refer to like elements throughout the specification and drawings.
Detailed Description
FIG. 1 is a block diagram illustrating an example computing infrastructure 8 in which examples of the techniques described herein may be implemented. As described in the examples herein, computing infrastructure 8 includes a cloud-native SDN architecture system that addresses the challenges of, and modernizes networks for, the telco cloud era. Example use cases for the cloud-native SDN architecture include 5G mobile networks as well as cloud and enterprise cloud-native use cases.
The SDN architecture components are microservices and, in contrast to existing network controllers, the SDN architecture relies on a base container orchestration platform to manage the lifecycle of SDN architecture components. The container orchestration platform is used to bring up the SDN architecture components; the SDN architecture uses cloud-native monitoring tools that can integrate with customer-provided cloud-native options; and the SDN architecture provides a declarative way of managing resources using aggregated APIs for SDN architecture objects (i.e., custom resources). SDN architecture upgrades may follow cloud-native patterns, and the SDN architecture may leverage Kubernetes constructs such as Multus, Authentication & Authorization, Cluster API, KubeFederation, KubeVirt, and Kata containers. The SDN architecture may support Data Plane Development Kit (DPDK) pods and may extend to support Kubernetes with virtual network policies and global security policies.
For service providers and enterprises, the SDN architecture automates network resource provisioning and orchestration to dynamically create highly scalable virtual networks and to chain Virtualized Network Functions (VNFs) and Physical Network Functions (PNFs) to form differentiated service chains on demand. The SDN architecture may be integrated with orchestration platforms (e.g., orchestrator 23) such as Kubernetes, OpenShift, Mesos, OpenStack, and VMware vSphere, as well as with service provider operations support systems/business support systems (OSS/BSS).
In general, one or more data centers 10 provide an operating environment for applications and services for customer sites 11 (illustrated as "customers 11"), with customer sites 11 having one or more customer networks coupled to the data centers by service provider network 7. Each of data centers 10 may, for example, host infrastructure equipment, such as networking and storage systems, redundant power supplies, and environmental controls. Service provider network 7 is coupled to public network 15, which may represent one or more networks administered by other providers, and may thus form part of a large-scale public network infrastructure (e.g., the Internet). Public network 15 may represent, for example, a Local Area Network (LAN), a Wide Area Network (WAN), the Internet, a Virtual LAN (VLAN), an enterprise LAN, a layer 3 Virtual Private Network (VPN), an Internet Protocol (IP) intranet operated by the service provider that operates service provider network 7, an enterprise IP network, or some combination thereof.
Although the customer site 11 and public network 15 are primarily shown and described as edge networks of the service provider network 7, in some examples, one or more of the customer site 11 and public network 15 may be a tenant network within any of the data centers 10. For example, the data center 10 may host multiple tenants (customers), each associated with one or more Virtual Private Networks (VPNs), each of which may implement one of the customer sites 11.
The service provider network 7 provides packet-based connectivity to additional customer sites 11, data centers 10 and public networks 15. The service provider network 7 may represent a network owned and operated by a service provider to interconnect multiple networks. The service provider network 7 may implement multiprotocol label switching (MPLS) forwarding and in this case may be referred to as an MPLS network or MPLS backbone. In some cases, service provider network 7 represents a plurality of interconnected autonomous systems, such as the Internet, that provide services from one or more service providers.
In some examples, each of data centers 10 may represent one of many geographically distributed network data centers, which may be connected to one another via service provider network 7, dedicated network links, dark fiber, or other connections. As shown in the example of FIG. 1, data centers 10 may include facilities that provide network services to customers. A customer of the service provider may be a collective entity such as an enterprise or a government, or an individual. For example, a network data center may provide network services for several enterprises and end users. Other exemplary services may include data storage, virtual private networks, traffic engineering, file serving, data mining, scientific or super-computing, and so on. Although illustrated as a separate edge network of service provider network 7, elements of data centers 10, such as one or more Physical Network Functions (PNFs) or Virtualized Network Functions (VNFs), may be included within the core of service provider network 7.
In this example, data center 10 includes storage and/or compute servers (or "nodes") interconnected via switch fabric 14 provided by one or more tiers of physical network switches and routers, with servers 12A-12X (herein, "servers 12") depicted as coupled to top-of-rack (TOR) switches 16A-16N. Servers 12 are computing devices and may also be referred to herein as "compute nodes," "hosts," or "host devices." Although only server 12A coupled to TOR switch 16A is shown in detail in FIG. 1, data center 10 may include many additional servers coupled to the other TOR switches 16 of data center 10.
Switch fabric 14 in the illustrated example includes interconnected top-of-rack (TOR) (or other "leaf") switches 16A-16N (collectively, "TOR switches 16") coupled to a distribution layer of chassis (or "spine" or "core") switches 18A-18M (collectively, "chassis switches 18"). Although not shown, data center 10 may also include, for example, one or more non-edge switches, routers, hubs, gateways, security devices such as firewalls, intrusion detection and/or intrusion prevention devices, servers, computer terminals, laptops, printers, databases, wireless mobile devices such as cellular phones or personal digital assistants, wireless access points, bridges, cable modems, application accelerators, or other network devices. Data center 10 may also include one or more Physical Network Functions (PNFs), such as physical firewalls, load balancers, routers, route reflectors, Broadband Network Gateways (BNGs), mobile core network elements, and other PNFs.
In this example, TOR switches 16 and chassis switches 18 provide servers 12 with redundant (multi-homed) connectivity to IP fabric 20 and service provider network 7. Chassis switches 18 aggregate traffic flows and provide connectivity between TOR switches 16. TOR switches 16 may be network devices that provide layer 2 (MAC) and/or layer 3 (e.g., IP) routing and/or switching functionality. TOR switches 16 and chassis switches 18 may each include one or more processors and a memory and may execute one or more software processes. Chassis switches 18 are coupled to IP fabric 20, which may perform layer 3 routing to route network traffic between data center 10 and customer sites 11 via service provider network 7. The switching architecture of data center 10 is merely an example; other switch fabrics may, for instance, have more or fewer switching layers. IP fabric 20 may include one or more gateway routers.
The term "packet flow", "traffic flow", or simply "flow" refers to a group of packets originating from a particular source device or endpoint and sent to a particular destination device or endpoint. A single packet stream may be identified by a 5-tuple: < source network address, destination network address, source port, destination port, protocol >, for example. The 5-tuple typically identifies the packet stream to which the received packet corresponds. An n-tuple refers to any n items extracted from a 5-tuple. For example, a 2-tuple of a packet may refer to a combination of < source network address, destination network address > or < source network address, source port > of the packet.
The servers 12 may each represent a computing server or a storage server. For example, each of the servers 12 may represent a computing device configured to operate in accordance with the techniques described herein, such as an x86 processor-based server. The server 12 may provide a Network Function Virtualization Infrastructure (NFVI) for the NFV architecture.
Any of servers 12 may be configured with virtual execution elements, such as pods or virtual machines, by virtualizing resources of the server to provide some measure of isolation among one or more processes (applications) executing on the server. "Hypervisor-based," "hardware-level," or "platform" virtualization refers to the creation of virtual machines, each of which includes a guest operating system for executing one or more processes. In general, a virtual machine provides a virtualized/guest operating system for executing applications in an isolated virtual environment. Because a virtual machine is virtualized from the physical hardware of the host server, executing applications are isolated from both the hardware of the host and other virtual machines. Each virtual machine may be configured with one or more virtual network interfaces for communicating on corresponding virtual networks.
Virtual networks are logical constructs implemented on top of physical networks. Virtual networks may be used to replace VLAN-based quarantine and provide multi-tenancy in virtualized data centers (e.g., data center 10). Each tenant or application may have one or more virtual networks. Each virtual network may be isolated from all other virtual networks unless explicitly allowed by security policies.
The virtual network may be connected to and extended over a physical multiprotocol label switching (MPLS) layer 3 virtual private network (L3 VPN) and an Ethernet Virtual Private Network (EVPN) using a data center 10 gateway router (not shown in fig. 1). Virtual networks may also be used to implement Network Function Virtualization (NFV) and service chaining.
Virtual networks can be implemented using a variety of mechanisms. For example, each virtual network may be implemented as a Virtual Local Area Network (VLAN), a Virtual Private Network (VPN), or the like. Virtual networks can also be implemented using two networks - a physical underlay network made up of IP fabric 20 and switch fabric 14, and a virtual overlay network. The role of the physical underlay network is to provide an "IP fabric" that provides unicast IP connectivity from any physical device (server, storage device, router, or switch) to any other physical device. The underlay network may provide uniform low-latency, non-blocking, high-bandwidth connectivity from any point in the network to any other point in the network.
As described further below with respect to virtual router 21 (illustrated as and also referred to herein as "vRouter 21"), virtual routers running in servers 12 create a virtual overlay network on top of the physical underlay network using a mesh of dynamic "tunnels" among themselves. These overlay tunnels may be, for example, MPLS over GRE/UDP tunnels, VXLAN tunnels, or NVGRE tunnels. The underlay physical routers and switches may not store any per-tenant state for virtual machines or other virtual execution elements, such as any Media Access Control (MAC) addresses, IP addresses, or policies. The forwarding tables of the underlay physical routers and switches may, for example, contain only the IP prefixes or MAC addresses of the physical servers 12. (Gateway routers or switches that connect a virtual network to a physical network are an exception and may contain tenant MAC or IP addresses.)
Virtual routers 21 of servers 12, however, typically contain per-tenant state. For example, they may contain a separate forwarding table (a routing instance) per virtual network. That forwarding table contains the IP prefixes (in the case of layer 3 overlays) or the MAC addresses (in the case of layer 2 overlays) of the virtual machines or other virtual execution elements (e.g., pods of containers). No single virtual router 21 needs to contain all IP prefixes or all MAC addresses for all virtual machines in the entire data center. A given virtual router 21 needs only to contain those routing instances that are present locally on the server 12 (i.e., those that have at least one virtual execution element present on the server 12).
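A highly simplified sketch of this per-tenant state follows; the data layout and names are assumptions for illustration, not the actual vRouter implementation.

```python
# Each routing instance (one per locally present virtual network) maps
# prefixes of local virtual execution elements to local next hops.
routing_instances = {
    "vn-blue": {"10.1.1.3/32": "tap-pod-a", "10.1.1.4/32": "tap-pod-b"},
    "vn-red": {"192.168.7.9/32": "tap-pod-c"},
}

def lookup(virtual_network: str, prefix: str):
    """Return the next hop for a destination in a given virtual network,
    consulting only the routing instances present on this server."""
    table = routing_instances.get(virtual_network)
    if table is None:
        raise KeyError(f"no local routing instance for {virtual_network}")
    return table.get(prefix)
```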
"Container-based" or "operating system" virtualization refers to the virtualization (virtual or physical) of an operating system running multiple isolation systems on a single machine. Such isolation systems represent containers such as those provided by the open source DOCKER Container application or Coreos Rkt ("rocket"). Like virtual machines, each container is virtualized and can remain isolated from hosts and other containers. However, unlike virtual machines, each container may omit a separate operating system, and instead provide application suites and application-specific libraries. In general, containers are executed by a host as isolated user space instances, and may share an operating system and a common library with other containers executing on the host. Thus, the container may require less processing power, storage, and network resources than a virtual machine ("VM"). A set of one or more containers may be configured to share one or more virtual network interfaces for communicating over a respective virtual network.
In some examples, containers are managed by their host kernel to allow limitation and prioritization of resources (CPU, memory, block I/O, network, etc.) without the need for starting any virtual machines, in some cases using namespace isolation functionality that allows complete isolation of an application's (e.g., a given container's) view of the operating environment, including process trees, networking, user identifiers, and mounted file systems. In some examples, containers may be deployed according to Linux Containers (LXC), an operating-system-level virtualization method for running multiple isolated Linux systems (containers) on a control host using a single Linux kernel.
Server 12 hosts virtual network endpoints for one or more virtual networks operating on the physical networks represented herein by IP fabric 20 and switch fabric 14. Although described primarily with respect to a data center-based switching network, other physical networks, such as the service provider network 7, may support one or more virtual networks.
Each of servers 12 may host one or more virtual execution elements, each having at least one virtual network endpoint for one or more virtual networks configured in the physical network. A virtual network endpoint for a virtual network may represent one or more virtual execution elements that share a virtual network interface for the virtual network. For example, a virtual network endpoint may be a virtual machine, a set of one or more containers (e.g., a pod), or another virtual execution element, such as a layer 3 endpoint for a virtual network. The term "virtual execution element" encompasses virtual machines, containers, and other virtualized computing resources that provide an at least partially independent execution environment for applications. The term "virtual execution element" may also encompass a pod of one or more containers. A virtual execution element may represent an application workload.
As shown in FIG. 1, server 12A hosts one virtual network endpoint in the form of pod 22 having one or more containers. However, a server 12 may execute as many virtual execution elements as is practical given the hardware resource limitations of the server 12. Each of the virtual network endpoints may use one or more virtual network interfaces to perform packet I/O or otherwise process packets. For example, a virtual network endpoint may use one virtual hardware component (e.g., an SR-IOV virtual function) enabled by NIC 13A to perform packet I/O and receive/send packets on one or more communication links with TOR switch 16A.
Servers 12 each include at least one Network Interface Card (NIC) 13, and each NIC 13 includes at least one interface to exchange packets with TOR switches 16 over a communication link. For example, server 12A includes NIC 13A. Any of NICs 13 may provide one or more virtual hardware components 21 for virtualized input/output (I/O). A virtual hardware component for I/O may be a virtualization of a physical NIC (the "physical function"). For example, in Single Root I/O Virtualization (SR-IOV), the PCIe physical function of the network interface card (or "network adapter") is virtualized to present one or more virtual network interfaces as "virtual functions" for use by respective endpoints executing on server 12. In this way, the virtual network endpoints may share the same PCIe physical hardware resources, and the virtual functions are examples of virtual hardware components 21.
As another example, one or more servers 12 may implement Virtio, a para-virtualization framework available, e.g., for the Linux operating system, that provides emulated NIC functionality as a type of virtual hardware component to provide virtual network interfaces to virtual network endpoints. As another example, one or more servers 12 may implement Open vSwitch to perform distributed virtual multilayer switching between one or more virtual NICs (vNICs) of hosted virtual machines, where such vNICs may also represent a type of virtual hardware component that provides virtual network interfaces to virtual network endpoints. In some cases, the virtual hardware components are virtual I/O (e.g., NIC) components. In some examples, the virtual hardware components are SR-IOV virtual functions.
In some examples, any of servers 12 may implement a Linux bridge that emulates a hardware bridge and forwards packets among virtual network interfaces of the server, or between a virtual network interface of the server and a physical network interface of the server. For Docker implementations of containers hosted by a server, a Linux bridge or other operating system bridge, executing on the server, that switches packets among containers may be referred to as a "Docker bridge." The term "virtual router" as used herein may encompass a Contrail or Tungsten Fabric virtual router, an Open vSwitch (OVS), an OVS bridge, a Linux bridge, a Docker bridge, or other device and/or software that is located on a host device and performs switching, bridging, or routing of packets among virtual network endpoints of one or more virtual networks, where the virtual network endpoints are hosted by one or more of servers 12.
Any of NICs 13 may include an internal device switch to switch data between virtual hardware components associated with the NIC. For example, for an SR-IOV-capable NIC, the internal device switch may be a Virtual Ethernet Bridge (VEB) that switches between the SR-IOV virtual functions and, correspondingly, between the endpoints configured to use the SR-IOV virtual functions, where each endpoint may include a guest operating system. An internal device switch may alternatively be referred to as a NIC switch or, for SR-IOV implementations, as an SR-IOV NIC switch. Virtual hardware components associated with NIC 13A may be associated with a layer 2 destination address, which may be assigned by NIC 13A or by a software process responsible for configuring NIC 13A. The physical hardware component (or "physical function" for SR-IOV implementations) is also associated with a layer 2 destination address.
One or more of servers 12 may each include a virtual router 21 that executes one or more routing instances for corresponding virtual networks within data center 10 to provide virtual network interfaces and route packets among the virtual network endpoints. Each routing instance may be associated with a network forwarding table. Each routing instance may represent a virtual routing and forwarding instance (VRF) for an Internet Protocol Virtual Private Network (IP-VPN). For example, packets received by virtual router 21 of server 12A from the underlying physical network fabric of data center 10 (i.e., IP fabric 20 and switch fabric 14) may include an outer header to allow the physical network fabric to tunnel the payload or "inner packet" to a physical network address of network interface card 13A of server 12A that executes the virtual router. The outer header may include not only the physical network address of network interface card 13A of the server but also a virtual network identifier, such as a VXLAN tag or a Multiprotocol Label Switching (MPLS) label, that identifies one of the virtual networks as well as the corresponding routing instance executed by virtual router 21. The inner packet includes an inner header having a destination network address that conforms to the virtual network addressing space of the virtual network identified by the virtual network identifier.
Virtual router 21 terminates virtual network overlay tunnels, determines the virtual network for a received packet based on the packet's tunnel encapsulation header, and forwards the packet to the appropriate destination virtual network endpoint for the packet. For server 12A, for example, for each packet outbound from a virtual network endpoint hosted by server 12A (e.g., pod 22), virtual router 21 attaches a tunnel encapsulation header indicating the virtual network for the packet to generate an encapsulated or "tunnel" packet, and virtual router 21 outputs the encapsulated packet via an overlay tunnel for the virtual network to a physical destination computing device, such as another one of servers 12. As used herein, virtual router 21 may execute the operations of a tunnel endpoint to encapsulate inner packets sourced by virtual network endpoints to generate tunnel packets, and to decapsulate tunnel packets to obtain inner packets for routing to other virtual network endpoints.
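The following sketch illustrates the encapsulation/decapsulation flow described above in simplified form; the header fields, helper names, and routing-instance layout are assumptions and omit most of what a real overlay (e.g., VXLAN or MPLS over GRE/UDP) carries.

```python
from dataclasses import dataclass

@dataclass
class InnerPacket:
    dst_addr: str      # address in the virtual network's addressing space
    payload: bytes

@dataclass
class TunnelPacket:
    outer_dst: str     # physical address of the destination server's NIC
    vni: int           # virtual network identifier (e.g., VXLAN VNI or MPLS label)
    inner: InnerPacket

def encapsulate(inner: InnerPacket, dest_server_addr: str, vni: int) -> TunnelPacket:
    # The virtual router attaches a tunnel encapsulation header indicating the virtual network.
    return TunnelPacket(outer_dst=dest_server_addr, vni=vni, inner=inner)

def decapsulate(tunnel: TunnelPacket, routing_instances: dict) -> str:
    # The terminating router uses the virtual network identifier to select the routing
    # instance, then forwards the inner packet toward the destination endpoint.
    table = routing_instances[tunnel.vni]
    return table[tunnel.inner.dst_addr]      # e.g., a local tap interface
```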
In some examples, virtual router 21 may be kernel-based and execute as part of the kernel of the operating system of server 12A.
In some examples, virtual router 21 may be a Data Plane Development Kit (DPDK)-enabled virtual router. In such examples, virtual router 21 uses DPDK as a data plane. In this mode, virtual router 21 runs as a user-space application that is linked to the DPDK library (not shown). This is a performance version of the virtual router and is commonly used by telcos, where the VNFs are often DPDK-based applications. The performance of virtual router 21 as a DPDK virtual router can achieve ten times higher throughput than a virtual router operating as a kernel-based virtual router. The physical interface is used by DPDK's Poll Mode Drivers (PMDs) instead of the Linux kernel's interrupt-based drivers.
A user I/O (UIO) kernel module, such as vfio or uio_pci_generic, may be used to expose a physical network interface's registers into user space so that they are accessible by the DPDK PMD. When NIC 13A is bound to a UIO driver, it is moved from Linux kernel space to user space and is therefore no longer managed by, nor visible to, the Linux OS. Consequently, it is the DPDK application (i.e., virtual router 21 in this example) that fully manages NIC 13. This includes packet polling, packet processing, and packet forwarding. User packet processing steps may be performed by the virtual router 21 DPDK data plane with limited or no participation by the kernel (the kernel is not shown in FIG. 1). This "polling mode" nature makes virtual router 21 DPDK data plane packet processing/forwarding much more efficient than the interrupt mode, especially when the packet rate is high. There are limited or no interrupts and context switches during packet I/O. For further details of an example DPDK vRouter, see "DAY ONE: CONTRAIL DPDK vROUTER," 2021, Kiran KN et al., Juniper Networks, Inc., which is incorporated herein by reference in its entirety.
Computing infrastructure 8 implements an automation platform for automating the deployment, scaling, and operations of virtual execution elements across servers 12 to provide a virtualized infrastructure for executing application workloads and services. In some examples, the platform may be a container orchestration system that provides a container-centric infrastructure for automating the deployment, scaling, and operations of containers. "Orchestration," in the context of a virtualized computing infrastructure, generally refers to provisioning, scheduling, and managing virtual execution elements and/or applications and services executing on such virtual execution elements with respect to the host servers available to the orchestration platform. Container orchestration, specifically, permits container coordination and refers to the deployment, management, scaling, and configuration of containers to host servers, e.g., by a container orchestration platform. Example instances of orchestration platforms include Kubernetes (a container orchestration system), Docker Swarm, Mesos/Marathon, OpenShift, OpenStack, VMware, and Amazon ECS.
The elements of the automation platform of computing infrastructure 8 include at least servers 12, orchestrator 23, and network controller 24. Containers may be deployed to a virtualization environment using a cluster-based framework in which a cluster master node of a cluster manages the deployment and operation of containers on one or more cluster minion nodes of the cluster. The terms "master node" and "minion node" used herein encompass different orchestration platform terms for analogous devices that distinguish between the primarily management elements of a cluster and the primarily container hosting devices of a cluster. For example, the Kubernetes platform uses the terms "cluster master" and "minion nodes," while the Docker Swarm platform refers to cluster managers and cluster nodes.
Orchestrator 23 and network controller 24 may execute on separate computing devices or on the same computing device. Each of orchestrator 23 and network controller 24 may be a distributed application that executes on one or more computing devices. Orchestrator 23 and network controller 24 may implement respective master nodes for one or more clusters, each cluster having one or more minion nodes (also referred to as "compute nodes") implemented by respective servers 12.
In general, network controller 24 controls the network configuration of the data center 10 fabric to, for example, establish one or more virtual networks for packetized communications among virtual network endpoints. Network controller 24 provides a logically and, in some cases, physically centralized controller for facilitating the operation of one or more virtual networks within data center 10. In some examples, network controller 24 may operate in response to configuration input received from orchestrator 23 and/or an administrator/operator. Additional information regarding example operations of network controller 24 operating in conjunction with other devices of data center 10 or other software-defined networks is found in International Application Number PCT/US2013/044378, filed June 5, 2013, and entitled "PHYSICAL PATH DETERMINATION FOR VIRTUAL NETWORK PACKET FLOWS," and in U.S. Patent Application Ser. No. 14/226,509, filed March 26, 2014, and entitled "TUNNELED PACKET AGGREGATION FOR VIRTUAL NETWORKS," each of which is incorporated by reference as if fully set forth herein.
In general, orchestrator 23 controls the deployment, scaling, and operations of containers across clusters of servers 12 and provides the computing infrastructure, which may include container-centric computing infrastructure. Orchestrator 23 and, in some cases, network controller 24 may implement respective cluster masters for one or more Kubernetes clusters. As an example, Kubernetes is a container management platform that provides portability across public and private clouds, each of which may provide a virtualization infrastructure to the container management platform. Example components of the Kubernetes orchestration system are described below with reference to FIG. 3.
In one example, pod 22 is a Kubernetes pod and an example of a virtual network endpoint. A pod is a group of one or more logically related containers (not shown in FIG. 1), the shared storage for the containers, and options on how to run the containers. Where instantiated for execution, a pod may alternatively be referred to as a "pod replica." Each container of pod 22 is an example of a virtual execution element. The containers of a pod are always co-located on a single server, co-scheduled, and run in a shared context. The shared context of a pod may be a set of Linux namespaces, control groups (cgroups), and other facets of isolation.
Within the context of a pod, individual applications may have further sub-isolation applied. Typically, containers within a pod have a common IP address and port space and are able to detect one another via the localhost. Because they have a shared context, containers within a pod may also communicate with one another using inter-process communication (IPC). Examples of IPC include System V semaphores and POSIX shared memory. Generally, containers that are members of different pods have different IP addresses and cannot communicate by IPC in the absence of a configuration for enabling this feature. Containers that are members of different pods instead usually communicate with each other via pod IP addresses.
Server 12A includes a container platform 19 for running containerized applications such as pod 22. Container platform 19 receives the request from orchestrator 23 to obtain and host the container in server 12A. The container platform 19 obtains and executes containers.
The Container Network Interface (CNI) 17 configures a virtual network interface for the virtual network endpoint. The orchestrator 23 and container platform 19 use the CNI 17 to manage the networking of pod, including pod 22. For example, the CNI 17 creates a virtual network interface to connect a pod to the virtual router 21, and enables the container of such a pod to communicate with other virtual network endpoints through the virtual network via the virtual network interface. The CNI 17 may, for example, insert a virtual network interface for a virtual network into a network namespace of a container in the pod 22 and configure (or request configuration) the virtual network interface for the virtual network in the virtual router 21 such that the virtual router 21 is configured to send packets received from the virtual network via the virtual network interface to the container of the pod 22 and send packets received from the container of the pod 22 via the virtual network interface over the virtual network. The CNI 17 may assign a network address (e.g., a virtual IP address of a virtual network) and may establish a route for the virtual network interface.
In Kubernetes, by default, all pods can communicate with all other pods without using Network Address Translation (NAT). In some cases, orchestrator 23 and network controller 24 create a service virtual network and a pod virtual network that are shared by all namespaces, from which service and pod network addresses are allocated, respectively. In some cases, all pods in all namespaces that are created in the Kubernetes cluster may be able to communicate with one another, and the network addresses for all of the pods may be allocated from a pod subnet specified by orchestrator 23. When a user creates an isolated namespace for a pod, orchestrator 23 and network controller 24 may create a new pod virtual network and a new shared service virtual network for the new isolated namespace. Pods in the isolated namespace that are created in the Kubernetes cluster draw their network addresses from the new pod virtual network, and the corresponding services for such pods draw their network addresses from the new service virtual network.
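As a rough, non-normative sketch of the address allocation just described (the subnet values and names here are assumptions, not those used by orchestrator 23 or network controller 24):

```python
import ipaddress

# Hypothetical pod subnets: a shared default and one created for an isolated namespace.
pod_subnets = {
    "default": ipaddress.ip_network("10.244.0.0/24"),
    "isolated-ns": ipaddress.ip_network("10.245.0.0/24"),
}
_next_host = {name: iter(net.hosts()) for name, net in pod_subnets.items()}

def allocate_pod_address(namespace: str) -> str:
    """Draw the next free address from the pod virtual network for the namespace,
    falling back to the shared default pod virtual network."""
    hosts = _next_host.get(namespace, _next_host["default"])
    return str(next(hosts))

print(allocate_pod_address("isolated-ns"))  # e.g., 10.245.0.1
print(allocate_pod_address("default"))      # e.g., 10.244.0.1
```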
CNI 17 may represent a library, a plugin, a module, a runtime, or other executable code for server 12A. CNI 17 may conform, at least in part, to the Container Network Interface (CNI) specification or the rkt networking proposal. CNI 17 may represent a Contrail, OpenContrail, Multus, Calico, cRPD, or other CNI. CNI 17 may alternatively be referred to as a network plugin or a CNI instance. Separate CNIs may be invoked by, for example, a Multus CNI to establish different virtual network interfaces for pod 22.
CNI 17 may be invoked by orchestrator 23. For the purposes of the CNI specification, a container can be considered synonymous with a Linux network namespace. What unit this corresponds to depends on the particular container runtime implementation: for example, in implementations of the application container specification such as rkt, each pod runs in a unique network namespace. In Docker, however, a network namespace generally exists for each separate Docker container. For the purposes of the CNI specification, a network refers to a group of entities that are uniquely addressable and that can communicate with one another. This could be an individual container, a machine/server (real or virtual), or some other network device (e.g., a router). Containers can be conceptually added to or removed from one or more networks. The CNI specification specifies a number of considerations for a conforming plugin ("CNI plugin").
Pod 22 includes one or more containers. In some examples, pod 22 includes a containerized DPDK workload designed to use DPDK to accelerate packet processing, such as by exchanging data with other components using a DPDK library. In some examples, virtual router 21 may execute as a containerized DPDK workload.
Pod 22 is configured with virtual network interface 26 for sending and receiving packets with virtual router 21. Virtual network interface 26 may be a default interface for pod 22. Pod 22 may implement virtual network interface 26 as an Ethernet interface (e.g., named "eth0"), while virtual router 21 may implement virtual network interface 26 as a tap interface, virtio-user interface, or other type of interface.
Pod 22 and virtual router 21 exchange data packets using virtual network interface 26. Virtual network interface 26 may be a DPDK interface. Pod 22 and virtual router 21 may use vhost to set up virtual network interface 26. Pod 22 may operate according to an aggregation model. Pod 22 may use a virtual device, such as a virtio device with a vhost-user adapter, for user-space container inter-process communication for virtual network interface 26.
CNI 17 may configure, for pod 22, in conjunction with one or more other components shown in FIG. 1, virtual network interface 26. Any container of pod 22 can utilize (i.e., share) virtual network interface 26 of pod 22.
Virtual network interface 26 may represent a virtual ethernet ("veth") pair, where each end of the pair is a separate device (e.g., a Linux/Unix device), one end of the pair is assigned to pod 22, and one end of the pair is assigned to virtual router 21. The veth pair, or an end of a veth pair, is sometimes referred to as a "port". The virtual network interface may represent a macvlan network having media access control (MAC) addresses assigned to pod 22 and to virtual router 21 for communication between the containers of pod 22 and virtual router 21. The virtual network interface may alternatively be referred to as, for example, a virtual machine interface (VMI), a pod interface, a container network interface, a tap interface, a veth interface, or simply a network interface (in specific contexts).
In the example server 12A of fig. 1, pod 22 is a virtual network endpoint in one or more virtual networks. Orchestrator 23 may store or otherwise manage configuration data for application deployment that specifies the virtual network and that pod 22 (or one or more containers therein) is a virtual network endpoint of the virtual network. Orchestrator 23 may receive configuration data from, for example, a user, operator/administrator, or other computing system.
As part of the process of creating pod 22, orchestrator 23 requests network controller 24 to create a corresponding virtual network interface for one or more virtual networks (indicated in the configuration data). The Pod 22 may have a different virtual network interface for each virtual network to which the Pod 22 belongs. For example, virtual network interface 26 may be a virtual network interface for a particular virtual network. Additional virtual network interfaces (not shown) may be configured for other virtual networks.
Network controller 24 processes the request to generate interface configuration data for the virtual network interfaces of pod 22. The interface configuration data may include a container or pod unique identifier and a list or other data structure specifying, for each virtual network interface, network configuration data for configuring that virtual network interface. The network configuration data for a virtual network interface may include a network name, an assigned virtual network address, a MAC address, and/or domain name server values. An example of interface configuration data in JavaScript Object Notation (JSON) format is provided below.
Network controller 24 sends the interface configuration data to server 12A and, more specifically in some cases, to virtual router 21. To configure the virtual network interfaces of pod 22, orchestrator 23 may invoke CNI 17. CNI 17 obtains the interface configuration data from virtual router 21 and processes it. CNI 17 creates each virtual network interface specified in the interface configuration data. For example, CNI 17 may attach one end of a veth pair implementing virtual network interface 26 to virtual router 21 and may attach the other end of the same veth pair to pod 22, which pod 22 may implement using virtio-user.
The following is example interface configuration data for pod 22 for virtual network interface 26.
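As a hedged illustration only, a minimal sketch of such interface configuration data, limited to the fields named above (network name, assigned virtual network address, MAC address, and domain name server value) and using hypothetical identifiers and values throughout, might look like the following.

```json
[
  {
    "id": "pod22-vni26",
    "network-name": "blue-vn",
    "ip-address": "10.0.1.5",
    "prefix-len": 24,
    "mac-address": "02:aa:bb:cc:dd:01",
    "dns-server": "10.0.1.2",
    "gateway": "10.0.1.1"
  }
]
```

The field names and values shown are assumptions for illustration; an actual deployment would use whatever schema virtual router 21 and CNI 17 agree on.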
A conventional CNI plug-in is invoked by the container platform/runtime, receives an Add command from the container platform to add a container to a single virtual network, and such a plug-in may subsequently be invoked to receive a Del(ete) command from the container/runtime and remove the container from the virtual network. The term "invoke" may refer to the instantiation, as executable code, of a software component or module in memory that is executed by processing circuitry.
Network controller 24 is a cloud-native distributed network controller for Software Defined Networking (SDN) implemented using one or more configuration nodes 30 and one or more control nodes 32 and one or more telemetry nodes 60. Each configuration node 30 itself may be implemented using one or more cloud native component microservices. Each control node 32 itself may be implemented using one or more cloud native component microservices. Each telemetry node 60 itself may also be implemented using one or more cloud-native component microservices.
In some examples, configuration node 30 may be implemented by extending a native orchestration platform to support custom resources for the orchestration platform for software-defined networking and, more specifically, to provide a northbound interface to the orchestration platform that supports intent-driven/declarative creation and management of virtual networks by, for example, configuring virtual network interfaces for virtual execution elements, configuring the underlay network connecting servers 12, and configuring overlay routing functionality, including overlay tunnels for the virtual networks and overlay trees for multicast layer 2 and layer 3.
As part of the SDN architecture shown in FIG. 1, network controller 24 may be multi-tenant aware and support multi-tenancy for orchestration platforms. For example, network controller 24 may support Kubernetes role-based access control (RBAC), local identity access management (IAM), and external IAM integration. Network controller 24 may also support Kubernetes-defined networking constructs and advanced networking features such as virtual networking, BGPaaS, networking policies, service chaining, and other telecommunications features. Network controller 24 may support network isolation using virtual network constructs and may support layer 3 networking.
To interconnect multiple virtual networks, network controller 24 may use (and be configured, in the underlay and/or in virtual router 21, with) import and export policies defined using Virtual Network Router (VNR) resources. The virtual network router resources may be used to define connectivity between virtual networks by configuring the import and export of routing information between the respective routing instances for the virtual networks in the SDN architecture. A single network controller 24 may support multiple Kubernetes clusters, and a VNR thus allows connecting multiple virtual networks in a namespace, virtual networks in different namespaces, within a Kubernetes cluster, and across Kubernetes clusters. A VNR may also be extended to support virtual network connectivity across multiple instances of network controller 24. A VNR may alternatively be referred to herein as a Virtual Network Policy (VNP) or a virtual network topology. As shown in the example of FIG. 1, network controller 24 may maintain configuration data (e.g., configuration 30) representing virtual networks ("VNs") that represent policies and other configuration data for establishing VNs within data center 10 over the physical underlay network and/or virtual routers (e.g., virtual router 21 ("vRouter 21")).
A user, such as an administrator, may interact with the UI 50 of the network controller 24 to define a VN. In some cases, UI 50 represents a Graphical User Interface (GUI) that facilitates entry of configuration data defining a VN. In other examples, UI 50 may represent a Command Line Interface (CLI) or other type of interface. Assuming UI 50 represents a graphical user interface, an administrator may define a VN by arranging graphical elements representing different pods, such as pod 22, to associate the pods with the VN, with any of the VNs enabling communication between one or more pods assigned to the VN.
In this regard, the administrator may understand Kubernetes or another orchestration platform but may not fully understand the underlying infrastructure that supports the VNs. Some controller architectures, such as Contrail, may configure VNs based on network protocols that are similar, if not substantially similar, to routing protocols in conventional physical networks. For example, Contrail may utilize concepts from the Border Gateway Protocol (BGP), which is a routing protocol used to communicate routing information within, and sometimes between, so-called autonomous systems (ASes).
Different versions of BGP exist, such as internal BGP (iBGP) for transporting routing information within an AS and external BGP (eBGP) for transporting routing information between ASes. An AS may relate to the concept of a project within Contrail, which is also similar to a namespace in Kubernetes. In each instance of AS, project, and namespace, the AS, like the project and the namespace, may represent a collection of one or more networks (e.g., one or more of the VNs) that may share routing information and thereby facilitate interconnectivity between the networks (or, in this instance, the VNs).
To facilitate management of VNs, pods (or clusters), other physical and/or virtual components, and the like, network controller 24 may provide telemetry node 60, which telemetry node 60 interfaces with various telemetry exporters (TEs) deployed within SDN architecture 8, such as TE 61 deployed at virtual router 21. Although shown as including a single TE 61, network controller 24 may deploy TEs throughout SDN architecture 8, such as at various servers 12 (as shown in the example of FIG. 1, where TE 61 is deployed within virtual router 21), TOR switches 16, chassis switches 18, orchestrator 23, and so forth.
The TEs, including TE 61, may obtain different forms of metric data. For example, a TE may obtain system logs (e.g., system log messages regarding informational and debug conditions) and object logs (e.g., object log messages representing records of changes made to system objects, such as VMs, VNs, service instances, virtual routers, BGP peers, routing instances, etc.). A TE may also obtain trace messages defining records of activities collected locally by software components and sent to the analytics node (possibly only on demand), statistics related to flows, CPU and memory usage, etc., and metrics defined as time-series data of labeled key-value pairs.
The TE may output all such metric data back to telemetry node 60 for viewing via, for example, UI 50, where the metric data is shown as MD 64A-64N ("MD 64"). An administrator or other network operator/user may view MD 64 to better understand and manage the operation of virtual and/or physical components of SDN architecture 8, perform troubleshooting and/or debugging of virtual and/or physical components of SDN architecture 8, and the like.
Considering the complexity of SDN architecture 8 in terms of the various abstractions involved, such as physical underlay networks, virtual overlay networks, and virtual routers, a large amount of MD 64 may be provided to facilitate a better understanding of how SDN architecture 8 operates. In some respects, such MD 64 may enable a network operator (or, in other operations, a network administrator) to understand how the network operates. While this MD 64 is valuable for troubleshooting failures in network operation and gaining insight into the operation of SDN architecture 8, collecting and transmitting (or, in other words, sourcing) MD 64 may require a significant amount of network resources to deliver MD 64 from the TEs to telemetry node 60, while consuming underlying hardware resources (e.g., processor cycles, memory bus bandwidth, etc., and the related power of servers 12 executing the TEs) to collect MD 64.
In accordance with aspects of the techniques described in this disclosure, telemetry node 60 may provide efficient collection and aggregation of MD 64 in SDN architecture 8. As described above, network controller 24 may implement telemetry node 60 configured to provide an abstraction referred to as a metric set (MG, shown as MGs 62A-62N, "MGs 62") that facilitates both low and high granularity in allowing the collection of only a subset of MD 64. Rather than indiscriminately collecting all metric data and outputting all possible metric data, telemetry node 60 may define one or more MGs 62, where each MG 62 may define a subset of all possible metric data (subset here referring to a non-zero subset of fewer than all metrics, rather than the mathematical abstraction in which a subset may include the empty set or all metrics).
Telemetry node 60 may provide an application programming interface (API) server through which requests to define MGs 62 are received, and MGs 62 may be independently enabled or disabled. In other words, each MG 62 operates at a low level of granularity to enable or disable a respective subset of the metric data. Within each MG 62, the API server may also receive requests to enable or disable individual metric data (meaning, for a particular metric) within the subset of metric data defined by each MG 62. Although described as enabling or disabling individual metric data for a particular metric, in some examples the API server may only enable or disable a set of metrics (corresponding to a particular non-zero subset of all available metrics). The network operator may then interface with telemetry node 60, e.g., via UI 50, to select one or more MGs 62 to enable or disable the corresponding subsets of metric data defined by MGs 62, where such MGs 62 may be arranged (possibly hierarchically) according to various topics (e.g., Border Gateway Protocol (BGP), Internet Protocol version 4 (IPv4), IPv6, virtual router traffic, multiprotocol label switching virtual private network (MVPN), etc.).
Telemetry node 60 may define MGs 62 as custom resources within the container orchestration platform, converting each of MGs 62 into a configuration map defining (e.g., as an array) the enabled metrics (while also potentially removing overlapping metrics to prevent redundant collection of MD 64). Telemetry node 60 may then interface with the identified telemetry exporter (e.g., TE 61) to configure TE 61, based on the telemetry exporter configuration data, to collect and export only the metrics that are enabled for collection.
In operation, telemetry node 60 may process a request (e.g., received from a network administrator via UI 50) to enable one of MGs 62, which defines a subset of one or more metrics from a plurality of different metrics, to be exported from one or more defined, logically related elements. Again, the term subset is not used herein in the strict mathematical sense, in which a subset may include zero up to all possible elements. Rather, the term subset is used to refer to one or more, but fewer than all, of the possible elements. MGs 62 may be predefined in the sense that MGs 62 are potentially hierarchically organized by topic to limit the collection and export of MD 64 according to defined topics (such as those listed above) that may be relevant to a particular SDN architecture or use case. A manufacturer or other low-level developer of network controller 24 may create MGs 62, and a network administrator may enable or disable MGs 62 via UI 50 (and may customize them by enabling and disabling various metrics within a given one of MGs 62).
Telemetry node 60 may convert the subset of one or more metrics into telemetry exporter configuration data (TECD) 63 based on the request to enable the metric set, where TECD 63 configures the telemetry exporter deployed at the one or more logically related elements (e.g., TE 61 deployed at server 12A) to export the subset of one or more metrics. TECD 63 may represent configuration data specific to TE 61, which may vary across different servers 12 and other underlying physical resources, as such physical resources may have a variety of different TEs deployed throughout SDN architecture 8. The request may identify a particular set of logically related elements (which may be referred to as a cluster conforming to a containerized application platform, such as a Kubernetes cluster), allowing telemetry node 60 to identify the type of TE 61 and generate a customized TECD 63 for that particular type of TE 61.
Because the request may identify the cluster and/or pod to which to direct TECD 63, telemetry node 60 may interface with TE 61 (in this example) via vRouter 21 associated with the cluster to configure TE 61, based on TECD 63, to export the subset of one or more metrics defined by the enabled one of MGs 62. In this regard, TE 61 may receive TECD 63 and, based on TECD 63, collect MD 64 corresponding only to the subset of one or more metrics defined by the enabled one of MGs 62. TE 61 may output, to telemetry node 60, metric data corresponding only to the subset of one or more metrics defined by the enabled one of MGs 62.
Telemetry node 60 may receive MD 64 for a particular TE, such as MD 64A from TE 61, and store MD 64A to a dedicated telemetry database (not shown in FIG. 1 for ease of illustration). MD 64A may represent a time series of key-value pairs representing the defined subset of one or more metrics over time, with the metric name (and/or identifier) as the key for the corresponding value. The network administrator may then interface with telemetry node 60 via UI 50 to review MD 64A.
In this way, the techniques may improve the operation of SDN architecture 8 by reducing resource consumption when collecting and exporting MD 64. Given that not all metric data is collected and exported, but only a selected subset, TE 61 may use fewer processor cycles, less memory bandwidth, and less associated power to collect the MD 64 (fewer than all metrics) associated with the subset of metrics. Further, TE 61 may output only the MD 64 representing the subset of metrics, which results in less consumption of the network bandwidth of SDN architecture 8, along with the processing resources, memory bandwidth, and associated power used to process metric data (which may also be referred to as telemetry data) within SDN architecture 8. Furthermore, telemetry node 60, which receives the exported MD 64, may process the exported MD 64 with fewer computational resources (again, processor cycles, memory bandwidth, and associated power), given that such MD 64 corresponds only to the enabled MGs 62.
Further, by defining MGs 62 using custom resources that facilitate abstraction of the underlying configuration data (e.g., TECD 63) used to define the subset of metrics for each categorized and/or hierarchically arranged MG 62, a network administrator may more easily interface with telemetry node 60 to customize the collection of MD 64. Because these network administrators may not have extensive experience with the container orchestration platform, the abstraction provided by MGs 62 may facilitate a more intuitive user interface with which to interact to customize the output of MD 64, which may result in fewer network administrator errors that would otherwise consume computing resources (such as those listed above).
Fig. 2 is another view and a more detailed block diagram illustrating components of SDN architecture 200 in accordance with the techniques of this disclosure. Configuration node 230, control node 232, user interface 244, and telemetry node 260 are shown as having their respective component microservices for implementing network controller 24 and SDN architecture 8 as a cloud native SDN architecture in this example. Each component microservice is deployed to a compute node.
Fig. 2 shows a single cluster divided into network controller 24, user interface 244, compute (servers 12), and telemetry node 260 features. Together, configuration node 230 and control node 232 form network controller 24, although network controller 24 may also include user interface 244 and telemetry node 260, as shown above in the example of FIG. 1.
Configuration node 230 may include a component microservice API server 300 (or "Kubernetes API server 300"; the corresponding controllers 406 are not shown in FIG. 3), a custom API server 301, a custom resource controller 302, and an SDN controller manager 303 (sometimes referred to as a "kube-manager" or "SDN kube-manager", where the orchestration platform of network controller 24 is Kubernetes). The Contrail-kube-manager is one example of SDN controller manager 303. Configuration node 230 extends the interface of API server 300 with custom API server 301 to form an aggregation layer supporting the data model of SDN architecture 200. SDN architecture 200 intents may be custom resources.
Control node 232 may include component microservices control 320 and coreDNS 322. Control 320 performs configuration distribution as well as route learning and distribution.
The compute nodes are represented by servers 12. Each compute node includes a virtual router agent 316, a virtual router forwarding component (vRouter) 318, and possibly a telemetry exporter (TE) 261. One or more or all of virtual router agent 316, vRouter 318, and TE 261 may be component microservices that logically form a virtual router (e.g., virtual router 21 shown in the example of FIG. 1). In general, virtual router agent 316 performs control-related functions. Virtual router agent 316 receives configuration data from control node 232 and converts the configuration data into forwarding information for vRouter 318.
Virtual router agent 316 may also perform firewall rule processing, set up flows for vRouter 318, and interface with orchestration plug-ins (CNI for Kubernetes and the Nova plug-in for OpenStack). Virtual router agent 316 generates routes when workloads (pods or VMs) are created on a compute node, and virtual router agent 316 exchanges such routes with control nodes 232 for distribution to other compute nodes (control nodes 232 use BGP to distribute routes between control nodes 232). Virtual router agent 316 also withdraws routes when workloads are terminated. vRouter 318 may support one or more forwarding modes, such as kernel mode, DPDK, SmartNIC offload, and so on. Depending on the particular orchestrator used and on whether the workloads are containers or virtual machines, the compute nodes may be Kubernetes worker/minion nodes or OpenStack nova compute nodes. TE 261 may represent an example of TE 61 shown in the example of FIG. 1, where TE 261 is configured to interface with server 12A, vRouter 318, and possibly virtual router agent 316 to collect the metrics configured by TECD 63, as described in more detail above.
One or more optional telemetry nodes 260 provide metrics, alarms, logging, and flow analytics. SDN architecture 200 telemetry leverages cloud-native monitoring services such as Prometheus, the Elasticsearch, Fluentd, Kibana (EFK) stack (and/or, in some examples, OpenSearch and OpenSearch Dashboards), and Influx TSDB. The SDN architecture component microservices of configuration node 230, control node 232, the compute nodes, user interface 244, and analytics nodes (not shown) may generate telemetry data. The telemetry data may be consumed by the services of telemetry node 260. Telemetry node 260 may expose REST endpoints to users and may support insights and event correlation.
The optional user interface 244 includes a web User Interface (UI) 306 and UI backend 308 services. In general, the user interface 244 provides configuration, monitoring, visualization, security, and troubleshooting for SDN architecture components.
Each of telemetry node 260, user interface 244, configuration node 230, control node 232, and servers 12/compute nodes may be considered an SDN architecture 200 node, in that each of these nodes is an entity that implements functionality of the configuration, control, or data plane, or of the UI and telemetry nodes. Node scale is configured during "setup", and SDN architecture 200 supports automatic scaling of SDN architecture 200 nodes using orchestration system operators, such as Kubernetes operators.
In the example of FIG. 2, telemetry node 260 includes API server 272, collector 274, and time-series database (TSDB) 276. Through a user interface, such as web user interface 306, API server 272 may receive requests to enable and/or disable one or more MGs 62. MGs 62 may be defined using YAML and may be preconfigured as described above. A partial list of MGs 62 defined using YAML is provided below.
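As a hedged sketch, one such metric set definition, consistent with the fields described in the following paragraph and using an assumed apiVersion, kind, name, and metric names, might resemble:

```yaml
# Hypothetical metric set custom resource; all names and values are illustrative only.
apiVersion: telemetry.example.net/v1
kind: MetricGroup                  # kind name is an assumption
metadata:
  name: controller-peer            # assumed name, per the "controller peer" example
spec:
  export: true                     # enables export of this metric set
  type: controller                 # type of metrics collected
  metrics:                         # individual metrics to export (names assumed)
    - bgp_peer_in_total_messages
    - bgp_peer_out_total_messages
    - bgp_peer_flap_count
```

The spec.export field in this sketch corresponds to the export specification discussed below; setting it to false would disable the entire metric set.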
In each instance of the example MGs 62 listed above, there is a header defining the apiVersion, a kind indicating that the YAML definition is for a metric set type, metadata for a name (e.g., controller peer), a specification ("spec") indicating whether export is true, a metric type indicating the type of metrics collected (which, in the example YAML definition listed directly above, is for the network controller), and a list of the individual metrics to export. API server 272 may then receive a request to enable output of one or more MGs 62, which may be selected by a network administrator via web UI 306, resulting in the request to enable the one or more MGs 62 being sent to telemetry node 260 via API server 272. As described above, SDN architecture configuration intents may be custom resources, including telemetry configuration requests to enable and/or disable MGs 62.
The request may configure telemetry node 260 to enable and/or disable one or more MGs 62 by setting the export specification to true or false. By default, all of MGs 62 may initially be enabled. Furthermore, although not explicitly shown in the above example of MGs 62 defined using YAML, each metric may include a metric-specific export setting that allows export to be enabled for individual metrics within a given one of MGs 62. Once export is enabled for one or more MGs 62, API server 272 may interface with collector 274 to generate TECD 63. TECD 63 may represent a configuration map containing a flat list of metrics.
When generating TECD 63, collector 274 may remove any redundant (or, in other words, duplicate) metrics that may exist in two or more enabled MGs 62, which results in TECD 63 defining only a single instance of each metric for collection and export, rather than configuring TE 261 to collect and export two or more instances of the same metric. That is, for example, when the subset of metrics defined by MG 62A overlaps with the subset of metrics defined by MG 62N, collector 274 may remove at least one overlapping metric from the subset of metrics defined by MG 62N to generate TECD 63.
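A minimal sketch of the resulting configuration map, assuming it is carried as a Kubernetes ConfigMap holding a flat, de-duplicated list of metric names (the object name, namespace, data key, and metric names are all assumptions), might be:

```yaml
# Hypothetical telemetry exporter configuration data (TECD) rendered as a ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: te-metrics-config        # assumed name
  namespace: telemetry           # assumed namespace
data:
  # Flat, de-duplicated list of metric names (assumed key and values).
  enabled_metrics: |
    bgp_peer_in_total_messages
    bgp_peer_out_total_messages
    bgp_peer_flap_count
    vrouter_in_packets
```

Only the de-duplicated list would need to reach TE 261; as noted above, the exact exporter-facing format may differ across the different types of TEs.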
Collector 274 may determine where to send TECD 63 based on the cluster name, as described above, selecting the TE associated with that cluster, which in this case is assumed to be TE 261. Collector 274 may interface with TE 261, providing TECD 63 to TE 261. TE 261 may receive TECD 63 and configure various exporter agents (not shown in the example of FIG. 2) to collect the subset of metrics defined by the enabled ones of MGs 62. The agents may periodically (e.g., every 30 seconds) collect the identified subset of metrics, reporting the metrics back to TE 261. TE 261, in response to receiving the subset of metrics, may output the subset of metrics back as key-value pairs, where the key identifies the metric and the value contains MD 64.
Collector 274 may receive MD 64 and store MD 64 to TSDB 276. As one example, TSDB 276 may represent a Prometheus server that facilitates efficient storage of time-series data. Collector 274 may continue to collect MD 64 in this periodic manner. As described above, if all MGs 62 are enabled, MD 64 may grow rapidly, which may place significant strain on the network and on the underlying physical resources. Allowing only selected MGs 62 to be output may reduce this strain on the network, particularly when only one or two of MGs 62 may be needed for any given use case.
Although telemetry node 260 is shown as a node separate from configuration node 230, telemetry node 260 may be implemented as a separate operator using various custom resources, including a metric set custom resource. Telemetry node 260 may act as a client of the container orchestration platform (e.g., the Kubernetes API) that acts as a controller for one or more custom resources (which may also include the metric set custom resources described throughout this disclosure), such as one of custom resource controllers 302 of configuration node 230. In this sense, API server 272 of telemetry node 260 may extend custom API server 301 (or form part of custom API server 301). As a custom controller, telemetry node 260 may perform the reconciliation shown in the example of FIG. 6, including a reconciler similar to reconciler 816 for adjusting the current state to a desired state, which in the context of a metric set involves configuring TE 261 to collect and output the metric data for the metric set.
Fig. 4 is a block diagram illustrating example components of an SDN architecture in accordance with the techniques of this disclosure. In this example, SDN architecture 400 extends and uses Kubernetes API server for network configuration objects that implement user intent for network configuration. In Kubernetes terminology, such configuration objects are referred to as custom resources and are simply referred to as objects when persisted in the SDN architecture. The configuration object is primarily a user intent (e.g., virtual network, BGPaaS, network policy, service chain, etc.).
SDN architecture 400 configuration node 230 may use a Kubernetes API server for configuring objects. In Kubernetes terminology, these are referred to as custom resources.
Kubernetes provides two ways to add custom resources to a cluster:
Custom Resource Definitions (CRDs) are simple and can be created without any programming.
API aggregation requires programming, but allows more control over API behavior, such as how data is stored and how conversions between API versions are performed.
An aggregated API is a subordinate API server that sits behind the primary API server, which acts as a proxy. This arrangement is known as API Aggregation (AA). To users, it simply appears that the Kubernetes API is extended. CRDs allow users to create new kinds of resources, such as MGs 62, without adding another API server. Regardless of how they are installed, the new resources are called Custom Resources (CRs) to distinguish them from native Kubernetes resources (e.g., pods). CRDs may be used for the initial configuration prototype. The architecture may use an API server builder library to implement the aggregated API. An API server builder is a collection of libraries and tools for building native Kubernetes aggregation extensions.
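For illustration, a CRD that registers a hypothetical metric set resource with the cluster might be sketched as follows; the API group, kind, and schema fields are assumptions rather than the actual definitions used by the SDN architecture.

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: metricgroups.telemetry.example.net   # assumed group and plural
spec:
  group: telemetry.example.net               # assumed API group
  scope: Namespaced
  names:
    kind: MetricGroup
    plural: metricgroups
    singular: metricgroup
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                export:        # whether the metric set is exported
                  type: boolean
                metrics:       # flat list of individual metric names
                  type: array
                  items:
                    type: string
```

A resource of this kind could alternatively be served through the aggregated custom API server 301 described above rather than through a CRD.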
Typically, each resource in the Kubernetes API requires code to handle REST requests and to manage persistent storage of objects. The primary Kubernetes API server 300 (implemented with API server microservices 300A-300J) handles native resources and may also generically handle custom resources through CRDs. Aggregation API 402 represents an aggregation layer that extends Kubernetes API server 300 to allow specialized implementations for custom resources by writing and deploying custom API server 301 (using custom API server microservices 301A-301M). The primary API server 300 delegates requests for custom resources to custom API server 301, thereby making such resources available to all of its clients.
In this way, API server 300 (e.g., kube-apiserver) receives Kubernetes configuration objects, native objects (pods, services), and custom resources. Custom resources for SDN architecture 400 may include configuration objects that, when an intended state of the configuration object in SDN architecture 400 is realized, implement an intended network configuration of SDN architecture 400, including implementing each VNR 52 as one or more import policies and/or one or more export policies along with a common routing target (and routing instance). As described above, implementing MG 62 within SDN architecture 400 may result in enabling and disabling the collection and export of individual metrics by TE 261.
In this regard, the custom resources may correspond to configuration schemas traditionally defined for network configuration but, in accordance with the techniques of this disclosure, extended to be operable through aggregation API 402. Such custom resources are alternatively referred to herein as "custom resources for SDN architecture configuration." These may include VNs, bgp-as-a-service (BGPaaS), subnets, virtual routers, service instances, projects, physical interfaces, logical interfaces, nodes, network IPAMs, floating IPs, alarms, alias IPs, access control lists, firewall policies, firewall rules, network policies, routing targets, and routing instances. Custom resources for SDN architecture configuration may correspond to configuration objects traditionally exposed by SDN controllers, but in accordance with the techniques described herein, the configuration objects are exposed as custom resources and are consolidated with Kubernetes native/built-in resources to support a unified intent model exposed by aggregation API 402, which is implemented by Kubernetes controllers 406A-406N and by custom resource controllers 302 (shown in FIG. 3 as having component microservices 302A-302L), which reconcile the actual state of the computing infrastructure, including network elements, with the intended state.
Given the unified nature of exposing custom resources that are consolidated with Kubernetes native/built-in resources, a Kubernetes administrator (or other Kubernetes user) may define MGs 62 using familiar Kubernetes semantics, which may then be converted into complex policies detailing the collection and export of MD 64 without requiring much, if any, understanding of how telemetry node 260 and telemetry exporter 261 operate to collect and export MD 64. In this way, aspects of the techniques may facilitate a more uniform user experience that may result in fewer misconfigurations and less trial-and-error, which may improve the execution of SDN architecture 400 itself (in terms of utilizing fewer processing cycles, less memory, less bandwidth, etc., and the associated power).
The aggregation layer of API server 300 sends API custom resources to their corresponding registered custom API servers 301. There may be multiple custom API servers/custom resource controllers to support different kinds of custom resources. Custom API server 301 handles custom resources for SDN architecture configuration and writes to configuration store(s) 304, which may be etcd. Custom API server 301 may host and expose an SDN controller identifier allocation service that may be required by custom resource controller 302.
Custom resource controller(s) 302 start applying business logic to achieve the user intent given the user intent configuration. The business logic is implemented as a reconciliation loop. FIG. 6 is a block diagram illustrating an example of a custom controller for custom resources for SDN architecture configuration in accordance with the techniques of this disclosure. Custom controller 814 may represent an example instance of custom resource controller 302. In the example shown in FIG. 6, custom controller 814 may be associated with custom resource 818. Custom resource 818 may be any custom resource for SDN architecture configuration. Custom controller 814 may include a reconciler 816 that includes logic to execute a reconciliation loop, in which custom controller 814 observes 834 (e.g., monitors) the current state 832 of custom resource 818. In response to determining that desired state 836 does not match current state 832, reconciler 816 may perform actions to adjust 838 the state of the custom resource such that current state 832 matches desired state 836. API server 300 may receive a request and relay it to custom API server 301 to change the current state 832 of custom resource 818 to desired state 836.
In the case where the API request is a create request for a custom resource, reconciler 816 may act on the create event for the instance data of the custom resource. Reconciler 816 may create instance data for the custom resources on which the requested custom resource depends. As an example, an edge node custom resource may depend on a virtual network custom resource, a virtual interface custom resource, and an IP address custom resource. In this example, when reconciler 816 receives a create event on the edge node custom resource, reconciler 816 may also create the custom resources on which the edge node custom resource depends, such as the virtual network custom resource, the virtual interface custom resource, and the IP address custom resource.
By default, custom resource controller 302 runs in an active-passive mode and uses leader election to achieve consistency. When a controller pod starts, it attempts to create a ConfigMap resource in Kubernetes using a specified key. If the creation succeeds, that pod becomes the leader and begins processing reconciliation requests; otherwise, it blocks, repeatedly trying to create the ConfigMap in a loop.
The configuration plane implemented by configuration node 230 has high availability. Configuration node 230 may be based on Kubernetes, including the kube-apiserver service (e.g., API server 300) and the storage backend (e.g., configuration store(s) 304). Effectively, aggregation API 402 implemented by configuration node 230 acts as the front end for the control plane implemented by control node 232. The main implementation of API server 300 is kube-apiserver, which is designed to scale horizontally by deploying more instances. As shown, several instances of API server 300 may be run to load balance API requests and processing.
Configuration store(s) 304 may be implemented as etcd. Etcd is a consistent and highly available key-value store used as the Kubernetes backing store for cluster data.
In the example of fig. 4, servers 12 of SDN architecture 400 each include orchestration agent 420 and a containerized (or "cloud-native") routing protocol daemon 324. These components of SDN architecture 400 are described in further detail below.
SDN controller manager 303 may operate as an interface between Kubernetes core resources (services, namespaces, pods, network policies, network attachment definitions) and the extended SDN architecture resources (virtual networks, routing instances, etc.). SDN controller manager 303 watches the Kubernetes API for changes on both the Kubernetes core resources and the custom resources for SDN architecture configuration and, as a result, may perform CRUD operations on the relevant resources.
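As a hedged illustration of one such core resource, a network attachment definition in the widely used k8s.cni.cncf.io/v1 format might resemble the following; the network name and the embedded CNI configuration are assumptions, and this disclosure does not specify that this exact schema is consumed.

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: blue-vn                  # assumed virtual network name
spec:
  # Embedded CNI configuration (JSON); plugin type and names are illustrative only.
  config: |
    {
      "cniVersion": "0.4.0",
      "name": "blue-vn",
      "type": "contrail-k8s-cni"
    }
```

The sketch is only meant to show the kind of core resource that SDN controller manager 303 bridges to the extended SDN architecture resources.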
In some examples, SDN controller manager 303 is a set of one or more Kubernetes custom controllers. In some examples, in a single cluster or multi-cluster deployment, SDN controller manager 303 may run on the Kubernetes cluster it manages.
SDN controller manager 303 listens for create, delete, and update events on the following Kubernetes objects:
· Pod
· Service
· NodePort
· Ingress
· Endpoint
· Namespace
· Deployment
· Network policy
When these events are generated, SDN controller manager 303 creates the appropriate SDN architecture object, which in turn is defined as a custom resource for SDN architecture configuration. In response to detecting an event on an instance of a custom resource, whether instantiated by SDN controller manager 303 and/or by custom API server 301, control node 232 obtains configuration data for the instance of the custom resource and configures a corresponding instance of a configuration object in SDN architecture 400.
For example, SDN controller manager 303 monitors Pod creation events and in response may create the following SDN architecture objects: virtual machines (workload/pod), virtual machine interfaces (virtual network interfaces), and InstanceIp (IP address). In this case, control node 232 may then instantiate the SDN architecture object in the selected compute node.
As an example, based on the watches, control node 232A may detect an event on an instance of a first custom resource exposed by custom API server 301A, where the first custom resource is for configuring some aspect of SDN architecture system 400 and corresponds to a type of configuration object of SDN architecture system 400. For example, the type of configuration object may be a firewall rule corresponding to the first custom resource. In response to the event, control node 232A may obtain configuration data for the firewall rule instance (e.g., the firewall rule specification) and provision the firewall rule in the virtual router of server 12A. Configuration node 230 and control node 232 may perform similar operations for other custom resources having corresponding types of configuration objects for the SDN architecture, such as virtual networks, virtual network routers, bgp-as-a-service (BGPaaS), subnets, virtual routers, service instances, projects, physical interfaces, logical interfaces, nodes, network IPAMs, floating IPs, alarms, alias IPs, access control lists, firewall policies, firewall rules, network policies, routing targets, routing instances, and the like.
FIG. 5 is a block diagram of an example computing device in accordance with the techniques described in this disclosure. Computing device 500 of FIG. 5 may represent a real or virtual server, may represent an example instance of any of servers 12, and may be referred to as a compute node, master/minion node, or host. In this example, computing device 500 includes a bus 542 that couples hardware components of the hardware environment of computing device 500. Bus 542 couples a network interface card (NIC) 530, a storage disk 546, and one or more microprocessors 510 (hereinafter "microprocessor 510"). NIC 530 may be SR-IOV capable. In some cases, a front-side bus may couple microprocessor 510 and memory device 524. In some examples, bus 542 may couple memory device 524, microprocessor 510, and NIC 530. Bus 542 may represent a Peripheral Component Interconnect (PCI) express (PCIe) bus. In some examples, a direct memory access (DMA) controller may control DMA transfers between components coupled to bus 542. In some examples, components coupled to bus 542 control DMA transfers between components coupled to bus 542.
Microprocessor 510 may include one or more processors, each including a separate execution unit to execute instructions conforming to an instruction set architecture, the instructions stored in a storage medium. The execution units may be implemented as separate Integrated Circuits (ICs) or may be combined within one or more multi-core processors (or "multi-core" processors), each implemented using a single IC (i.e., a chip multiprocessor).
Disk 546 represents computer-readable storage media including volatile and/or nonvolatile, removable and/or non-removable media implemented in any method or technology for storage of information such as processor-readable instructions, data structures, program modules or other data. Computer-readable storage media includes, but is not limited to, random Access Memory (RAM), read Only Memory (ROM), EEPROM, flash memory, CD-ROM, digital Versatile Disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by microprocessor 510.
Main memory 524 includes one or more computer-readable storage media, which may include Random Access Memory (RAM), such as Dynamic RAM (DRAM) in various forms, e.g., DDR2/DDR3 SDRAM, or Static RAM (SRAM), flash memory, or any other form of fixed or removable storage medium that can be used to carry or store desired program code and program data in the form of instructions or data structures and that can be accessed by a computer. Main memory 524 provides a physical address space comprised of addressable memory locations.
A Network Interface Card (NIC) 530 includes one or more interfaces 532 configured to exchange packets using links of an underlying physical network. Interface 532 may include a port interface card having one or more network ports. NIC 530 may also include, for example, on-card memory to store packet data. Direct memory access transmissions between NIC 530 and other devices coupled to bus 542 may be read from and written to NIC memory.
Memory 524, NIC 530, storage disk 546, and microprocessor 510 may provide an operating environment for a software stack that includes an operating system kernel 580 executing in kernel space. Kernel 580 may represent, for example, a Linux, Berkeley Software Distribution (BSD), or another Unix-variant kernel, or a Windows server operating system kernel available from Microsoft Corporation. In some instances, the operating system may execute a hypervisor and one or more virtual machines managed by the hypervisor. Example hypervisors include Kernel-based Virtual Machine (KVM) for the Linux kernel, Xen, ESXi available from VMware, Windows Hyper-V available from Microsoft, and other open-source and proprietary hypervisors. The term hypervisor may encompass a virtual machine manager (VMM). An operating system including kernel 580 provides an execution environment for one or more processes in user space 545.
Kernel 580 includes a physical driver 525 for using network interface card 530. Network interface card 530 may also implement SR-IOV to enable sharing of physical network functions (I/O) among one or more virtual execution elements, such as containers 529A or one or more virtual machines (not shown in FIG. 5). Shared virtual devices, such as virtual functions, may provide dedicated resources such that each virtual execution element may access dedicated resources of NIC 530, which therefore appears to each virtual execution element as a dedicated NIC. Virtual functions may represent lightweight PCIe functions that share physical resources with the physical function used by physical driver 525 and with other virtual functions. For an SR-IOV-capable NIC 530, NIC 530 may have thousands of virtual functions available according to the SR-IOV standard, but for I/O-intensive applications the number of configured virtual functions is typically much smaller.
The computing device 500 may be coupled to a physical network switch fabric that includes an overlay network that extends the switch fabric from a physical switch to software or "virtual" routers, including virtual router 506, that are coupled to physical servers of the switch fabric. The virtual router may be a process or thread or component thereof executed by a physical server (e.g., server 12 of fig. 1) that dynamically creates and manages one or more virtual networks that are available for communication between virtual network endpoints. In one example, the virtual router implements each virtual network using an overlay network that provides the ability to decouple the virtual address of an endpoint from the physical address (e.g., IP address) of the server on which the endpoint is executing.
Each virtual network may use its own addressing and security scheme and may be considered orthogonal to the physical network and its addressing scheme. Various techniques may be used to transport packets within and across virtual networks over the physical network. The term "virtual router" as used herein may include an Open vSwitch (OVS), an OVS bridge, a Linux bridge, a Docker bridge, or another device and/or software that resides on a host device and performs switching, bridging, or routing of packets among virtual network endpoints of one or more virtual networks, where the virtual network endpoints are hosted by one or more of servers 12. In the example computing device 500 of FIG. 5, virtual router 506 executes within user space as a DPDK-based virtual router, but in various implementations virtual router 506 may execute within a hypervisor, a host operating system, a host application, or a virtual machine.
Virtual router 506 may replace and subsume the virtual routing/bridging functionality of the Linux bridge/OVS module that is commonly used for Kubernetes deployments of pods 502. Virtual router 506 may perform bridging (e.g., E-VPN) and routing (e.g., L3VPN, IP-VPNs) for virtual networks. Virtual router 506 may perform networking services such as applying security policies, NAT, multicast, mirroring, and load balancing.
Virtual router 506 may be implemented as a kernel module or as a user space DPDK process (virtual router 506 is shown here in user space 545). Virtual router agent 514 may also execute in user space. In the example computing device 500, virtual router 506 executes within user space as a DPDK-based virtual router, but in various implementations virtual router 506 may execute within a hypervisor, a host operating system, a host application, or a virtual machine. Virtual router agent 514 connects to network controller 24 using a channel for downloading configuration and forwarding information. Virtual router agent 514 programs this forwarding state to the virtual router data (or "forwarding") plane represented by virtual router 506. Virtual router 506 and virtual router agent 514 may be processes. Virtual router 506 and virtual router agent 514 are containerized/cloud-native.
Virtual router 506 may be multi-threaded and execute on one or more processor cores. Virtual router 506 may include multiple queues. Virtual router 506 may implement a packet processing pipeline. Depending on the operations to be applied to the packet, the virtual router agent 514 may stitch the pipeline from the simplest to the most complex. Virtual router 506 may maintain multiple forwarding base instances. Virtual router 506 may use RCU (read copy update) locks to access and update tables.
To send packets to other computing nodes or switches, virtual router 506 uses one or more physical interfaces 532. In general, virtual router 506 exchanges overlay packets with a workload, such as a VM or pod 502. Virtual router 506 has a plurality of virtual network interfaces (e.g., vifs). These interfaces may include a kernel interface vhost0 for exchanging packets with the host operating system; interface pkt0 with virtual router agent 514 to obtain forwarding state from the network controller and send exception packets. There may be one or more virtual network interfaces corresponding to one or more physical network interfaces 532. Other virtual network interfaces of virtual router 506 are used to exchange packets with the workload.
In a kernel-based deployment of virtual router 506 (not shown), virtual router 506 is installed as a kernel module inside the operating system. Virtual router 506 registers itself with the TCP/IP stack to receive packets from any of the desired operating system interfaces. The interfaces may be bond, physical, tap (for VMs), veth (for containers), etc. In this mode, virtual router 506 relies on the operating system to send and receive packets from the different interfaces. For example, the operating system may expose a tap interface backed by a vhost-net driver to communicate with VMs. Once virtual router 506 registers for packets from the tap interface, the TCP/IP stack sends all packets to it. Virtual router 506 sends packets via the operating system interfaces. In addition, NIC queues (physical or virtual) are handled by the operating system. Packet processing may operate in interrupt mode, which generates interrupts and may lead to frequent context switches. When there is a high packet rate, the overhead associated with frequent interrupts and context switching may overwhelm the operating system and lead to poor performance.
In a DPDK-based deployment of virtual router 506 (shown in FIG. 5), virtual router 506 is installed as a user space 545 application that is linked to the DPDK library. This may lead to faster performance than a kernel-based deployment, especially in the presence of high packet rates. The physical interfaces 532 are used by the poll mode drivers (PMDs) of DPDK rather than by the kernel's interrupt-based drivers. The registers of physical interfaces 532 may be exposed in user space 545 so as to be accessible to the PMDs; a physical interface 532 bound in this way is no longer managed by or visible to the host operating system, and the DPDK-based virtual router 506 manages the physical interface 532. This includes packet polling, packet processing, and packet forwarding. In other words, user packet processing steps are performed by the virtual router 506 DPDK data plane. The nature of this "polling mode" makes virtual router 506 DPDK data plane packet processing/forwarding much more efficient, as compared to the interrupt mode, when the packet rate is high. There are comparatively few interrupts and context switches during packet I/O compared to the kernel-mode virtual router 506, and in some cases interrupts and context switches during packet I/O may be avoided entirely.
In general, each of pods 502A-502B may be assigned one or more virtual network addresses for use within respective virtual networks, where each of the virtual networks may be associated with a different virtual subnet provided by virtual router 506. For example, pod 502B may be assigned its own virtual layer 3 (L3) IP address for sending and receiving communications, but pod 502B may not be aware of the IP address of the computing device 500 on which pod 502B is executing. The virtual network address may thus differ from the logical address of the underlying physical computer system (e.g., computing device 500).
Computing device 500 includes a virtual router agent 514 that controls the overlay of virtual networks for computing device 500 and that coordinates the routing of data packets within computing device 500. In general, virtual router agent 514 communicates with network controller 24 for the virtualization infrastructure, which generates commands to create virtual networks and configure network virtualization endpoints, such as computing device 500 and, more specifically, virtual router 506 and virtual network interface 212. By configuring virtual router 506 based on information received from network controller 24, virtual router agent 514 may support configuring network isolation, policy-based security, a gateway, source network address translation (SNAT), a load balancer, and service chaining capability for orchestration.
In one example, network packets (e.g., layer three (L3) IP packets or layer two (L2) Ethernet packets) generated or consumed by containers 529A-529B within the virtual network domain may be encapsulated in another packet (e.g., another IP or Ethernet packet) that is transported by the physical network. A packet transported in a virtual network may be referred to herein as an "inner packet," while the physical network packet may be referred to herein as an "outer packet" or a "tunnel packet." Virtual router 506 may perform encapsulation and/or decapsulation of virtual network packets within physical network packets. This functionality is referred to herein as tunneling and may be used to create one or more overlay networks. Besides IP-in-IP, other example tunneling protocols that may be used include IP over Generic Routing Encapsulation (GRE), VXLAN, Multiprotocol Label Switching (MPLS) over GRE, MPLS over User Datagram Protocol (UDP), and the like. Virtual router 506 performs tunnel encapsulation/decapsulation for packets sourced by/destined to any container of pod 502, and virtual router 506 exchanges packets with pod 502 via bus 542 and/or a bridge of NIC 530.
As described above, the network controller 24 may provide a logically centralized controller to facilitate the operation of one or more virtual networks. The network controller 24 may, for example, maintain a routing information base, such as one or more routing tables storing routing information for the physical network and one or more overlay networks. Virtual router 506 implements one or more virtual routing and forwarding instances (VRFs), such as VRF 222A, for respective virtual networks, wherein virtual router 506 operates as a respective tunnel endpoint. Typically, each VRF stores forwarding information for the corresponding virtual network and identifies where the data packet is to be forwarded and whether the packet is to be encapsulated in a tunneling protocol, e.g., with a tunneling header that may include one or more headers for different layers of the virtual network protocol stack. Each VRF may include a network forwarding table that stores routing and forwarding information for the virtual network.
NIC 530 may receive the tunnel packet. Virtual router 506 processes the tunnel packets to determine the virtual network of source and destination endpoints of the internal packets from the tunnel encapsulation header. Virtual router 506 may strip the layer 2 header and tunnel encapsulation header to internally forward only the inner packet. The tunnel encapsulation header may include a virtual network identifier, such as a vxlan label or MPLS label, that indicates the virtual network (e.g., the virtual network corresponding to VRF 222A). VRF 222A may include forwarding information for the internal packet. For example, VRF 222A may map the destination layer 3 address of the inner packet to virtual network interface 212. In response, VRF 222A forwards the internal packet to pod 502A via virtual network interface 212.
Container 529A may also originate an inner packet as a source virtual network endpoint. For example, container 529A may generate a layer 3 inner packet destined for a destination virtual network endpoint that is executed by another computing device (i.e., not computing device 500) or for another one of the containers. Container 529A may send the layer 3 inner packet to virtual router 506 via the virtual network interface attached to VRF 222A.
Virtual router 506 receives the inner packet and the layer 2 header and determines the virtual network for the inner packet. Virtual router 506 may determine the virtual network using any of the virtual network interface implementation techniques described above (e.g., macvlan, veth, etc.). Virtual router 506 uses VRF 222A, corresponding to the virtual network for the inner packet, to generate an outer header for the inner packet that includes an outer IP header for the overlay tunnel and a tunnel encapsulation header identifying the virtual network. Virtual router 506 encapsulates the inner packet with the outer header. Virtual router 506 may encapsulate the tunnel packet with a new layer 2 header having a destination layer 2 address associated with a device external to the computing device 500 (e.g., one of the TOR switches 16 or servers 12). If the destination is external to computing device 500, virtual router 506 outputs the tunnel packet with the new layer 2 header to NIC 530 using physical function 221. NIC 530 outputs the packet on an outbound interface. If the destination is another virtual network endpoint executing on the computing device 500, virtual router 506 routes the packet to the appropriate one of the virtual network interfaces 212, 213.
In some examples, a controller of computing device 500 (e.g., network controller 24 of fig. 1) configures a default route in each pod 502 to cause virtual machine 224 to use virtual router 506 as an initial next hop for an outbound packet. In some examples, NIC 530 is configured with one or more forwarding rules to cause all packets received from virtual machine 224 to be switched to virtual router 506.
Pod 502A includes one or more application containers 529A. Pod 502B includes an instance of a containerized routing protocol daemon (cRPD) 560. Container platform 588 includes container engine 590, orchestration agent 592, service agent 593, and CNI 570.
Container engine 590 includes code executable by the microprocessor 510. Container engine 590 may be one or more computer processes. Container engine 590 runs containerized applications in the form of containers 529A-529B. Container engine 590 may represent Docker, rkt, or another container engine for managing containers. In general, container engine 590 receives requests and manages objects such as images, containers, networks, and volumes. An image is a template with instructions for creating a container. A container is an executable instance of an image. Based on instructions from orchestration agent 592, container engine 590 may obtain images and instantiate them as executable containers in pods 502A-502B.
Service agent 593 includes code executable by the microprocessor 510. Service agent 593 may be one or more computer processes. Service agent 593 monitors for the addition and removal of service and endpoint objects, and it maintains the network configuration of the computing device 500 to ensure communication among containers, e.g., using services. Service agent 593 may also manage iptables to capture traffic to a service's virtual IP address and port and to redirect the traffic to the proxy port of a pod backing the service. Service agent 593 may represent a kube-proxy for a minion node of a Kubernetes cluster. In some examples, container platform 588 does not include a service agent 593, or the service agent 593 is disabled in favor of configuration of virtual router 506 and pods 502 by CNI 570.
Orchestration agent 592 includes code executable by the microprocessor 510. Orchestration agent 592 may be one or more computer processes. Orchestration agent 592 may represent a kubelet for a minion node of a Kubernetes cluster. Orchestration agent 592 is an agent of an orchestrator (e.g., orchestrator 23 of fig. 1) that receives container specification data for containers and ensures that the containers execute on computing device 500. The container specification data may be sent from orchestrator 23 to orchestration agent 592 in the form of a manifest file, or received indirectly via a command line interface, an HTTP endpoint, or an HTTP server. The container specification data may be a pod specification for one of the pods 502 of containers (e.g., a PodSpec, a YAML (yet another markup language) or JSON object that describes a pod). Based on the container specification data, orchestration agent 592 directs container engine 590 to obtain and instantiate the container images for containers 529, for execution of containers 529 by computing device 500.
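As a minimal sketch (not taken from the disclosure above), the container specification data received by orchestration agent 592 for a pod such as pod 502A might resemble the following Kubernetes pod manifest; the pod name, container name, image reference, and network annotation shown here are hypothetical:

    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-502a                          # hypothetical pod name
      annotations:
        # hypothetical annotation requesting a virtual network for the pod
        example.net/virtual-network: vn-blue
    spec:
      containers:
      - name: container-529a                  # hypothetical container name
        image: registry.example.com/app:1.0   # hypothetical image reference
        ports:
        - containerPort: 8080

Based on such a specification, orchestration agent 592 would direct container engine 590 to pull the listed image and instantiate the container, and would invoke CNI 570 to configure the pod's virtual network interface as described below.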
Orchestration agent 592 instantiates or otherwise invokes CNI 570 to configure one or more virtual network interfaces for each pod 502. For example, orchestration agent 592 receives container specification data for pod 502A and directs container engine 590 to create pod 502A with containers 529A based on the container specification data for pod 502A. Orchestration agent 592 also invokes CNI 570 to configure, for pod 502A, a virtual network interface for the virtual network corresponding to VRF 222A. In this example, pod 502A is a virtual network endpoint of the virtual network corresponding to VRF 222A.
CNI 570 may obtain interface configuration data for configuring virtual network interfaces for pods 502. Virtual router agent 514 operates as a virtual network control plane module for enabling network controller 24 to configure virtual router 506. Unlike the orchestration control plane (including the container platform 588 for minion nodes and the master node(s), e.g., orchestrator 23), which manages the provisioning, scheduling, and managing of virtual execution elements, the virtual network control plane (including network controller 24 and virtual router agents 514 for minion nodes) manages the configuration of virtual networks implemented in the data plane in part by the virtual routers 506 of the minion nodes. Virtual router agent 514 communicates, to CNI 570, interface configuration data for virtual network interfaces to enable an orchestration control plane element (i.e., CNI 570) to configure the virtual network interfaces according to the configuration state determined by network controller 24, thus bridging the gap between the orchestration control plane and the virtual network control plane. In addition, this may enable CNI 570 to obtain interface configuration data for, and to configure, multiple virtual network interfaces for a pod, which may reduce the communication and resource overhead inherent in invoking a separate CNI 570 for configuring each virtual network interface.
The containerized routing protocol daemon is described in U.S. application Ser. No. 17/649,632, filed on 1/2/2022, the entire contents of which are incorporated herein by reference.
As further shown in the example of fig. 4, TE 561 may represent one example of TEs 61 and/or 261. Although not specifically shown in the example of fig. 4, virtual router 506, virtual router agent 514, and TE 561 may be implemented in separate pods similar to pods 502A and 502B, where such pods may generally represent an abstraction implementing a plurality of different containers (one for each of virtual router 506, virtual router agent 514, and TE 561). TE 561 may receive TECD 63 to configure individual agents to collect MD 64. As described above, TECD 63 may represent a list of metrics that are allowed to be collected, which list has been translated from the request to enable individual MGs 62. These agents may periodically (although such collection need not be periodic) poll virtual router 506 and the underlying physical resources for MD 64, which is then exported back to telemetry node 60/260.
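A minimal sketch of TECD 63 as such an allow-list follows; the keyed-by-component layout and the metric names are assumptions chosen for illustration and are not defined by this disclosure:

    # hypothetical allow-list form of TECD 63 received by TE 561
    vrouter:
      - vrouter_in_bytes
      - vrouter_out_bytes
    physical:
      - cpu_utilization
      - memory_utilization

Metrics that do not appear in the allow-list would simply not be exported, even if the agents are capable of collecting them.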
Fig. 5A is a block diagram illustrating a control/routing plane for underlay network and overlay network configuration using an SDN architecture, in accordance with the techniques of the present disclosure. Fig. 5B is a block diagram illustrating a configured virtual network for connecting pods using a tunnel configured in the underlay network, in accordance with the techniques of the present disclosure.
The network controller 24 for the SDN architecture may use a distributed or centralized routing plane architecture. The SDN architecture may use a containerized routing protocol daemon (process).
From the perspective of network signaling, the routing plane may operate according to a distributed model, in which a cRPD runs on each compute node in the cluster. This essentially means that the intelligence is built into the compute nodes and involves complex configuration at each node. A route reflector (RR) in this model may not make intelligent routing decisions, but instead acts as a relay to reflect routes between the nodes. A distributed containerized routing protocol daemon (cRPD) is a routing protocol process that may be used in which each compute node runs its own instance of the routing daemon. Meanwhile, a centralized cRPD master instance may act as an RR to relay routing information between the compute nodes. The routing and configuration intelligence is distributed across the nodes, with the RR at a central location.
Alternatively, the routing plane may operate according to a more centralized model, in which components of the network controller run centrally and absorb the intelligence needed to process configuration information, build the network topology, and program the forwarding plane into the virtual routers. The virtual router agent is a local, lightweight agent that processes the information programmed by the network controller. This design leads to more limited intelligence being needed at the compute nodes and tends to result in simpler configuration state.
The centralized control plane provides the following:
Allows the agent routing framework to be simpler and lighter weight. The complexity and limitations of BGP are hidden from the agent. The agent need not understand concepts such as route distinguishers, route targets, etc.
The agents exchange only prefixes and construct their forwarding information accordingly.
The control nodes do much more than routing. They build on the virtual network concept and can generate new routes using route replication and re-origination (e.g., to support features such as service chaining and inter-VN routing, among other use cases).
Builds broadcast, unknown unicast, and multicast (BUM) trees for optimal broadcast and multicast forwarding.
Note that the control plane still has a distributed nature for certain aspects. As a control plane supporting distributed functionality, it allows each local virtual router agent to publish its local routes and subscribe to configuration on an as-needed basis.
It makes sense to think about the control plane design from a tooling point of view (POV) and to use the tools at hand appropriately where they are most suitable. Consider the following set of advantages and disadvantages of contrail-bgp and cRPD.
The following functions may be provided by the cRPD or the control node of the network controller 24.
Routing daemons/processes
Both the control node and cRPD may act as a routing daemon implementing different protocols and having the ability to program routing information in the forwarding plane.
cRPD implements routing protocols with a rich routing stack that includes interior gateway protocols (IGPs) (e.g., intermediate system to intermediate system (IS-IS)), BGP-LU, BGP-CT, SR-MPLS/SRv6, bidirectional forwarding detection (BFD), path computation element protocol (PCEP), and so on. It can also be deployed to provide control plane-only services, such as a route reflector, and is popular in internet routing use cases because of these capabilities.
Control node 232 also implements routing protocols, but is primarily BGP-based. Control node 232 understands overlay networking. Control node 232 provides a rich feature set in overlay virtualization and caters to SDN use cases. Overlay features, such as virtualization (using the virtual network abstraction) and service chaining, are very popular among telecom and cloud providers. cRPD may not include support for such overlay functionality in some cases. However, the rich feature set of cRPD provides strong support for the underlay network.
Network orchestration/automation
The routing functionality is only one part of control node 232. An integral part of overlay networking is orchestration. In addition to providing overlay routing, control nodes 232 help in modeling orchestration functionality and provide network automation. Central to the orchestration capabilities of control nodes 232 is the ability to model network virtualization using abstractions based on virtual networks (and related objects), including the VNRs described above. Control node 232 interfaces with configuration node 230 to relay configuration information to both the control plane and the data plane. Control nodes 232 also assist in building multicast layer 2 and layer 3 overlay trees. For example, a control node may build a virtual topology of the cluster it serves to accomplish this. cRPD typically does not include such orchestration capabilities.
High availability and horizontal scalability
The control node design is more centralized, whereas cRPD is more distributed. There is an instance of cRPD running on each compute node. Control nodes 232, on the other hand, do not run on the compute nodes and may even run on a remote cluster (i.e., separate from and, in some cases, geographically remote from the workload cluster). Control nodes 232 also provide horizontal scalability for high availability (HA) and operate in active-active mode. The compute load is shared among the control nodes 232. cRPD, on the other hand, typically does not provide horizontal scalability. Both control nodes 232 and cRPD may provide graceful restart for HA and may allow the data plane to operate in headless mode, in which the virtual router can continue to operate even if the control plane restarts.
The control plane should not be just a routing daemon. It should support overlay routing and network orchestration/automation, whereas cRPD does well as a routing protocol for managing underlay routing. However, cRPD generally lacks network orchestration capabilities and does not provide strong support for overlay routing.
Thus, in some examples, the SDN architecture may have cRPD on the compute nodes, as shown in figs. 5A-5B. Fig. 5A illustrates an SDN architecture 700, which may represent an example implementation of SDN architecture 8 or 400. In SDN architecture 700, cRPDs 324 run on the compute nodes and provide underlay routing to the forwarding plane, while a centralized (and horizontally scalable) set of control nodes 232 provides orchestration and overlay services. In some examples, a default gateway may be used instead of running cRPD 324 on a compute node.
cRPD 324 on the compute nodes provides rich underlay routing to the forwarding plane by interacting with virtual router agent 514 using interface 540, which may be a gRPC interface. The virtual router agent interface may permit programming routes, configuring virtual network interfaces for the overlay, and otherwise configuring virtual router 506. This is described in more detail in U.S. application Ser. No. 17/649,632. At the same time, one or more control nodes 232 run as separate pods providing overlay services. SDN architecture 700 may thus obtain both the rich overlay and orchestration provided by control nodes 232 and the modern underlay routing provided by cRPD 324 on the compute nodes to complement control nodes 232. A separate cRPD controller 720 may be used to configure cRPDs 324. cRPD controller 720 may be a device/element management system, a network management system, an orchestrator, a user interface/CLI, or another controller. cRPDs 324 run routing protocols and exchange routing protocol messages with routers, including other cRPDs 324. Each cRPD 324 may be a containerized routing protocol process and effectively operates as a software-only version of a router control plane.
The enhanced underlay routing provided by cRPD 324 may replace the default gateway at the forwarding plane and provide a rich routing stack for the use cases that can be supported. In some examples in which cRPD 324 is not used, virtual router 506 relies on the default gateway for underlay routing. In some examples, cRPD 324, as the underlay routing process, is limited to programming only the default inet(6).0 structure with control plane routing information. In these examples, non-default overlay VRFs may instead be programmed by control nodes 232.
In this context, telemetry exporter 561 may execute to collect MD 64 and export it to telemetry node 560, where telemetry node 560 may represent an example of telemetry node 60/260. Telemetry exporter 561 may interface with agents (not shown for ease of illustration) executing in virtual router 506 and the underlying physical hardware to collect one or more metrics in the form of MD 64. Telemetry exporter 561 may be configured, per TECD 63, to collect only certain metrics, fewer than all metrics, thereby improving operation of the SDN architecture 700 in the manner described in more detail above.
Fig. 7 is a block diagram illustrating the telemetry node and telemetry exporter of figs. 1-5A in more detail. In the example of fig. 7, telemetry node 760 may represent an example of telemetry nodes 60 and 260, and telemetry exporter 761 may represent an example of telemetry exporters 61, 261, and 561.
Telemetry node 760 may define a plurality of custom resources as MGs 762 conforming to a container orchestration platform (e.g., Kubernetes). Telemetry node 760 may define these MGs 762, in the manner described in more detail above, via YAML. A network administrator or other user of the SDN architecture may interact with telemetry node 760 via UI 50 (shown in fig. 1) to issue requests to enable and/or disable one or more of MGs 762. Telemetry node 760 may reduce the enabled MGs 762 to a configuration map of enabled metrics, denoted TECD 763. Telemetry node 760 may interface with telemetry exporter 761 to configure telemetry exporter 761, based on TECD 763, to export only the subset of enabled metrics defined by the configuration map represented by TECD 763.
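A minimal sketch of what one of MGs 762 could look like when expressed as a YAML custom resource follows; the apiVersion, kind, field names, and metric names are assumptions chosen for illustration and are not defined by this disclosure:

    apiVersion: telemetry.example.com/v1   # hypothetical API group/version
    kind: MetricGroup                      # hypothetical kind for a set of metrics
    metadata:
      name: vrouter-traffic
    spec:
      enabled: true
      # subset of metrics (fewer than all metrics) to be exported
      metrics:
        - vrouter_in_bytes
        - vrouter_out_bytes
        - vrouter_in_packets
        - vrouter_out_packets

Enabling or disabling such an object via UI 50 is what telemetry node 760 reduces to the configuration map denoted TECD 763.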
Telemetry exporter 761 may then configure, based on TECD 763, an active list of enabled metrics that limits export function 780 to exporting only the enabled metrics specified by the configuration map denoted as TECD 763. Export function 780 may interface with various agents (again not shown for ease of illustration) to configure those agents to collect only the metrics specified by the configuration map. Export function 780 may then receive only the metric data for the enabled metrics specified by TECD 763, which in turn causes export function 780 to export only the enabled metrics in the form of metric data, such as MD 64.
In other words, the system collects hundreds of telemetry metrics for CN2. The number of metrics may impact the performance and scalability of a CN2 deployment and may impact network performance. Example metrics include data plane-related metrics (bytes/packets), resource (CPU, memory, storage) utilization, and routing information (routes exchanged between peers), among many other metrics.
However, various aspects of the techniques described in this disclosure provide metric sets as a new custom resource that gives users runtime flexibility to define sets of telemetry metrics and to selectively enable/disable the export of such sets. Changes to a metric set are pushed to each cluster that has been selected for the metric set (by default, a metric set may be applied to all clusters). A telemetry operator, which as described above may represent a particular one of the custom resource controllers 302, implements a reconciler for the metric set custom resource and builds a configuration map (which may be referred to as a ConfigMap) from the one or more metric sets to be applied to the selected clusters. The telemetry operator may then push the ConfigMap into the cluster. A metric agent (e.g., a router agent in a compute node or controller) monitors for configuration map changes.
While all metrics may be collected and stored locally by the metric agent, the metric agent filters metrics according to the set of enabled metrics indicated by the ConfigMap and outputs, to the collector, only those metrics belonging to the set of enabled metrics.
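Under the same hypothetical naming assumptions, the ConfigMap built by the telemetry operator and monitored by the metric agent might look like the following sketch; the ConfigMap name, namespace, data key, and metric names are illustrative only:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: enabled-metrics            # hypothetical ConfigMap name
      namespace: telemetry             # hypothetical namespace
    data:
      # allow-list of enabled metrics; the agent filters out everything else
      enabled-metrics.yaml: |
        vrouter-traffic:
          - vrouter_in_bytes
          - vrouter_out_bytes
        controller-xmpp:
          - xmpp_connection_count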
Because a metric set is a custom resource, instances of metric sets can be dynamically created, accessed, modified, or deleted through the Kubernetes API server, which automatically handles the configuration through reconciliation (as described above).
In some examples, some metric sets may be predefined by a network controller vendor, a network provider, or another entity. The customer may optionally select certain predefined groups to enable/disable during installation or by using the API. Example predefined groups may include predefined groups for controller-info, bgpaas, controller-xmpp, controller-peer, ipv4, ipv6, evpn, ermvpn, mvpn, vrouter-info, vrouter-cpu, vrouter-mem, vrouter-traffic, vrouter-ipv6, and vrouter-vmi (interfaces), each predefined group having an associated set of relevant metrics.
In this way, metric sets provide a high level of abstraction that frees the user from configuring multiple different CN2 components (vrouter, controller, CN2-kube-manager, cRPD, etc.). The telemetry operator maintains a data model of the metrics and metric sets and the individual associations of the various metrics with their respective components. The customer can control which metrics to export simply by configuring high-level metric sets, and the telemetry operator applies the changes appropriately to the different components based on the data model. The customer may also apply a different scope of metric selection to different entities within the system (e.g., different clusters). If a customer experiences a problem with one workload cluster and wants more detailed metrics from that cluster, the cluster selection for one or more metric sets allows the user to do so. In addition, the customer may select the particular metric sets (e.g., controller-xmpp or evpn) that may be related to the problem being experienced. Thus, a customer desiring low-level detail may enable/select metric sets for the particular entity that requires troubleshooting, rather than enabling detailed metrics across the board.
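As a sketch of such cluster-scoped selection under the same assumptions, a predefined metric set such as controller-xmpp might be enabled only for the workload cluster being troubleshot; the clusterSelector field and the cluster name shown are hypothetical, and omitting the selector would, by default, apply the metric set to all clusters:

    apiVersion: telemetry.example.com/v1   # hypothetical API group/version
    kind: MetricGroup                      # hypothetical kind
    metadata:
      name: controller-xmpp               # one of the predefined groups named above
    spec:
      enabled: true
      clusterSelector:                    # hypothetical field scoping the metric set
        matchNames:
          - workload-cluster-1            # hypothetical cluster name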
Fig. 8 is a flowchart illustrating operation of the computer architecture shown in the example of fig. 1 in performing various aspects of the techniques described herein. As shown in the example of fig. 8, telemetry node 60 may process a request (e.g., received from a network administrator via UI 50) by which to enable one of MGs 62, which defines a subset of one or more metrics from among a plurality of different metrics, to be exported from one or more defined logically related elements (1800). Again, the term subset is not used herein in the strict mathematical sense, in which a subset may include zero up to all possible elements. Rather, the term subset is used to refer to one or more elements that are fewer than all possible elements. MGs 62 may be predefined in the sense that MGs 62 are potentially hierarchically organized by topic to limit the collection and export of MD 64 according to defined topics (such as those listed above) that may be relevant to a particular SDN architecture or use case. A manufacturer or other low-level developer of network controller 24 may create MGs 62, and a network administrator may enable or disable MGs 62 via UI 50 (and may customize them by enabling and disabling various metrics within a given one of MGs 62).
Telemetry node 60 may convert the subset of one or more metrics to telemetry exporter configuration data (TECD) 63, based on the request to enable the set of metrics, where TECD 63 configures the telemetry exporter deployed at the one or more logically related elements (e.g., TE 61 deployed at server 12A) to export the subset of one or more metrics (1802). TECD 63 may represent configuration data specific to TE 61, which may vary across different servers 12 and other underlying physical resources, as such physical resources may have a variety of different TEs deployed throughout SDN architecture 8. The request may identify a particular set of logically related elements (which may be referred to as a cluster conforming to a container orchestration platform, such as a Kubernetes cluster), allowing telemetry node 60 to identify the type of TE 61 and generate a customized TECD 63 for that particular type of TE 61.
Because the request may identify the cluster and/or pods to which TECD 63 is to be directed, telemetry node 60 may interface with TE 61 (in this example) via the vrouter 21 associated with the cluster to configure TE 61, based on TECD 63, to export the subset of one or more metrics defined by the enabled one of MGs 62 (1804). In this respect, TE 61 may receive TECD 63 and, based on TECD 63, collect MD 64 corresponding to only the subset of one or more metrics defined by the enabled one of MGs 62 (1806, 1808). TE 61 may output, to telemetry node 60, metric data corresponding to only the subset of the one or more metrics defined by the enabled MG 62 (1810).
Telemetry node 60 may receive MD 64 from a particular TE, such as MD 64A from TE 61, and store MD 64A to a dedicated telemetry database (not shown in fig. 1 for ease of illustration). MD 64A may represent a time series of key-value pairs representing the defined subset of one or more metrics over time, with the metric name (and/or identifier) serving as the key for the corresponding value. The network administrator may then interface with telemetry node 60 via UI 50 to review MD 64A.
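As a hedged sketch of MD 64A as described above, the exported time series of key-value pairs might be represented as follows, with the metric name serving as the key for each sampled value; the timestamps, metric names, and values are illustrative only:

    # illustrative time series of key-value pairs (metric name -> value)
    - timestamp: "2022-11-30T10:00:00Z"
      metrics:
        vrouter_in_bytes: 1048576
        vrouter_out_bytes: 524288
    - timestamp: "2022-11-30T10:01:00Z"
      metrics:
        vrouter_in_bytes: 1049600
        vrouter_out_bytes: 525312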
In this way, various aspects of the technology may implement the following examples.
Example 1. A network controller for a Software Defined Networking (SDN) architecture system, the network controller comprising: processing circuitry; a telemetry node configured to be executed by the processing circuitry, the telemetry node configured to: processing a request by which to enable a set of metrics to be derived from the defined one or more computing nodes forming the cluster, the set of metrics defining a subset of one or more metrics from the plurality of metrics; based on the request to enable the set of metrics, converting the subset of the one or more metrics into telemetry exporter configuration data that configures a telemetry exporter deployed at the compute node to export the subset of the one or more metrics; and interfacing with the telemetry exporter to configure the telemetry exporter to export a subset of the one or more metrics based on the telemetry exporter configuration data.
Example 2. The network controller of example 1, wherein the request defines the customized resource according to a containerized orchestration platform.
Example 3. The network controller of any combination of examples 1 and 2, wherein the request comprises a first request by which to create a first set of metrics defining a first subset of the one or more metrics from the plurality of metrics, wherein the telemetry node is configured to receive a second request by which to enable a second set of metrics defining a second subset of the one or more metrics from the plurality of metrics that overlaps with the first subset of the one or more metrics by at least one overlapping metric of the plurality of metrics, and wherein the telemetry node, when configured to convert the subset of the one or more metrics, is configured to remove the at least one overlapping metric from the second subset of the one or more metrics to generate the telemetry exporter configuration data.
Example 4. The network controller of any combination of examples 1-3, wherein the container orchestration platform implements the network controller.
Example 5. The network controller of any combination of examples 1-4, wherein the set of metrics identifies, as a cluster name, the cluster of compute nodes from which the subset of the one or more metrics is to be derived, and wherein the telemetry node, when configured to convert the set of metrics, is configured to generate the telemetry exporter configuration data for a telemetry exporter associated with the cluster name.
Example 6. The network controller of any combination of examples 1-5, wherein the telemetry node is further configured to receive telemetry data representing a subset of one or more metrics defined by the telemetry exporter configuration data.
Example 7. The network controller of any combination of examples 1-6, wherein the telemetry node is further configured to receive telemetry data representing only a subset of the one or more metrics defined by the telemetry exporter configuration data, the subset of one or more metrics including less than all of the plurality of metrics.
Example 8. The network controller of any combination of examples 1-7, wherein the subset of one or more metrics includes less than all of the plurality of metrics.
Example 9. The network controller of any combination of examples 1-8, wherein the subset of one or more metrics comprises one of a Border Gateway Protocol (BGP) metric, a peer-to-peer metric, an Internet Protocol (IP) version 4 (IPv4) metric, an IP version 6 (IPv6) metric, an Ethernet Virtual Private Network (EVPN) metric, and a virtual router (vrouter) metric.
Example 10. A computing node in a Software Defined Networking (SDN) architecture system, comprising: processing circuitry configured to execute the compute node forming part of the SDN architecture system, wherein the compute node is configured to support a virtual network router and execute a telemetry exporter, wherein the telemetry exporter is configured to: receiving telemetry exporter configuration data defining a subset of one or more metrics of a plurality of metrics to be exported to a telemetry node performed by a network controller; collecting metric data corresponding to only a subset of the one or more metrics of the plurality of metrics based on the telemetry exporter configuration data; and outputting, to the telemetry node, metric data corresponding to only a subset of the one or more metrics of the plurality of metrics.
Example 11. The computing node of example 10, wherein the computing node supports execution of a containerized application platform.
Example 12. The computing node of any combination of examples 10 and 11, wherein the container orchestration platform implements the network controller.
Example 13. The computing node of any combination of examples 10 and 11, wherein the subset of one or more metrics comprises one of: Border Gateway Protocol metrics, peer-to-peer metrics, Internet Protocol (IP) version 4 (IPv4) metrics, IP version 6 (IPv6) metrics, Ethernet Virtual Private Network (EVPN) metrics, and virtual router (vrouter) metrics.
Example 14. The compute node of any combination of examples 10-13, wherein the SDN architecture system includes a telemetry node configured to be executed by the network controller, the telemetry node configured to: processing a request by which to enable a set of metrics to be derived from one or more defined compute nodes forming a cluster, the set of metrics defining a subset of the one or more metrics from the plurality of metrics, the one or more compute nodes comprising the compute node configured to execute the telemetry exporter; converting the subset of the one or more metrics into the telemetry exporter configuration data based on the request to enable the set of metrics, the telemetry exporter configuration data configuring the telemetry exporter to export the subset of the one or more metrics; and interfacing with the telemetry exporter to configure the telemetry exporter to export a subset of the one or more metrics based on the telemetry exporter configuration data.
Example 15. The computing node of example 14, wherein the request defines the custom resource according to a container orchestration platform.
Example 16. The computing node of any combination of examples 14 and 15, wherein the request comprises a first request by which to enable a first set of metrics defining a first subset of the one or more metrics of the plurality of metrics, wherein the telemetry node is configured to receive a second request by which to create a second set of metrics defining a second subset of the one or more metrics of the plurality of metrics that overlaps with the first subset of the one or more metrics by at least one overlapping metric of the plurality of metrics, and wherein the telemetry node, when configured to convert the subset of the one or more metrics, is configured to remove the at least one overlapping metric from the second subset of the one or more metrics to generate the telemetry exporter configuration data.
Example 17. The computing node of any combination of examples 14-16, wherein the container orchestration platform implements the network controller.
Example 18. The computing node of any combination of examples 14-17, wherein the set of metrics identifies, as a cluster name, the cluster from which the subset of the one or more metrics is to be derived, and wherein the telemetry node, when configured to convert the set of metrics, is configured to generate the telemetry exporter configuration data for the telemetry exporter associated with the cluster name.
Example 19. The computing node of any combination of examples 14-18, wherein the telemetry node is further configured to receive metric data representing a subset of the one or more metrics defined by the telemetry exporter configuration data.
Example 20. The computing node of any combination of examples 14-19, wherein the telemetry node is further configured to receive metric data representing only a subset of the one or more metrics defined by the telemetry exporter configuration data, the subset of one or more metrics including less than all of the plurality of metrics.
Example 21. The computing node of any combination of examples 14-20, wherein the subset of one or more metrics includes less than all of the plurality of metrics.
Example 22. The computing node of any combination of examples 14-21, wherein the subset of one or more metrics comprises one of: Border Gateway Protocol metrics, peer-to-peer metrics, Internet Protocol (IP) version 4 (IPv4) metrics, IP version 6 (IPv6) metrics, Ethernet Virtual Private Network (EVPN) metrics, and virtual router (vrouter) metrics.
Example 23. A method for a Software Defined Networking (SDN) architecture system, the method comprising: processing a request by which to enable a set of metrics to be derived from the defined one or more computing nodes forming the cluster, the set of metrics defining a subset of one or more metrics from the plurality of metrics; converting the subset of the one or more metrics into telemetry exporter configuration data based on the request to enable the set of metrics, the telemetry exporter configuration data configuring telemetry exporters deployed at the one or more computing nodes to export the subset of the one or more metrics; and interfacing with the telemetry exporter to configure the telemetry exporter to export the subset of the one or more metrics based on the telemetry exporter configuration data.
Example 24. The method of example 23, wherein the request defines the customized resource according to a containerized orchestration platform.
Example 25. The method of any combination of examples 23 and 24, wherein the request comprises a first request by which to create a first set of metrics defining a first subset of the one or more metrics of the plurality of metrics, wherein the method further comprises receiving a second request by which to enable a second set of metrics defining a second subset of the one or more metrics of the plurality of metrics that overlaps with the first subset of the one or more metrics by at least one overlapping metric of the plurality of metrics, and wherein converting the subset of the one or more metrics comprises removing the at least one overlapping metric from the second subset of the one or more metrics to generate the telemetry exporter configuration data.
Example 26. The method of any combination of examples 23-25, wherein the container orchestration platform implements the network controller.
Example 27. The method of any combination of examples 23-26, wherein the set of metrics identifies, as a cluster name, the cluster of computing nodes from which the subset of the one or more metrics is to be derived, and wherein converting the set of metrics includes generating the telemetry exporter configuration data for the telemetry exporter associated with the cluster name.
Example 28. The method of any combination of examples 23-27, further comprising receiving telemetry data representing a subset of the one or more metrics defined by the telemetry exporter configuration data.
Example 29. The method of any combination of examples 23-28, further comprising receiving telemetry data representative of only a subset of the one or more metrics defined by the telemetry exporter configuration data, the subset of one or more metrics comprising less than all of the plurality of metrics.
Example 30. The method of any combination of examples 23-29, wherein the subset of one or more metrics includes less than all of the plurality of metrics.
Example 31. The method of any combination of examples 23-30, wherein the subset of one or more metrics comprises one of: Border Gateway Protocol (BGP) metrics, peer-to-peer metrics, Internet Protocol (IP) version 4 (IPv4) metrics, IP version 6 (IPv6) metrics, Ethernet Virtual Private Network (EVPN) metrics, and virtual router (vrouter) metrics.
Example 32. A method for a Software Defined Networking (SDN) architecture system, comprising: receiving telemetry exporter configuration data defining a subset of one or more metrics of a plurality of metrics to be exported to a telemetry node performed by a network controller; collecting metric data corresponding to only a subset of the one or more metrics of the plurality of metrics based on the telemetry exporter configuration data; and outputting, to the telemetry node, metric data corresponding to only a subset of the one or more metrics of the plurality of metrics.
Example 33. The method of example 32, wherein the method is performed by a computing node supporting execution of a containerized application platform.
Example 34. The method of any combination of examples 32 and 33, wherein the container orchestration platform implements the network controller.
Example 35. The method of any combination of examples 32-34, wherein the subset of one or more metrics comprises one of: Border Gateway Protocol metrics, peer-to-peer metrics, Internet Protocol (IP) version 4 (IPv4) metrics, IP version 6 (IPv6) metrics, Ethernet Virtual Private Network (EVPN) metrics, and virtual router (vrouter) metrics.
Example 36. The method of any combination of examples 32-35, wherein the SDN architecture system comprises a telemetry node configured to be executed by the network controller, the telemetry node configured to: processing a request by which to enable a set of metrics to be derived from one or more defined compute nodes forming a cluster, the set of metrics defining a subset of the one or more metrics from the plurality of metrics, the one or more compute nodes comprising the compute node configured to execute the telemetry exporter; converting the subset of the one or more metrics into the telemetry exporter configuration data based on the request to enable the set of metrics, the telemetry exporter configuration data configuring the telemetry exporter to export the subset of the one or more metrics; and interfacing with the telemetry exporter to configure the telemetry exporter to export a subset of the one or more metrics based on the telemetry exporter configuration data.
Example 37. The method of example 36, wherein the request defines the custom resource according to a container orchestration platform.
Example 38. The method of any combination of examples 36 and 37, wherein the request includes a first request by which to enable a first set of metrics defining a first subset of the one or more metrics of the plurality of metrics, wherein the telemetry node is configured to receive a second request by which to create a second set of metrics defining a second subset of the one or more metrics of the plurality of metrics that overlaps at least one overlapping metric of the plurality of metrics with the first subset of the one or more metrics, and wherein the telemetry node is configured to remove the at least one overlapping metric from the second subset of the one or more metrics to generate the telemetry exporter configuration data when configured to convert the subset of the one or more metrics.
Example 39. The method of any combination of examples 36-38, wherein the container orchestration platform implements the network controller.
Example 40. The method of any combination of examples 36-39, wherein the set of metrics identifies, as a cluster name, the cluster from which the subset of the one or more metrics is to be derived, and wherein the telemetry node, when configured to convert the set of metrics, generates the telemetry exporter configuration data for the telemetry exporter associated with the cluster name.
Example 41. The method of any combination of examples 36-40, wherein the telemetry node is further configured to receive metric data representing a subset of the one or more metrics defined by the telemetry exporter configuration data.
Example 42. The method of any combination of examples 36-41, wherein the telemetry node is further configured to receive metric data representing only a subset of the one or more metrics defined by the telemetry exporter configuration data, the subset of one or more metrics including less than all of the plurality of metrics.
Example 43. The method of any combination of examples 36-42, wherein the subset of one or more metrics includes less than all of the plurality of metrics.
Example 44. The method of any combination of examples 36-43, wherein the subset of one or more metrics comprises one of: Border Gateway Protocol metrics, peer-to-peer metrics, Internet Protocol (IP) version 4 (IPv4) metrics, IP version 6 (IPv6) metrics, Ethernet Virtual Private Network (EVPN) metrics, and virtual router (vrouter) metrics.
Example 45. A Software Defined Networking (SDN) architecture system, the SDN architecture system comprising: a network controller configured to execute a telemetry node, the telemetry node configured to: processing a request by which to enable a set of metrics to be derived from the defined one or more logically related elements, the set of metrics defining a subset of one or more metrics from a plurality of metrics; based on the request to enable the set of metrics, converting the subset of the one or more metrics into telemetry exporter configuration data that configures telemetry exporters deployed at the one or more logically related elements to export the subset of the one or more metrics; and interfacing with the telemetry exporter to configure the telemetry exporter to export a subset of the one or more metrics based on the telemetry exporter configuration data; and logic configured to support the virtual network router and execute a telemetry exporter, wherein the telemetry exporter is configured to: receiving telemetry exporter configuration data; collecting metric data corresponding to only a subset of the one or more metrics of the plurality of metrics based on the telemetry exporter configuration data; and outputting, to the telemetry node, metric data corresponding to only a subset of the one or more metrics of the plurality of metrics.
Example 46. A non-transitory computer-readable storage medium having stored thereon instructions that, when executed, cause one or more processors to perform the method of any combination of examples 23-31 or examples 32-44.
The techniques described herein may be implemented in hardware, software, firmware, or any combination thereof. The various features described as modules, units, or components may be implemented together in an integrated logic device or separately as discrete but interoperable logic devices or other hardware devices. In some cases, various features of the electronic circuit may be implemented as one or more integrated circuit devices, such as an integrated circuit chip or chipset.
If implemented in hardware, the invention may be directed to an apparatus such as a processor or an integrated circuit device (e.g., an integrated circuit chip or chipset). Alternatively or additionally, if implemented in software or firmware, the techniques may be realized at least in part by a computer-readable data storage medium comprising instructions that, when executed, cause a processor to perform one or more of the methods described above. For example, a computer-readable data storage medium may store such instructions for execution by a processor.
The computer readable medium may form part of a computer program product, which may include packaging material. The computer-readable medium may include computer data storage media such as Random Access Memory (RAM), read Only Memory (ROM), non-volatile random access memory (NVRAM), electrically Erasable Programmable Read Only Memory (EEPROM), flash memory, magnetic or optical data storage media, and the like. In some examples, an article of manufacture may comprise one or more computer-readable storage media.
In some examples, the computer-readable storage medium may include a non-transitory medium. The term "non-transitory" may mean that the storage medium is not embodied in a carrier wave or propagated signal. In some examples, a non-transitory storage medium may store data that may change over time (e.g., in RAM or cache).
The code or instructions may be software and/or firmware executed by processing circuitry including one or more processors, such as one or more Digital Signal Processors (DSPs), general purpose microprocessors, application Specific Integrated Circuits (ASICs), field Programmable Gate Arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Thus, the term "processor" as used herein may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein. Additionally, in some aspects, the functionality described in this disclosure may be provided within software modules or hardware modules.

Claims (20)

1. A network controller for a software defined networking SDN architecture system, the network controller comprising:
processing circuitry;
a telemetry node configured for execution by the processing circuitry, the telemetry node configured to:
processing a request by which to enable a set of metrics defining a subset of one or more metrics from a plurality of metrics to be derived from computing nodes of the cluster managed by the network controller;
based on the request to enable the set of metrics, converting the subset of the one or more metrics into telemetry exporter configuration data that configures a telemetry exporter deployed at the compute node to export the subset of the one or more metrics; and
interfacing with the telemetry exporter to configure the telemetry exporter to export the subset of the one or more metrics based on the telemetry exporter configuration data.
2. The network controller of claim 1, wherein the request defines the customized resource according to a containerized orchestration platform.
3. The network controller of claim 1,
wherein the request comprises a first request, a first set of metrics is created by the first request, the first set of metrics defining a first subset of the one or more metrics from the plurality of metrics,
wherein the telemetry node is configured to receive a second request, enable a second set of metrics by the second request, the second set of metrics defining a second subset of the one or more metrics from the plurality of metrics, the second subset of the one or more metrics overlapping the first subset of the one or more metrics by at least one overlapping metric of the plurality of metrics, and
wherein the telemetry node, when configured to convert the subset of the one or more metrics, is configured to remove the at least one overlapping metric from the second subset of the one or more metrics to generate the telemetry exporter configuration data.
4. The network controller of claim 1, wherein the container orchestration platform implements the network controller.
5. The network controller of claim 1,
wherein the set of metrics identifies, as a cluster name, the cluster of compute nodes from which the subset of the one or more metrics is to be derived, and
wherein the telemetry node, when configured to convert the set of metrics, is configured to generate the telemetry exporter configuration data for the telemetry exporter associated with the cluster name.
6. The network controller of claim 1, wherein the telemetry node is further configured to: telemetry data is received, the telemetry data representing the subset of the one or more metrics defined by the telemetry exporter configuration data.
7. The network controller of claim 1, wherein the telemetry node is further configured to: telemetry data is received, the telemetry data representing only the subset of the one or more metrics defined by the telemetry exporter configuration data, the subset of the one or more metrics including less than all of the plurality of metrics.
8. The network controller of claim 1, wherein the subset of the one or more metrics comprises less than all of the plurality of metrics.
9. The network controller of claim 1, wherein the subset of one or more metrics comprises one of: Border Gateway Protocol (BGP) metrics, peer-to-peer metrics, Internet Protocol (IP) version 4 (IPv4) metrics, IP version 6 (IPv6) metrics, Ethernet Virtual Private Network (EVPN) metrics, and virtual router metrics.
10. A method for a software defined networking SDN architecture system, the method comprising:
processing a request by which to enable a set of metrics to be derived from the defined one or more computing nodes forming the cluster, the set of metrics defining a subset of one or more metrics from the plurality of metrics;
converting the subset of the one or more metrics into telemetry exporter configuration data based on the request to enable the set of metrics, the telemetry exporter configuration data configuring telemetry exporters deployed at the one or more computing nodes to export the subset of the one or more metrics; and
interfacing with the telemetry exporter to configure the telemetry exporter to export the subset of the one or more metrics based on the telemetry exporter configuration data.
11. The method of claim 10, wherein the request defines the customized resource according to a containerized orchestration platform.
12. The method according to claim 10,
wherein the request comprises a first request, a first set of metrics is created by the first request, the first set of metrics defining a first subset of the one or more metrics from the plurality of metrics,
wherein the method further comprises receiving a second request, enabling a second set of metrics by the second request, the second set of metrics defining a second subset of the one or more metrics from the plurality of metrics, the second subset of the one or more metrics overlapping the first subset of the one or more metrics by at least one overlapping metric of the plurality of metrics, and
wherein converting the subset of the one or more metrics comprises removing the at least one overlapping metric from the second subset of the one or more metrics to generate the telemetry exporter configuration data.
13. The method of claim 10, wherein the container orchestration platform implements the network controller.
14. The method according to claim 10,
wherein the set of metrics identifies, as a cluster name, the cluster of compute nodes from which the subset of the one or more metrics is to be derived, and
wherein converting the set of metrics comprises: generating the telemetry exporter configuration data for the telemetry exporter associated with the cluster name.
15. The method of claim 10, further comprising: telemetry data is received, the telemetry data representing the subset of the one or more metrics defined by the telemetry exporter configuration data.
16. The method of claim 10, further comprising: telemetry data is received, the telemetry data representing only the subset of the one or more metrics defined by the telemetry exporter configuration data, the subset of the one or more metrics including less than all of the plurality of metrics.
17. The method of claim 10, wherein the subset of the one or more metrics comprises less than all of the plurality of metrics.
18. The method of claim 10, wherein the subset of one or more metrics comprises one of: Border Gateway Protocol (BGP) metrics, peer-to-peer metrics, Internet Protocol (IP) version 4 (IPv4) metrics, IP version 6 (IPv6) metrics, Ethernet Virtual Private Network (EVPN) metrics, and virtual router metrics.
19. A software defined networking SDN architecture system, the SDN architecture system comprising:
a network controller configured to execute a telemetry node configured to:
processing a request by which to enable a set of metrics to be derived from the defined one or more logically related elements, the set of metrics defining a subset of one or more metrics from a plurality of metrics;
Converting the subset of the one or more metrics into telemetry exporter configuration data based on the request to enable the set of metrics, the telemetry exporter configuration data configuring telemetry exporters deployed at the one or more logically related elements to export the subset of the one or more metrics; and
interfacing with the telemetry exporter to configure the telemetry exporter to export the subset of the one or more metrics based on the telemetry exporter configuration data; and
a logic element configured to support a virtual network router and execute a telemetry exporter, wherein the telemetry exporter is configured to:
receiving the telemetry exporter configuration data;
collecting metric data corresponding to only the subset of the one or more metrics of the plurality of metrics based on the telemetry exporter configuration data; and
outputting, to the telemetry node, the metric data corresponding to only the subset of the one or more metrics of the plurality of metrics.
20. The SDN architecture system of claim 19, wherein the request defines a custom resource in accordance with a containerization orchestration platform.
CN202211526327.3A 2022-06-20 2022-11-30 Metric set for software defined network architecture Pending CN117278428A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US63/366,671 2022-06-20
US17/933,566 2022-09-20
US17/933,566 US20230409369A1 (en) 2022-06-20 2022-09-20 Metric groups for software-defined network architectures

Publications (1)

Publication Number Publication Date
CN117278428A true CN117278428A (en) 2023-12-22

Family

ID=89213043

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211526327.3A Pending CN117278428A (en) 2022-06-20 2022-11-30 Metric set for software defined network architecture

Country Status (1)

Country Link
CN (1) CN117278428A (en)

Similar Documents

Publication Publication Date Title
US20230123775A1 (en) Cloud native software-defined network architecture
US11074091B1 (en) Deployment of microservices-based network controller
US11171834B1 (en) Distributed virtualized computing infrastructure management
US11743182B2 (en) Container networking interface for multiple types of interfaces
US20220334864A1 (en) Plurality of smart network interface cards on a single compute node
CN110875848A (en) Multiple networks for virtual execution elements
US11991077B2 (en) Data interfaces with isolation for containers deployed to compute nodes
US20230079209A1 (en) Containerized routing protocol process for virtual private networks
US20230104368A1 (en) Role-based access control autogeneration in a cloud native software-defined network architecture
EP4160409A1 (en) Cloud native software-defined network architecture for multiple clusters
US20230107891A1 (en) User interface for cloud native software-defined network architectures
US20230198676A1 (en) Packet drop monitoring in a virtual router
US20230336414A1 (en) Network policy generation for continuous deployment
US20230409369A1 (en) Metric groups for software-defined network architectures
EP4160410A1 (en) Cloud native software-defined network architecture
US20230106531A1 (en) Virtual network routers for cloud native software-defined network architectures
US20240095158A1 (en) Deployment checks for a containerized sdn architecture system
CN117278428A (en) Metric set for software defined network architecture
US20240073087A1 (en) Intent-driven configuration of a cloud-native router
US20240129161A1 (en) Network segmentation for container orchestration platforms
CN117687773A (en) Network segmentation for container orchestration platform
CN117640389A (en) Intent driven configuration of Yun Yuansheng router
CN117099082A (en) User interface for cloud native software defined network architecture
CN117255019A (en) System, method, and storage medium for virtualizing computing infrastructure

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination