Nov 10, 2016

Networking for Cloud-Native Apps

Cloud-native apps are prevalent in both public and private cloud data centers.  Cloud-native refers to apps composed of micro-services based on Linux containers.  In comparison to VM-based N-tier apps, these apps are easier to write, easier to reuse, and easier to maintain.

New Network Challenge: Bandwidth

Container placement is predominantly chosen to optimize compute density rather than physical affinity to other containers.  That is, CPU utilization is valued over network bandwidth.  Containers that talk to each other therefore often land on different servers, which implies a much greater east-west traffic burden than with VM-based apps.

Containers place additional bandwidth requirements on the network by their very nature, because they are operated on a “kill and resurrect” methodology.  At any given time, a server may hold hundreds more containers than it would hold VMs in an older app architecture.  And because each container is short-lived, containers are instantiated thousands of times more often than VMs.

Because containers exist in large numbers to support a given application, they create more network traffic than legacy N-tier apps.  As a result, the east-west traffic associated with containers is substantially higher than even the chattiest VM-to-VM traffic.  More network bandwidth is needed to support cloud-native apps.

New Network Challenge: Addressing and Management

If each compute node runs 10 VMs, then that node holds 10 IP addresses for VM interfaces.  In a comparable container environment, there could be 300 IP addresses within the same server, each dynamically reallocated perhaps 10 times per minute.  That’s 3,000 IP address allocations per minute, versus perhaps 10 per year in a VM environment.
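
To make the scale of that churn concrete, here is a small back-of-the-envelope sketch.  The per-node figures are the ones quoted above; the constant and function names are illustrative only, and the yearly total simply assumes the per-minute rate holds all year.

```python
# Figures quoted in the text above; the names are illustrative only.
CONTAINER_ADDRESSES_PER_NODE = 300
REALLOCATIONS_PER_ADDRESS_PER_MINUTE = 10
VM_ALLOCATIONS_PER_YEAR = 10

# Assumption: the per-minute rate is sustained around the clock.
MINUTES_PER_YEAR = 60 * 24 * 365


def allocations_per_minute(addresses: int, reallocations_per_minute: int) -> int:
    """IP address allocations per minute on one compute node."""
    return addresses * reallocations_per_minute


container_rate = allocations_per_minute(
    CONTAINER_ADDRESSES_PER_NODE, REALLOCATIONS_PER_ADDRESS_PER_MINUTE
)
container_per_year = container_rate * MINUTES_PER_YEAR

print(f"Container allocations per minute: {container_rate:,}")
print(f"Container allocations per year:   {container_per_year:,}")
print(f"VM allocations per year:          {VM_ALLOCATIONS_PER_YEAR:,}")
```

Even if the real reallocation rate is far lower than the "perhaps 10 times per minute" used here, the gap between the two environments remains many orders of magnitude.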

A common simplification technique is to run BGP in each compute node, tightly integrating the v-switch with the physical switching infrastructure.  This reduces the address table size burden on physical switches, and eases the complexity of dynamic addressing.

With BGP running on every server in each rack, we have a control plane instance increase of perhaps 40 to 1 (40 servers with a control plane in addition to a single TOR).  This represents a material increase in the aggregate size and complexity of the control plane.
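
The 40-to-1 figure can be sketched as a simple count.  The 40 servers per rack and the single TOR come from the text; the function name and the 20-rack example are assumptions made for illustration.

```python
def control_plane_instances(racks: int, servers_per_rack: int, bgp_on_servers: bool) -> int:
    """Count BGP control plane instances at the rack level.

    Assumes one TOR per rack, as in the text. Switches above the TOR
    are deliberately left out of the count.
    """
    tor_instances = racks
    server_instances = racks * servers_per_rack if bgp_on_servers else 0
    return tor_instances + server_instances


# One rack, using the figures from the text.
before = control_plane_instances(racks=1, servers_per_rack=40, bgp_on_servers=False)
after = control_plane_instances(racks=1, servers_per_rack=40, bgp_on_servers=True)
print(f"Per rack: {before} instance before, {after} after ({after - before} added)")

# Hypothetical 20-rack deployment, to show how the count scales.
print(f"20 racks: {control_plane_instances(20, 40, True)} instances")
```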


A material increase in east-west traffic associated with cloud-native apps forces us to rethink oversubscription ratios within each Clos fabric.
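
For readers who want a reminder of what is being rethought: an oversubscription ratio is commonly computed as the server-facing bandwidth of a switch divided by its fabric-facing bandwidth.  The sketch below uses that common definition; the port counts and speeds are hypothetical and are not taken from any particular deployment.

```python
def oversubscription_ratio(
    downlinks: int, downlink_gbps: float, uplinks: int, uplink_gbps: float
) -> float:
    """Server-facing bandwidth divided by fabric-facing bandwidth for one switch."""
    if uplinks <= 0 or uplink_gbps <= 0:
        raise ValueError("a switch needs at least one uplink with positive bandwidth")
    return (downlinks * downlink_gbps) / (uplinks * uplink_gbps)


# Hypothetical TOR: 40 servers at 10 Gb/s each, 4 uplinks at 40 Gb/s each.
ratio = oversubscription_ratio(downlinks=40, downlink_gbps=10, uplinks=4, uplink_gbps=40)
print(f"Oversubscription: {ratio}:1")

# Adding uplinks until the ratio reaches 1:1 removes the oversubscription.
non_blocking = oversubscription_ratio(downlinks=40, downlink_gbps=10, uplinks=10, uplink_gbps=40)
print(f"With 10 uplinks: {non_blocking}:1")
```

The heavier the east-west load, the closer to 1:1 the design generally needs to be, which means more uplinks or faster ones.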

This implies a fundamental change in the design of a Clos.  A traditional 2-layer Clos becomes a 3-layer Clos; the server BGP is the leaf, the top-of-rack is the spine, and the old spine is now a super-spine.  Modeling that Clos and monitoring it are hard.  We need a tool.
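
As a rough illustration of why modeling gets harder, the tier relabeling described above can be written down as a toy data model.  This is only a sketch: the class name, the example sizes, and the wiring assumptions (each server single-homed to its TOR, each TOR linked to every super-spine) are invented for illustration and are not how any specific product models a fabric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreeTierClos:
    """Toy model of the 3-layer Clos described in the text."""

    racks: int
    servers_per_rack: int
    super_spines: int

    def nodes_by_tier(self) -> dict:
        # Tier names follow the relabeling in the text.
        return {
            "leaf (server BGP)": self.racks * self.servers_per_rack,
            "spine (top-of-rack)": self.racks,
            "super-spine (old spine)": self.super_spines,
        }

    def link_count(self) -> int:
        # Assumption: every server has one link to its TOR, and every
        # TOR has one link to every super-spine.
        server_links = self.racks * self.servers_per_rack
        fabric_links = self.racks * self.super_spines
        return server_links + fabric_links


# Hypothetical fabric: 8 racks of 40 servers, 4 super-spines.
fabric = ThreeTierClos(racks=8, servers_per_rack=40, super_spines=4)
for tier, count in fabric.nodes_by_tier().items():
    print(f"{tier}: {count} nodes")
print(f"Links to model and monitor: {fabric.link_count()}")
```

Under these assumptions the servers dominate both the node count and the link count, which is exactly the part of the fabric that a 2-layer model never had to track.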

Second, the 40-to-1 increase in control plane instances is very difficult to manage, as is the associated addressing infrastructure.  We need a tool for that, too.

The Apstra Operating System (AOS®) is such a tool.  We’ll tell you how it works in another blog post.

Dave Butler

Vice President, Business Development