It is an exciting time in networking, and Apstra is leading innovation in the area of Intent-Based Networking Automation. Apstra pioneered this concept and was the first to bring it to market with the Apstra Operating System (AOS®). The reason behind this excitement is that businesses need to be responsive and agile in pursuing key initiatives, and networking is critical to most of them. Business leaders also realize that they need to think of networking in a fundamentally different way. They must move towards a massive simplification of their network operations, enabled by autonomous self-operation. I wanted to be a part of that effort alongside the passionate team at Apstra. This is why I am at Apstra. Exciting!
In my debut blog at Apstra, having spent the past 6+ years at Palo Alto Networks, I thought I’d share a few thoughts on how network security fits into this new networking paradigm.
Control and Visibility
I like to think of network security in the context of “control” and “visibility”. It is very hard to secure something you do not have visibility into, and you cannot secure things you cannot control.
But before jumping into these aspects of network security, let me highlight some key concepts of AOS which I’m particularly excited about. AOS is a distributed network operating system that allows users to specify intent and is responsible for configuring the network to operate according to the specified intent. In addition, it monitors the network and ensures network operation is compliant with the specified intent. The user-specified intent is used to build a network reference design that is represented as a graph in the AOS store. The network reference design graph models the network elements and relationships between network elements.
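To make the idea of a reference design graph concrete, here is a minimal sketch in Python. It is not the AOS data model or API: the class `ReferenceDesignGraph`, the node ids, and the relationship names `hosts_interface` and `links_to` are all hypothetical, chosen only to show network elements as nodes and relationships as typed edges.

```python
from dataclasses import dataclass, field


@dataclass
class ReferenceDesignGraph:
    """Toy graph (hypothetical, not the AOS store): nodes are network
    elements, edges are typed relationships between them."""
    nodes: dict = field(default_factory=dict)  # node id -> properties
    edges: list = field(default_factory=list)  # (source, relationship, target)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, relationship, target):
        self.edges.append((source, relationship, target))

    def neighbors(self, source, relationship):
        """Return ids reachable from `source` over edges of the given type."""
        return [t for s, r, t in self.edges if s == source and r == relationship]


graph = ReferenceDesignGraph()
graph.add_node("leaf1", type="switch", role="leaf")
graph.add_node("leaf1:eth1", type="interface", speed="10G")
graph.add_node("server1", type="server")
graph.add_edge("leaf1", "hosts_interface", "leaf1:eth1")
graph.add_edge("leaf1:eth1", "links_to", "server1")

# Which servers hang off leaf1? Walk switch -> interface -> server.
servers = [
    server
    for interface in graph.neighbors("leaf1", "hosts_interface")
    for server in graph.neighbors(interface, "links_to")
]
print(servers)
```

Because relationships are explicit edges, a question like "which servers are attached to this switch" becomes a short traversal rather than a search through device configurations.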
AOS then establishes “expectations” that network operation must meet to comply with the specified intent. Expectations in AOS are representations of network state expressed as telemetry from network elements. Interface status, MAC addresses, ARP information, and route information are some examples of raw telemetry collected from network elements. Since the network is represented as a graph, applications can use graph queries to get network state information.
We call this concept Live Queries. It offers the ability to query the network topology, set up notifications to detect change in the network, and register callbacks to process these notifications. To ensure that the network operates in compliance with the specified intent, AOS collects telemetry from network elements, detects anomalies, and processes them (taking remedial action) using the specified handlers.
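The query, notify, and callback pattern described here can be sketched as a small in-memory store. This is an illustration under stated assumptions, not the Live Query API: `LiveQueryStore`, `subscribe`, `upsert`, and the predicate `is_down_interface` are hypothetical names.

```python
class LiveQueryStore:
    """Toy store (hypothetical): supports one-off queries plus callbacks
    that fire only when a change affects a node matching the predicate."""

    def __init__(self):
        self.nodes = {}           # node id -> properties
        self.subscriptions = []   # (predicate, callback) pairs

    def query(self, predicate):
        """Return the ids of all nodes whose properties match right now."""
        return sorted(nid for nid, props in self.nodes.items() if predicate(props))

    def subscribe(self, predicate, callback):
        """Register a callback for future changes to matching nodes."""
        self.subscriptions.append((predicate, callback))

    def upsert(self, node_id, **props):
        """Merge new properties into a node and notify interested subscribers."""
        old = self.nodes.get(node_id)
        new = {**(old or {}), **props}
        self.nodes[node_id] = new
        if new == old:
            return  # nothing changed, so nobody is notified
        for predicate, callback in self.subscriptions:
            # Notify if the node matches now, or matched before the change.
            if predicate(new) or (old is not None and predicate(old)):
                callback(node_id, old, new)


def is_down_interface(props):
    return props.get("type") == "interface" and props.get("status") == "down"


store = LiveQueryStore()
store.upsert("leaf1:eth1", type="interface", status="up")
store.upsert("leaf1:eth2", type="interface", status="up")

events = []
store.subscribe(is_down_interface, lambda nid, old, new: events.append((nid, new["status"])))

store.upsert("leaf1:eth2", status="down")  # matches the predicate: callback fires
store.upsert("leaf1:eth1", speed="10G")    # unrelated change: no callback

print(store.query(is_down_interface))
print(events)
```

The subscriber hears about the interface that went down and nothing else, which is the point of the pattern: the application does not poll or filter a stream of unrelated changes.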
As is hopefully clear from my description above, AOS provides unprecedented control and visibility into the network state and network operation.
Now, to focus on the security aspect: AOS allows users to specify policy and constraints on various network elements in the network reference design graph. Security is essentially a part of intent. For example, a policy that defines the set of destination IP addresses/ports allowed to traverse a switch interface can be part of the intent. Another example that is fairly simple but very important is to specify intent based on network element artifacts like NOS version, patch level for software on the device, or other custom artifacts. Once these are specified, any deviations will be tracked and reported as anomalies with an associated remedial action, such as taking the device offline, sending an alert, or triggering a patch update. This is key in today’s environment, where keeping network devices updated to the right level of software and vulnerability patches is critical to network security.
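As a rough illustration of artifact-based intent, the sketch below compares expected device attributes with observed ones and attaches a remedial action to each deviation. Everything in it is hypothetical: the `Anomaly` record, the `check_compliance` function, the attribute names, the version strings, and the action labels are invented for the example and are not AOS identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anomaly:
    device: str
    attribute: str
    expected: str
    actual: str
    action: str  # remedial action label (hypothetical)


def check_compliance(intent, observed, actions):
    """Report every device attribute that deviates from the intended value."""
    anomalies = []
    for device, facts in sorted(observed.items()):
        for attribute, expected in intent.items():
            actual = facts.get(attribute, "unknown")
            if actual != expected:
                # Fall back to alerting when no specific action is configured.
                action = actions.get(attribute, "send_alert")
                anomalies.append(Anomaly(device, attribute, expected, actual, action))
    return anomalies


# Made-up values for illustration only.
intent = {"nos_version": "4.20.1", "patch_level": "p3"}
actions = {"nos_version": "take_offline", "patch_level": "trigger_patch_update"}
observed = {
    "leaf1": {"nos_version": "4.20.1", "patch_level": "p3"},
    "leaf2": {"nos_version": "4.18.0", "patch_level": "p3"},
    "spine1": {"nos_version": "4.20.1", "patch_level": "p1"},
}

anomalies = check_compliance(intent, observed, actions)
for anomaly in anomalies:
    print(f"{anomaly.device}: {anomaly.attribute} is {anomaly.actual}, "
          f"expected {anomaly.expected} -> {anomaly.action}")
```

The intent is declared once, and every device is checked against it the same way, so a new switch with an outdated image shows up as an anomaly without anyone writing a device-specific check.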
Specifying and extending security constructs with ease
AOS provides built-in services that collect raw telemetry from network elements (e.g. MAC addresses, ARP tables, route tables, etc.), set up expectations, and monitor the state of the network based on the collected telemetry. Telemetry collection is extensible: it allows the collection of custom telemetry from elements and provides the ability to associate the telemetry with the network reference design graph. This could be done by running a live query that is associated with the custom telemetry and setting up expectations and anomalies based on that telemetry. This capability is very powerful, as the live query provides notifications and the ability to react to them in real time as network state changes.
The ability to set up Custom Telemetry Applications allows users to specify several security constructs for network activity in a data center network that is typically behind a firewall or in a secure zone. For example, it can help detect lateral movement inside the network, traffic flows that should not be present, and movement of MAC addresses, and it can track interface statistics. All these can be defined with ease in AOS. This way you do not have to deal with the complexity of collecting all network counters and processing them with big data analytics to detect anomalies.
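One of the checks mentioned here, movement of MAC addresses, can be sketched as a comparison of two MAC-table snapshots. This is a simplified illustration, not how AOS collects or stores telemetry: the function `detect_mac_moves`, the snapshot layout, and the sample addresses are assumptions made for the example.

```python
def detect_mac_moves(previous, current):
    """Compare two MAC-table snapshots of the form
    {mac: (switch, interface)} and return the MACs that changed location."""
    moves = []
    for mac, location in sorted(current.items()):
        old_location = previous.get(mac)
        # A MAC seen for the first time is not a move; only relocations count.
        if old_location is not None and old_location != location:
            moves.append((mac, old_location, location))
    return moves


# Hypothetical snapshots taken at two points in time.
previous = {
    "00:11:22:33:44:01": ("leaf1", "eth1"),
    "00:11:22:33:44:02": ("leaf1", "eth2"),
}
current = {
    "00:11:22:33:44:01": ("leaf1", "eth1"),   # unchanged
    "00:11:22:33:44:02": ("leaf2", "eth7"),   # moved to another switch
    "00:11:22:33:44:03": ("leaf2", "eth8"),   # newly learned
}

moves = detect_mac_moves(previous, current)
for mac, old_location, new_location in moves:
    print(f"{mac} moved from {old_location} to {new_location}")
```

Whether a given move is an anomaly depends on intent: a virtual machine migrating between racks may be expected, while the same MAC appearing on an unrelated port may not be.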
The ease with which users can add security constructs to intent makes handling complex security tasks very easy. Consider the above example of lateral movement of data in a virtual network where nodes are moving around or being brought up/down on demand. Since AOS creates the network reference design and ensures operation of the network, it has the context to be able to respond to various questions about the network (regardless of the complexity) in the presence of constant change. AOS enables applications to be notified only about the changes that the application cares about in real-time with associated network context. This is a huge shift in the way networks are monitored!
In AOS 2.1, Apstra introduced Intent-Based Analytics, which is now available to customers. This extends the concept of raw telemetry described above. It allows users to aggregate raw telemetry from network elements and supports analytics constructs like thresholding and pipelines of data across processing stages. This concept also allows users to create constructs related to the state of network devices along with network data patterns or other network parameters.
An example of that would be to detect server activity (e.g. CPU, I/O, application metrics) and correlate that with the network activity of the server (traffic patterns, interface stats, ports used, etc.). The ability to leverage rich context from elements in a network and correlate across network element parameters, data flows, and network operations to detect anomalies provides unprecedented visibility.
Compare this to sending all your network/server data all the time to external systems. The external system has to process a large amount of data offline and then, if you’re lucky, detect anomalies and come up with remedial actions. This results in a time lag between the anomaly and its detection, which renders the anomaly detection and remedial actions ineffective in the context of network security. The blog post “Intent-Based Analytics: How can I use them Now?” describes how you can configure and use IBA probes.
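The idea of aggregating raw telemetry and passing it through processing stages can be sketched as a two-stage pipeline: average per-interface samples per device, then apply a threshold. This is a generic illustration, not the IBA probe format; the stage functions, the sample layout, and the utilization numbers are hypothetical.

```python
from statistics import mean


def aggregate_by_device(samples):
    """Stage 1: average raw per-interface samples into one value per device.
    Each sample is (device, interface, value)."""
    grouped = {}
    for device, _interface, value in samples:
        grouped.setdefault(device, []).append(value)
    return {device: mean(values) for device, values in grouped.items()}


def threshold(values, limit):
    """Stage 2: mark each device True if its aggregate exceeds the limit."""
    return {device: value > limit for device, value in values.items()}


def run_pipeline(samples, limit):
    """Chain the stages and return the devices that crossed the threshold."""
    flagged = threshold(aggregate_by_device(samples), limit)
    return sorted(device for device, over in flagged.items() if over)


# Hypothetical link-utilization samples, in percent.
samples = [
    ("leaf1", "eth1", 40.0),
    ("leaf1", "eth2", 60.0),
    ("leaf2", "eth1", 95.0),
    ("leaf2", "eth2", 85.0),
]

print(run_pipeline(samples, limit=80.0))
```

Keeping each stage a small function with a plain input and output makes it easy to add further stages, such as a check that the threshold was exceeded for a sustained period.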
Massive simplification of network operations through autonomous self-operation
Last but not least, AOS allows the user to specify intent, and then simply operates the network for the user. You could say “build a network with 25 racks and 20 servers in each rack with 10G links and 2:1 oversubscription, ensure that there is no SSH or FTP activity between a set of servers, and trigger alerts and deny access if there is a traffic burst from any server that violates the standard deviation of the ‘tx_bytes’ by 30%”. AOS will build a reference design for you and, once this is deployed, AOS will set up expectations based on the intent and trigger the alerts and remedial actions specified. This is the closest the industry has gotten to autonomous network self-operation. Sounds too good to be true? Try this new “refreshingly simple” approach to network operations. Check out AOS.
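For a sense of what the sizing part of that intent implies, here is a small worked calculation. It assumes one 10G link per server and takes oversubscription to mean the ratio of server-facing to fabric-facing bandwidth on each rack; the function `fabric_summary` and its field names are hypothetical and not part of AOS.

```python
def fabric_summary(racks, servers_per_rack, link_gbps, oversubscription):
    """Derive basic capacity figures from a declarative sizing intent.
    Assumes one link per server and oversubscription = downlink / uplink."""
    downlink_gbps = servers_per_rack * link_gbps
    uplink_gbps = downlink_gbps / oversubscription
    return {
        "total_servers": racks * servers_per_rack,
        "downlink_gbps_per_rack": downlink_gbps,
        "uplink_gbps_per_rack": uplink_gbps,
    }


summary = fabric_summary(racks=25, servers_per_rack=20, link_gbps=10, oversubscription=2.0)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Under these assumptions the intent describes 500 servers, with 200G of server-facing capacity and 100G of fabric-facing capacity per rack.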