Routing protocols and architectures/Introduction to Software-Defined Networks


The Internet is still the one that was defined 30 years ago: a very efficient pipe that transports bits at high speed, with almost the same protocols and the same philosophy.

SDN components.

Network devices are monolithic: besides specialized packet-forwarding hardware, every router contains its own operating system and its own applications. This infrastructure is closed to innovation: software components cannot be installed by the customer but are chosen by the hardware manufacturer, which has little interest in innovating if it is the market leader (e.g. Cisco).

Software-Defined Networks (SDN) introduce the possibility to program the network, and are based on three pillars:

  • separation of control and forwarding features: software, the smart component, is split from hardware;
  • centralization of control: the whole network is coordinated by a controller, made up of a network operating system and user-defined network applications (e.g. routing protocol, load balancer, firewall);
  • well-defined interfaces:
    • northbound: the network operating system exposes APIs to network applications;
    • southbound: the network operating system drives network nodes, made up of simple packet forwarding hardware.

The network operating system is a software layer that offers a global, abstract view of the network to the applications above it. This view from 'above' the network enables, for example, traffic engineering: decisions are taken by the centralized logic of the load balancer and are therefore consistent across all network nodes. The controller can install rules on devices in two modes:

  • proactive mode: before the device starts forwarding packets, the controller fills the forwarding table a priori with all the rules needed for all sessions;
  • reactive mode: when the first packet of a session arrives, the device sends it to the controller, which takes a decision and sends the device the rule needed to forward the packets of that session (see the sketch after this list).
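
The reactive mode can be illustrated with a minimal controller sketch. The example below assumes the Ryu framework and OpenFlow 1.3 (neither is mandated by the text above); the "flood" decision is only a placeholder for what a real network application would compute from its global view.

  # Minimal reactive controller sketch (assumption: Ryu + OpenFlow 1.3).
  from ryu.base import app_manager
  from ryu.controller import ofp_event
  from ryu.controller.handler import MAIN_DISPATCHER, set_ev_cls
  from ryu.ofproto import ofproto_v1_3

  class ReactiveSwitch(app_manager.RyuApp):
      OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

      @set_ev_cls(ofp_event.EventOFPPacketIn, MAIN_DISPATCHER)
      def packet_in_handler(self, ev):
          msg = ev.msg                      # first packet of an unknown session
          dp = msg.datapath
          ofp, parser = dp.ofproto, dp.ofproto_parser
          in_port = msg.match['in_port']

          # Decision taken by the controller: here we simply flood; a real
          # application would consult its global view of the network instead.
          actions = [parser.OFPActionOutput(ofp.OFPP_FLOOD)]

          # Reactive mode: install a rule so that the next packets of this
          # session are forwarded by the device without asking the controller.
          match = parser.OFPMatch(in_port=in_port)
          inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
          dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=10,
                                        match=match, instructions=inst))

          # Also forward the packet that triggered the request to the controller.
          data = msg.data if msg.buffer_id == ofp.OFP_NO_BUFFER else None
          dp.send_msg(parser.OFPPacketOut(datapath=dp, buffer_id=msg.buffer_id,
                                          in_port=in_port, actions=actions,
                                          data=data))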

A network slicing layer can even show to software a network topology different from the actual physical infrastructure: it can be configured so as to show each network operating system instance a different virtual topology (e.g. a subset of the actual links) → the traffic policies of a certain company affect only the portion of the network belonging to that company (a toy sketch follows).
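
As a toy illustration of slicing, the sketch below (all names are illustrative, not from the text) models the physical topology as a set of links and exposes to each company's network operating system instance only the links of its slice.

  # Toy network slicing sketch: each tenant only sees (and may program) its slice.
  PHYSICAL_LINKS = {("r1", "r2"), ("r2", "r3"), ("r3", "r4"), ("r1", "r4")}

  SLICES = {
      "company_a": {("r1", "r2"), ("r2", "r3")},
      "company_b": {("r3", "r4"), ("r1", "r4")},
  }

  def visible_topology(tenant):
      """Virtual topology shown to the tenant's network operating system."""
      return PHYSICAL_LINKS & SLICES[tenant]

  def may_program(tenant, link):
      """A rule touching a link outside the tenant's slice is rejected."""
      return link in SLICES[tenant]

  print(visible_topology("company_a"))            # only company_a's two links
  print(may_program("company_b", ("r1", "r2")))   # False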

Issues
  • controller: it may constitute a single point of failure and a bottleneck;
  • versatility: a firewall needs to inspect all packets, not only the first packet in the session → a lot of traffic would be generated between the device and the controller;
  • scalability: forwarding hardware cannot be too simple if high performance is to be achieved;
  • economy: hardware simplification goes against economic interests of major network vendors.

OpenFlow

OpenFlow, introduced around 2008, is an implementation of the southbound interface.

It can be deployed in various ways:

  • rules: typically they are flow-based, that is, defined on the (MAC addresses, IP addresses, TCP ports) tuple;
  • controller: typically it is physically centralized, but it could even be physically distributed (even though still logically centralized);
  • mode: typically it is reactive, but nothing prevents using the proactive mode.

One or more actions are associated with each rule (a mapping sketch follows the list below), for example:

  • forward packet to port(s);
  • encapsulate and forward to controller;
  • drop packet;
  • send to normal processing pipeline (i.e. the classical routing table);
  • modify fields (e.g. NAT: change addresses and ports).
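
As a sketch of how these actions can be expressed, the snippet below uses the Ryu OpenFlow 1.3 message classes and assumes dp is a datapath handle obtained inside a controller application (as in the reactive sketch above); the port numbers and addresses are made up.

  # How the actions above map onto OpenFlow 1.3 (assumption: Ryu, handle `dp`).
  parser, ofp = dp.ofproto_parser, dp.ofproto

  forward = [parser.OFPActionOutput(3)]                    # forward to port 3
  to_ctrl = [parser.OFPActionOutput(ofp.OFPP_CONTROLLER)]  # encapsulate to controller
  drop    = []                                             # empty action set = drop
  normal  = [parser.OFPActionOutput(ofp.OFPP_NORMAL)]      # classical processing pipeline
  nat     = [parser.OFPActionSetField(ipv4_dst="10.0.0.10"),   # modify fields (NAT-like)
             parser.OFPActionSetField(tcp_dst=8080),
             parser.OFPActionOutput(3)]

  # Flow-based rule on the (addresses, ports) tuple, applying the NAT-like actions.
  match = parser.OFPMatch(eth_type=0x0800, ip_proto=6,
                          ipv4_dst="203.0.113.10", tcp_dst=80)
  inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, nat)]
  dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=100,
                                match=match, instructions=inst))
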
OpenFlow 1.3

It introduced some interesting features:

  • the forwarding table is split into various subtables (e.g. firewall, routing, etc.) and every application accesses its own subtable → each packet is matched multiple times across the tables in sequence;
  • virtual switch (vSwitch, e.g. Open vSwitch): instead of being implemented in hardware, OpenFlow runs on a switch emulated by a software process → a GRE logical tunnel can be created between two vSwitches on two different servers across a traditional switch network (see Network Function Virtualization below; a tunnel-setup sketch follows this list).
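
A possible way to create such a GRE tunnel between two Open vSwitch instances is sketched below, wrapping the standard ovs-vsctl commands from Python; the bridge name and the peer address are placeholders, and the same commands must be run on the other server with the addresses swapped.

  # Sketch: one end of a GRE tunnel between two vSwitches (placeholder values).
  import subprocess

  BRIDGE = "br-int"          # local Open vSwitch bridge
  REMOTE_IP = "192.0.2.2"    # address of the peer server hosting the other vSwitch

  def ovs(*args):
      subprocess.run(["ovs-vsctl", "--may-exist", *args], check=True)

  ovs("add-br", BRIDGE)
  # Packets sent to this port are GRE-encapsulated and carried across the
  # traditional switched network to the peer vSwitch.
  ovs("add-port", BRIDGE, "gre0",
      "--", "set", "interface", "gre0", "type=gre",
      f"options:remote_ip={REMOTE_IP}")
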
Issues
  • data plane: it only deals with packet forwarding → it is suitable for environments (e.g. datacenters) where packet forwarding is the predominant data-plane task, but it does not appear to be appropriate for an ISP network;
  • usefulness: the southbound interface is less interesting than the northbound one: it is used by network operating system developers, not by application developers;
  • hardware cost: rules can be based on a large number of fields, which makes entries very wide → the required TCAMs are expensive and generate a lot of heat;
  • flexibility: as opposed to the Open Networking Foundation (ONF, backed by VMware), the OpenDaylight project (backed by Cisco) prefers the Network Configuration Protocol (NETCONF), which, instead of making rules explicit, does not know the semantics of the values it reads or sets → it can be used by the SDN controller to configure advanced features on devices, such as 'backup routes' for faults considered critical in an ISP network.

Data plane

It is not only important to forward packets in the right direction, but also to offer data-plane-oriented services which process packets (e.g. firewall, NAT, network monitoring).

Service Function Chaining without SDN

 
Service Function Chaining (SFC) without SDN.

Nowadays services can be added to access routers (BNGs), besides via service cards, by connecting boxes called appliances: an appliance is a separate, discrete hardware device with integrated software (firmware) dedicated to providing a specific service. Appliances are connected in cascade by physical wires, forming a static service chain, and each packet has to be processed by all the services before it can exit the device.

Disadvantages
  • agility in provisioning new services: the appliance has to be physically connected to the device;
  • flexibility: in order to connect a new appliance the chain needs to be temporarily broken, stopping the network service;
  • reliability: a faulty appliance breaks the chain, stopping the network service;
  • optimization: each appliance has a fixed amount of resources available, and during work peaks it cannot exploit resources possibly left free at that moment by another appliance.

Service Function Chaining with SDN

 
Service Function Chaining (SFC) with SDN.

Every appliance is connected to an output port and an input port of an OpenFlow switch, and traffic flows cross a service chain dynamically decided through OpenFlow rules defining paths from one switch port to another (a steering sketch follows).
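
A minimal sketch of such steering rules is given below, assuming a Ryu datapath handle dp and that each appliance hangs off the switch on an (output-to-appliance, input-from-appliance) port pair; the port numbers are made up.

  # Sketch: steer traffic ingress -> appliance 1 -> ... -> appliance N -> egress.
  def install_chain(dp, chain, ingress=1, egress=2):
      """chain: ordered list of (port_towards_appliance, port_back_from_appliance)."""
      parser, ofp = dp.ofproto_parser, dp.ofproto

      hops = [ingress]
      for to_appl, from_appl in chain:
          hops += [to_appl, from_appl]
      hops.append(egress)

      # One rule per hop: packets entering from hops[2i] are sent out of hops[2i+1].
      for in_port, out_port in zip(hops[0::2], hops[1::2]):
          match = parser.OFPMatch(in_port=in_port)
          actions = [parser.OFPActionOutput(out_port)]
          inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
          dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=10,
                                        match=match, instructions=inst))

  # Example (made-up ports): firewall on ports (3, 4), NAT on ports (5, 6).
  # install_chain(dp, [(3, 4), (5, 6)])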

Advantages
  • flexibility: adding a new appliance just requires the SDN controller to change the OpenFlow rules on the fly, without stopping the network service;
  • reliability: an on-the-fly change of the OpenFlow rules by the SDN controller is enough to restore the network service;
  • business: paths can be differentiated based on the customer (company) → traffic goes only across services which the customer has bought.
Disadvantages
  • agility in provisioning new services: the appliance still has to be physically connected to the device;
  • optimization: each appliance has a fixed amount of resources available, and during work peaks it cannot exploit resources possibly left free at that moment by another appliance;
  • backward compatibility: devices have to be replaced with switches supporting OpenFlow.

Network Function Virtualization

 
Network Function Virtualization (NFV).

Services are implemented as purely software processes: the switch is connected to OpenFlow vSwitches emulated on multiple remote servers, and each server has a hypervisor able to run virtual machines (VMs) inside which the services run.

Scaling

Performance of a service can be enhanced in three ways:

  • scale up: the VM is assigned more hardware resources → this may not be enough if the service is not able to properly exploit the available hardware (e.g. a single-threaded program does not benefit much from additional CPU cores);
  • scale out: multiple VMs are running in parallel on the same physical server → a load balancer is needed to send traffic to the least-loaded VM, and the VMs need to be kept synchronized (a toy balancer sketch follows this list);
  • multiple servers: multiple VMs are running in parallel on multiple physical servers → a further load balancer is needed to send traffic to the least-loaded server.
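
A toy sketch of the least-loaded dispatch performed by such a load balancer is shown below; the replica names and the accounting are illustrative, and a real balancer would also decrement the load when a request completes.

  # Toy load balancer: always dispatch to the replica with the fewest active requests.
  import heapq

  class LeastLoadedBalancer:
      def __init__(self, replicas):
          self._heap = [(0, vm) for vm in replicas]   # (active_requests, vm_name)
          heapq.heapify(self._heap)

      def dispatch(self):
          load, vm = heapq.heappop(self._heap)
          heapq.heappush(self._heap, (load + 1, vm))  # account for the new request
          return vm

  lb = LeastLoadedBalancer(["fw-vm-1", "fw-vm-2", "fw-vm-3"])
  print([lb.dispatch() for _ in range(4)])   # cycles through replicas while loads are equal
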
Advantages
  • agility in provisioning new services: a new service can be dynamically enabled by downloading and starting its software image;
  • optimization: server hardware resources are shared among VMs;
  • backward compatibility: if the switch does not support OpenFlow, the GRE tunnel between vSwitches can be exploited without having to replace the device;
  • consolidation: by night the number of VMs running in parallel can be reduced (scale in) and the assigned hardware resources can be decreased (scale down).
Disadvantages
  • traffic: the classical NFV model may require packets to travel from one server to another across the switch, clogging the network over which the servers are spread;
  • efficiency: servers have general-purpose CPUs, not dedicated hardware (e.g. line cards), and effective hardware-acceleration technologies are not currently available;
  • migration: when the user moves, the VM instance should be moved to the closest server and should be started as soon as possible;
  • scalability: the architecture is potentially very scalable, but suffers from synchronization and load balancing problems when multiple service instances are running in parallel.

OpenStack

 
OpenStack system components.

OpenStack, introduced in 2010, is an open-source distributed operating system, which can be contrasted with a traditional operating system such as Linux:

  • Linux:
    • it handles the single local host it is running on;
    • the process is the execution unit;
  • OpenStack:
    • it is run on a remote server, called controller node;
    • it handles multiple distributed physical servers in the cloud, called compute nodes;
    • the virtual machine is the execution unit.

Each compute node includes the following components:

  • traditional operating system: it handles the local hardware on the physical server;
  • agent: it receives commands from the controller node, for example to launch VMs;
  • vSwitch (e.g. Open vSwitch): it connects the server to the network infrastructure.

One of the tasks of the controller node is to launch VMs on the currently least-loaded compute node.
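
A minimal sketch of asking the controller node to launch a VM is shown below, assuming the openstacksdk client library and a cloud entry named "mycloud" in clouds.yaml; the image, flavor and network names are placeholders.

  # Sketch: launch a VM through the controller node's APIs (assumption: openstacksdk).
  import openstack

  conn = openstack.connect(cloud="mycloud")        # credentials from clouds.yaml

  image = conn.compute.find_image("ubuntu-22.04")
  flavor = conn.compute.find_flavor("m1.small")
  network = conn.network.find_network("private")

  # The controller's scheduler picks a compute node (e.g. the least-loaded one)
  # and tells its agent to start the VM through the local hypervisor.
  server = conn.compute.create_server(
      name="nfv-firewall-1",
      image_id=image.id, flavor_id=flavor.id,
      networks=[{"uuid": network.id}],
  )
  server = conn.compute.wait_for_server(server)
  print(server.status)                             # ACTIVE once the VM is booted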

 