MPLS-TE and Traffic Engineering

In my post entitled The rise and maturity of MPLS I said that one of the principal reasons for a carrier to implement MPLS was the need for what is known as traffic engineering in their core IP networks. Before the advent of MPLS this capability was supplied by ATM, with many of the world's terrestrial Internet backbones being based on this technology. ATM provided a switching capability in Points of Presence (PoPs) that enabled automatic switchover to an alternative inter-city ATM pipe in case of failure. I say 'inter-city' because ATM was not generally implemented on a transoceanic basis, as it was deemed to be expensive and inefficient due to its 17% overhead, commonly known as cell tax (Picture: Aria Networks planning software. Presentation to the UK Network Operators Forum 2006).

IP engineers were keen to remove this additional ATM layer and replace it with a control capability, which became MPLS. However, MPLS in its original guise did not really 'cut the mustard' for use in a traffic engineered regime, so the standard was enhanced through the release of extensions known as MPLS-TE. This post will look at Traffic Engineering and its sibling activity Capacity Planning and their relationship to MPLS.

Capacity planning

Capacity planning is an activity that is undertaken by all carriers on a 'regular' basis. Nowadays, this would most likely be undertaken on an annual basis, although in the heady, high-growth days of the latter half of the 90s it was unlikely to be undertaken less often than quarterly.

Capacity planning is an important function as it directly drives the purchase of additional network equipment or the purchase or building of additional bandwidth capacity for the physical network. Capacity planning is undertaken by the planning department and principally consists of the following activities:

Network topology: The starting point of any planning exercise is to profile the existing network to act as a benchmark to build upon. This consists of two things. The first is a complete database of the nodes or PoPs and network link bandwidths, i.e. their maximum capacities. This sounds easier than it is in reality. In many instances carriers do not know the full extent of the equipment deployed, which is often the result of one too many acquisitions. This discovery of assets can either be based on an on-going manual spreadsheet or database exercise, or software can be used to automatically discover the up-to-date installed network topology. Another way is to export network configurations from network equipment such as routers.

Traffic Matrices: What is needed next is detailed link resource utilisation data or traffic profiles for each service type. These are often called traffic matrices. Links are the pipes interconnecting a network's PoPs and their utilisation is how much of the links' bandwidth is being used. As IP traffic is very dynamic and varies tremendously according to the time of day, good traffic engineering leads to good operational practice such as never loading a link beyond a certain percentage - say 50%. Every carrier would have their own standard, which could quite easily be made higher to save money but could risk poor network performance at peak times. Clearly, engineers and accountants have different perspectives in these discussions! (Raw IP traffic flow. Credit: Cariden.)
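As a rough illustration of the utilisation rule described above, the sketch below computes peak utilisation per link and flags any link loaded beyond a chosen ceiling. The city names, capacities and loads are invented for the example, and the 50% ceiling is simply the figure used in the text, not a recommendation.

```python
# Hypothetical example data: link capacities and measured peak loads in Mbit/s.
LINK_CAPACITY_MBPS = {
    ("London", "Paris"): 10_000,
    ("London", "Frankfurt"): 10_000,
    ("Paris", "Frankfurt"): 2_500,
}
PEAK_LOAD_MBPS = {
    ("London", "Paris"): 6_200,
    ("London", "Frankfurt"): 3_100,
    ("Paris", "Frankfurt"): 1_250,
}


def utilisation(capacity, load):
    """Return peak utilisation per link as a fraction of its capacity."""
    return {link: load.get(link, 0) / cap for link, cap in capacity.items()}


def over_threshold(util, threshold=0.5):
    """Return the links whose utilisation is strictly above the threshold."""
    return sorted(link for link, value in util.items() if value > threshold)


util = utilisation(LINK_CAPACITY_MBPS, PEAK_LOAD_MBPS)
hot_links = over_threshold(util, threshold=0.5)
for link in hot_links:
    print(f"{link[0]}-{link[1]} is at {util[link]:.0%} of capacity")
```

In this made-up data set only the London-Paris link breaches the ceiling; the Paris-Frankfurt link sits exactly on it and is not flagged because the comparison is strict.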

Demand Forecast: At this point, capacity planning engineers ask their product marketing and sales brethren for a service sales forecast for the next planning cycle, which could be between one and three years. If you talk to any planning engineer I'm sure you will hear plaintive cries such as "I can't plan unless I get a forecast". However, can you think of a worse group of individuals to get this sort of information from than sales people? I would guess that this is one of the biggest challenges planning departments face.

Once topology, current traffic matrices and forecasts for each service (IP transit, VoIP, IP VPNs, IPTV etc.) have been obtained, the task of planning for the next capacity planning period can begin. This results in - or should result in - a clear plan for the company that covers such issues as:

  • What existing link upgrades are required

  • What new links are required

  • What new backup links or expansions to existing backup links are required

  • What new Capital Expenditure (CAPEX) is required

  • What increase in Operational Expenditure (OPEX) is required

  • What new migration or changeover plans are required

  • Lots of management reports, spreadsheets and graphs

  • Caveats about the on-going unreliability of the growth forecasts received!
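The step from current traffic plus a growth forecast to a list of link upgrades can be sketched very simply. The function below is a hypothetical helper, not taken from any planning tool: it assumes a single compound annual growth rate per link and reports the first planning year in which the forecast peak load would breach the utilisation ceiling.

```python
def years_until_upgrade(peak_load, capacity, annual_growth, threshold=0.5, horizon=3):
    """Return the first planning year (1-based) in which the forecast peak load
    exceeds threshold * capacity, or None if it stays below over the horizon.

    Assumes simple compound growth, which is a deliberate simplification.
    """
    for year in range(1, horizon + 1):
        forecast = peak_load * (1 + annual_growth) ** year
        if forecast > threshold * capacity:
            return year
    return None


# Hypothetical link: 10 Gbit/s capacity, 3.1 Gbit/s peak today, 40% yearly growth.
print(years_until_upgrade(3_100, 10_000, 0.40))
```

Real planning tools work per service and per route rather than per link, and the result is only as good as the sales forecast fed into it, which is the caveat in the last bullet above.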

Traffic engineering (TE)

While Capacity Planning is a long-term forward looking activity that is concerned with optimising network growth and performance in the face of growing service demand, traffic engineering is focused on how the network performs in respect of delivering services at a much finer granularity.

Traffic engineering in networks has a history as long as telephones have been around and was closely associated with A.K. Erlang. One of the fundamental metrics in the voice Public Switched Telephone Network (PSTN) was named after him - the Erlang. An Erlang is a measure of the occupancy or utilisation of voice links regardless of whether traffic is flowing or not. Erlang-based calculations were / are used to calculate Quality of Service (QoS) and the optimum utilisation of fixed bandwidth links taking into account the amount of traffic at peak times.
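One well-known Erlang-based calculation is the Erlang B formula, which gives the probability that a call is blocked when a given amount of offered traffic (in Erlangs) is presented to a fixed number of circuits. The post does not say which calculations it has in mind, so treat the following as one illustrative example rather than the method referred to above. It uses the standard recurrence B(0) = 1, B(n) = A·B(n-1) / (n + A·B(n-1)).

```python
def erlang_b(offered_erlangs, circuits):
    """Blocking probability for offered traffic A on a group of n circuits (Erlang B)."""
    blocking = 1.0  # B(0): with no circuits every call is blocked
    for n in range(1, circuits + 1):
        blocking = offered_erlangs * blocking / (n + offered_erlangs * blocking)
    return blocking


def circuits_needed(offered_erlangs, target_blocking):
    """Smallest number of circuits that keeps blocking at or below the target."""
    if target_blocking <= 0:
        raise ValueError("target_blocking must be greater than zero")
    n = 0
    while erlang_b(offered_erlangs, n) > target_blocking:
        n += 1
    return n


# Hypothetical busy-hour load of 2 Erlangs on 2 circuits.
print(round(erlang_b(2.0, 2), 3))
```

The recurrence form avoids computing large factorials directly, which keeps the calculation stable for big circuit groups.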

Traffic engineering is even more important in the highly dynamic world of IP networks and carriers are able to experience a considerable number of benefits if traffic engineering is taken seriously by management.

  • Cost optimisation: Providing network links is an expensive pastime by the time you take IP equipment, optical equipment, OSS costs and OPEX into account. The more that a network is fully utilised without degradation, the more money can flow to the bottom line.

  • Congestion management: The more badly a network is traffic engineered, whether through under-planning, under-spending or under-resourcing, the more chance there is for network problems such as outages or congestion to impact a customer's experience. The telecoms world is stuffed full of examples of where this has happened.

  • Dynamic services and traffic profiles: Traffic profiles and flows can change quite considerably over a period of time when new services with different traffic profiles are launched without involving network planners. In an age when there is considerable management pressure to reduce time-to-market timescales, this can happen more often than many companies would admit to.

  • Efficient routing: In MPLS and the limitations of the Internet I wrote that one of the strengths of the IP protocol was that a packet could always find a path to its destination if one existed, but that this strength created problems when the service required predictable performance. Traffic engineered networks provide paths for critical services that are deterministic / predictable from a path perspective and from a Quality of Service (QoS) perspective. It would not be an overstatement to say that this is pretty much mandatory in these days of converged Next Generation networks.

  • Availability, resilience and fast restoration: If a network's customers see an outage at any time, the consequences can be catastrophic from a churn or brand image perspective, so high availability is a crucial network metric that needs to be monitored. There is a tremendous difference in perceived reliability between PSTN voice networks and IP networks. For example, tell me the last time your home telephone broke down? It's not that PSTN networks are more reliable than IP networks, they're not, it's just that those PSTN networks have been better designed to transparently work around broken equipment or broken links. Subscribers, to use that old telephony term, are blissfully unaware of a network outage. Of course, if a digger cuts through a major fibre and the SDH backbone ring that is not actually a ring... Well, that's another story.

  • QoS and new services: Real time services need an ability to separate latency-critical services such as VoIP from non-critical services such as email. Traffic engineering is a critical tool in achieving this.

Multi-protocol Label Switching - Traffic Engineering (MPLS-TE)

The term '-TE' is also used to describe other services, notably in the attempts to make Ethernet carrier grade in quality, which are discussed in my posts on PBB-TE and T-MPLS, the latter being built on the back of MPLS-TE (Picture credit: OpNet planning software).

As mentioned above, before the advent of MPLS-TE, carriers of IP traffic relied on the underlying transport for traffic engineering - networks such as ATM. MPLS-TE consists of a set of extensions to MPLS that enable native traffic engineering within an MPLS environment. Of course, this does not remove the need to traffic engineer any layer-1 transport network that MPLS may be transported over. The requirements for MPLS-TE were set out in IETF RFC 2702.

What does MPLS-TE bring to the TE party?

(1) Explicit or constraint-based end-to-end routing: The picture below shows a small network where traffic flowing from the left could travel via two alternative paths to exit on the right. This was specifically the environment that Interior Gateway Protocol (IGP) routing algorithms such as Open Shortest Path First (OSPF) and Intermediate System to Intermediate System (IS-IS) were designed to operate in, by routing all traffic over the shortest path as shown below (Picture credit: NANOG).

This could inevitably lead to problems, with the north path shown above becoming congested while the south path remains unused, wasting expensive network assets. Before MPLS-TE, standard IGP routing algorithm metrics could be 'adjusted', 'manipulated' or 'tweaked' to reduce this possibility. However, doing this could be very complicated and very challenging to manage on a day-to-day basis. Such an approach usually required a network-wide plan. In other words, it is a bit of a horror to manage.
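The shortest-path behaviour described above can be reproduced with a few lines of Dijkstra's algorithm, which is the computation at the heart of link-state IGPs. The topology, node names and link costs below are invented to mirror the two-path picture: a two-hop north path and a three-hop south path, all with equal link costs.

```python
import heapq

# Hypothetical topology: bidirectional links and their IGP costs.
LINKS = {
    ("West", "North"): 10,
    ("North", "East"): 10,
    ("West", "South1"): 10,
    ("South1", "South2"): 10,
    ("South2", "East"): 10,
}


def build_graph(links):
    """Turn a dict of undirected links into an adjacency map."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def shortest_path(graph, src, dst):
    """Dijkstra: return (total_cost, path), or (None, []) if dst is unreachable."""
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbour, path + [neighbour]))
    return None, []


graph = build_graph(LINKS)
cost, path = shortest_path(graph, "West", "East")
print(cost, path)
```

Every flow from West to East gets the same answer, so the south path carries nothing however busy the north path becomes. 'Tweaking' metrics amounts to editing the costs in LINKS until the result changes, which shifts all of the traffic rather than a selected part of it.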

With MPLS-TE, using an extension to the signalling protocol Resource Reservation Protocol (RSVP) known, not surprisingly, as RSVP-TE, explicit paths can be set up to force selected traffic flows over them as shown below.

This deterministic routing helps with reducing congestion on particular links, helps load the network more evenly thus reducing the number of 'orphaned links', ensures optimal utilisation of the network, helps planners separate latency dependent services from non-critical services and better manage upgrade costs (Picture credit: NANOG).

These paths are called TE tunnels or label switched paths (LSPs). LSPs are unidirectional so two need to be specified to handle bi-directional traffic between two nodes.

(2) Constraint Based Routing: Network planners are now able to undertake what is known as Constraint Based Routing, where traffic paths can be computed (aka Path Computation) that meet certain constraints other than simply being the shortest path, which is what drives OSPF and IS-IS. This could be links with the least utilisation, the least delay or the most free bandwidth, or links that utilise a carrier's own, rather than a partner's, infrastructure.
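A common way to implement this kind of path computation is to prune every link that fails the constraint and then run an ordinary shortest-path calculation on what is left, an approach usually called constrained shortest path first (CSPF). The sketch below uses free bandwidth as the only constraint; the topology, costs and bandwidth figures are invented, and a real implementation would consider further constraints.

```python
import heapq

# Hypothetical links: (IGP cost, unreserved bandwidth in Mbit/s), bidirectional.
LINKS = {
    ("West", "North"): (10, 100),
    ("North", "East"): (10, 100),
    ("West", "South"): (15, 1000),
    ("South", "East"): (15, 1000),
}


def cspf(links, src, dst, demand_mbps):
    """Prune links that cannot carry the demand, then run Dijkstra on the rest.

    Returns the path as a list of nodes, or None if no feasible path exists.
    """
    graph = {}
    for (a, b), (cost, free) in links.items():
        if free >= demand_mbps:  # constraint check: enough unreserved bandwidth
            graph.setdefault(a, {})[b] = cost
            graph.setdefault(b, {})[a] = cost
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbour, path + [neighbour]))
    return None


small = cspf(LINKS, "West", "East", 50)      # fits on the cheaper north path
large = cspf(LINKS, "West", "East", 500)     # only the south path has room
too_big = cspf(LINKS, "West", "East", 5000)  # no link can carry this
print(small, large, too_big)
```

The small demand still takes the lowest-cost route, while the large one is steered onto the path that can actually carry it, which is the behaviour plain OSPF or IS-IS cannot provide.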

(3) Bandwidth reservation: DiffServ-aware MPLS-TE (DS-TE) enables per-class TE across an MPLS-TE network. Physical interfaces and TE tunnels / LSPs can be told how much bandwidth can be reserved or used. This can be used to dynamically allocate, share and adjust over time the bandwidth given to critical services such as VoIP and to best effort traffic such as Internet browsing and email.
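To make the idea of per-class reservation concrete, here is a deliberately simplified admission-control sketch for a single link. Each traffic class has its own ceiling and the link has an overall reservable limit; a new reservation is admitted only if both are respected. The class names, limits and the LinkBandwidthPools class itself are hypothetical and are not taken from any router implementation or from the DS-TE specifications.

```python
class LinkBandwidthPools:
    """Per-class bandwidth pools on one link (simplified, hypothetical model)."""

    def __init__(self, max_reservable_mbps, class_limits_mbps):
        self.max_reservable = max_reservable_mbps
        self.class_limits = dict(class_limits_mbps)
        self.reserved = {name: 0 for name in self.class_limits}

    def admit(self, traffic_class, bandwidth_mbps):
        """Reserve bandwidth if both the class pool and the link total allow it."""
        if traffic_class not in self.class_limits:
            return False
        class_total = self.reserved[traffic_class] + bandwidth_mbps
        link_total = sum(self.reserved.values()) + bandwidth_mbps
        if class_total > self.class_limits[traffic_class]:
            return False
        if link_total > self.max_reservable:
            return False
        self.reserved[traffic_class] = class_total
        return True

    def release(self, traffic_class, bandwidth_mbps):
        """Give back previously reserved bandwidth, never dropping below zero."""
        current = self.reserved.get(traffic_class, 0)
        self.reserved[traffic_class] = max(0, current - bandwidth_mbps)


# Hypothetical 1 Gbit/s link with a VoIP pool and a best-effort pool.
pools = LinkBandwidthPools(1000, {"voip": 300, "best_effort": 800})
results = [
    pools.admit("voip", 200),         # fits in the VoIP pool
    pools.admit("voip", 200),         # would take VoIP to 400, over its 300 limit
    pools.admit("best_effort", 800),  # fits the class pool and fills the link
    pools.admit("voip", 50),          # class pool has room but the link is full
]
print(results)
```

The last request shows why the per-class limits alone are not enough: the class ceilings here add up to more than the link can reserve, so the aggregate check is what stops over-commitment.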

(4) Fast Re-Route (FRR): MPLS-TE supports local rerouting around a faulty node (node protection) or faulty link (link protection). Planners can define alternative paths to be used when failure occurs. FRR can reroute traffic in tens of milliseconds, minimising down time. However, although FRR sounds like a good idea, the amount of computing effort required for calculating FRR paths for a complete network is very significant. If a carrier does not have the appropriate path computation tools, using FRR could cause significant problems by rerouting traffic non-optimally to a segment of the network that is already congested rather than a segment that is under-utilised (Picture: An LSP tunnel and its backup. Credit: Wandl).
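For link protection, the planning problem boils down to finding, for each protected link, a detour between its two ends that does not use the link itself. The sketch below does this with a breadth-first search over an invented four-node topology. It ignores bandwidth entirely, which is exactly the shortcut that leads to the congestion risk described above, so a real tool would combine it with capacity checks.

```python
from collections import deque

# Hypothetical topology as a list of bidirectional links.
LINKS = [("A", "B"), ("B", "C"), ("B", "D"), ("D", "C")]


def bypass_path(links, protected_link):
    """Find a detour between the ends of protected_link that avoids that link.

    Returns the fewest-hop detour as a list of nodes, or None if none exists.
    """
    head, tail = protected_link
    graph = {}
    for a, b in links:
        if {a, b} == {head, tail}:
            continue  # the detour must not use the link it protects
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    queue = deque([[head]])
    seen = {head}
    while queue:
        path = queue.popleft()
        if path[-1] == tail:
            return path
        for neighbour in sorted(graph.get(path[-1], ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


backup = bypass_path(LINKS, ("B", "C"))
unprotectable = bypass_path(LINKS, ("A", "B"))
print(backup, unprotectable)
```

The B-C link can be protected via D, but the A-B link has no alternative route at all, which is the kind of gap a planning exercise needs to surface before a failure does. Repeating this for every link and node in a large network is where the computing effort mentioned above comes from.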

There are other additions to MPLS covered by the MPLS-TE extensions but these are minor compared to the ones described above.

Practical use of MPLS-TE

One would imagine that with all the benefits that would accrue to a carrier by using MPLS-TE such as enhancing service quality, enhancing new service deployment and reducing risk, carriers would flock to using MPLS-TE. However, this is not necessarily the case as, with all new technologies, there are alternatives such as:

  • Traditional over-provisioning: Traffic engineering management can be a very complicated task if you attempt to analyse all the flows within a large network. One of the traditional ways that many carriers get round this onerous and challenging task is to simply well over-provision their networks. If a network is geographically constrained or is simply simple, then throwing bandwidth at the network can be seen as a simple and unchallenging solution. Network equipment is so much cheaper than it used to be (and smaller carriers can buy equipment from eBay - cough, cough!). Dark fibre or multiGbit/s links can be bought or leased relatively cheaply as well. So why bother putting in the effort to traffic engineer a network properly?

  • The underlying network does the TE: Many carriers still use the underlying network to traffic engineer their networks. Although ATM is not around as much as it used to be, SDH still is.

  • Stick with IGP adjustment: Many carriers still stick to the simple IGP metric adjustment discussed earlier as it handles the simple TE activities they require. True, many would moan about how difficult this is to manage, but migrating to an MPLS-TE environment could be seen as a costly exercise and they currently do not have the time, resource or money to undertake the transition.

  • Let's wait and see: There are so many considerations and competitive options that the easiest decision to make is to do nothing.

Round up

IP traffic engineering is a hot subject and brings forth a considerable variety of views and emotions when discussed. Many carriers have stuck with methods that they have outgrown but are hesitant about making the jump, thinking that something better will come along. Many try to avoid the issue completely by simply over-provisioning and taking an ultra-KISS approach.

However, those carriers that are truly pursuing a converged Next Generation architecture, with all services based on IP and legacy services carried by pseudowire tunnels, cannot avoid the issue of undertaking real traffic engineering to the degree undertaken by the old PSTN planning departments. To a great extent, this could be seen as the wild child of IP networks growing up to become a mature and responsible adult. The IP industry has a long way to go though, as creating standards can be seen as difficult enough but getting people to use them is something else!

Whatever else, simply sitting and waiting is not the solution...

Addendum: Companies that supply capacity planning, path computation and network optimisation software:

Aria Networks (UK)

"the economics of network control" (USA)

"Making networks perform" (USA)
"You have one network, We have one plan" (USA)

"Wandl wide area design laboratory" (USA)

Addendum: The follow-on article to this post is: Path Computation Element (PCE): IETF's hidden jewel
