Organic growth is great for farming, but not for a network. Designers should build with a specific topology in mind for maximum resilience and faster troubleshooting.
Could you tell another network engineer what the convergence characteristics of your network are? Will you know precisely where to look for problems at 2:00 a.m. when dealing with the dreaded call about a bunch of application sessions that have been dropped?
If your network has grown organically for many years -- here a link, there a link -- then you probably can't. The solution? Let's begin here: Ban organic network growth.
If that sounds radical, think about the problem from another angle. We don't tolerate adding routers and links ad hoc in our data centers; we have specific plans for growing a data center network, and concrete ideas on how large we'll let it get before we build anew. Why should we let the rest of our network grow randomly through organic accretion?
Network engineers need to start by thinking carefully about the relationship between network topologies and requirements. Let's look at one question in this area as an example: How do we build networks that converge very quickly, without risking a control-plane meltdown?
We normally look at this problem through the lens of routing protocol features -- faster timers, exponential backoff, faster calculations, precalculating alternate paths, and tunneling around problems. But what about the design side?
Will one specific topology converge more quickly than another? If so, why? How can we work our designs around choosing the best converging topology for any given protocol? And what about aggregation and summarization -- do they always improve convergence? And if they do, at what cost?
As an example, consider the humble ring topology. Moving from three routers in a ring to five routers in the same ring can create short-lived forwarding loops, called microloops, in OSPF and IS-IS while the routers converge. You might not see much of an increase in convergence time with a link state protocol, but you will see higher jitter numbers, and possibly even dropped packets.
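To make the ring example concrete, here is a minimal Python sketch of why the five-router ring can loop where the three-router ring does not. It rests on several simplifying assumptions that are mine, not a description of any real OSPF or IS-IS implementation: every link has a cost of 1, the destination sits on router 0, the link between routers 0 and 1 fails, and router 1 (which sees the failure locally) updates its forwarding table before any other router does. The function names `ring`, `next_hops`, and `has_microloop` are hypothetical.

```python
from collections import deque


def ring(n):
    """Adjacency sets for an n-router ring with unit-cost links."""
    return {r: {(r - 1) % n, (r + 1) % n} for r in range(n)}


def next_hops(adj, dest):
    """Shortest-path next hop toward dest for every router (BFS, unit costs).

    Equal-cost ties go to whichever neighbor BFS reaches first, which is
    an arbitrary choice made only to keep the sketch short.
    """
    dist = {dest: 0}
    nh = {}
    queue = deque([dest])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in dist:
                dist[v] = dist[u] + 1
                nh[v] = u
                queue.append(v)
    return nh


def has_microloop(n):
    """True if a packet from router 1 to router 0 loops after link 0-1 fails.

    Assumption: router 1 has already recomputed its routes, while every
    other router is still forwarding on its pre-failure table.
    """
    before = ring(n)
    after = {r: set(nbrs) for r, nbrs in before.items()}
    after[0].discard(1)
    after[1].discard(0)

    fib = dict(next_hops(before, 0))   # stale tables everywhere...
    fib[1] = next_hops(after, 0)[1]    # ...except on router 1

    seen = set()
    node = 1
    while node != 0:
        if node in seen:
            return True                # packet revisited a router
        seen.add(node)
        node = fib[node]
    return False


if __name__ == "__main__":
    for size in (3, 5):
        print(size, "routers: microloop" if has_microloop(size) else "routers: no loop")
```

In the three-router ring, router 2 already forwards directly to router 0, so router 1 can hand it traffic safely. In the five-router ring, router 2's shortest path still runs through router 1 until it recomputes, so the two bounce packets between them for that interval. A four-router ring is a tie case whose outcome depends on how equal-cost paths are handled, which this sketch does not model faithfully.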
For completely different reasons, increasing a ring from three to five routers will have a dramatic effect on EIGRP convergence, forcing the protocol to drop into the query process to find an alternate path, and possibly more than doubling the convergence time.
Which topology provides more scaling flexibility? Are there specific topologies that have a definite size limit, and others that can be scaled "almost infinitely"? Or are all topologies constrained by some set of scaling limits?
Which topology will provide the most flexibility in traffic engineering options? The price of moving traffic onto specific links, or of managing traffic levels on each link, is not only added control plane complexity, but also increased stretch -- the average number of hops a packet must travel to pass from source to destination. Are some topologies more able to support traffic engineering with minimal added stretch?
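As a rough illustration of stretch, the Python sketch below measures the average hop count between all router pairs in a topology, and the extra hops a packet travels when it is steered onto a path other than the shortest one. The unit-cost links, the five-router examples, and the helper names (`hop_counts`, `average_hops`, `added_stretch`) are assumptions made for the illustration, not part of any routing protocol or tool.

```python
from collections import deque


def ring(n):
    """Adjacency sets for an n-router ring."""
    return {r: {(r - 1) % n, (r + 1) % n} for r in range(n)}


def full_mesh(n):
    """Adjacency sets for n routers that are all directly connected."""
    return {r: {o for o in range(n) if o != r} for r in range(n)}


def hop_counts(adj, src):
    """Shortest hop count from src to every reachable router (BFS)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_hops(adj):
    """Average shortest-path hop count over all ordered router pairs."""
    total = 0
    pairs = 0
    for src in adj:
        for dst, hops in hop_counts(adj, src).items():
            if dst != src:
                total += hops
                pairs += 1
    return total / pairs


def added_stretch(adj, path):
    """Extra hops on an engineered path compared with the shortest path."""
    shortest = hop_counts(adj, path[0])[path[-1]]
    return (len(path) - 1) - shortest


if __name__ == "__main__":
    print("ring of 5, average hops:", average_hops(ring(5)))
    print("full mesh of 5, average hops:", average_hops(full_mesh(5)))
    # Steer traffic from router 1 to router 0 the long way around the ring.
    print("added stretch:", added_stretch(ring(5), [1, 2, 3, 4, 0]))
```

Under these assumptions, a five-router ring averages 1.5 hops per pair while a full mesh averages 1.0, and pinning traffic from router 1 to router 0 the long way around the ring costs three extra hops. Comparing candidate topologies this way is one simple method for estimating how much stretch a traffic engineering policy might add.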
I'll talk about the intersection of routing protocol operation, complexity, and network topology at Interop Las Vegas in the session Converging on Topology: Laying the Floorplan. Come join me and think about network design in a different way.
Register now for Interop Las Vegas. Use the code SMBLOG to get $200 off the current price of Total Access and Conference Passes.
Russ White is a Principal Engineer in the IPOS team at Ericsson. He has worked in routing protocols and routed network design for the last 15 years. Russ has spoken at Cisco Live, Interop, LACNOG, and other global industry venues, is actively involved in the IETF and the ...