About a year ago, Google's Amin Vahdat presented their SDN solution, namely its Andromeda incarnation. As this year's presentation shows, it's one in a long list of things they've done in this space, which makes it quite interesting.

Amin took the stage at ONS again and gave an even more interesting presentation about Google's DC networks: some history, requirements/constraints and design goals.

You should really watch the videos, especially the latest one. Below are the bits that caught my attention from a design perspective:

  • operational complexity of "box-centric" deployments is high - individual boxes and individual CLIs don't scale
  • Clos topology with merchant silicon (no need for massive RIBs and deep buffering)
  • centralized control because you built the fabric yourself, using software agents:
    • that know what the network looks like
    • that can affect routing along those lines
    • that can react to exceptions from the plan
  • if you know what the network looks like, it's much easier to keep it that way than to keep re-discovering how it is changing
  • need to keep the network from becoming unbalanced:
    • some resource is scarce and it limits your value
    • other resources are idle, increasing your cost
  • at DC scale, the scarce resource is the network
  • latency-wise, flash needs 100us latency, otherwise expensive servers sit idle while waiting for IO
  • upgradability at scale - without service impact - upgrade server connectivity, network capacity
  • optics drive the cost of your network (ideally you'd get optical packet switching so you don't need all of this optical-electrical-optical-electrical nonsense)
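
The "know what the network looks like and react to exceptions from the plan" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's implementation: the helper names (`link`, `build_plan`, `find_exceptions`), the two-tier leaf/spine fabric and the switch names are all hypothetical. The point is that when the intended topology is known up front, a controller only has to diff it against what the agents report instead of discovering the topology from scratch.

```python
from typing import NamedTuple


class Link(NamedTuple):
    """An undirected link between two switches (hypothetical model)."""
    a: str
    b: str


def link(x: str, y: str) -> Link:
    # Normalize endpoint order so (x, y) and (y, x) are the same link.
    first, second = sorted((x, y))
    return Link(first, second)


def build_plan(spines: list[str], leaves: list[str]) -> set[Link]:
    # Assumed intended topology: a two-tier Clos where every leaf
    # connects to every spine.
    return {link(s, l) for s in spines for l in leaves}


def find_exceptions(plan: set[Link], observed: set[Link]) -> dict[str, list[Link]]:
    # Exceptions from the plan: links that should exist but were not
    # reported, and links that were reported but are not in the plan.
    return {
        "missing": sorted(plan - observed),
        "unexpected": sorted(observed - plan),
    }


if __name__ == "__main__":
    plan = build_plan(["spine1", "spine2"], ["leaf1", "leaf2", "leaf3"])

    # Pretend the agents reported one link down and one miscabled link.
    observed = set(plan)
    observed.discard(link("spine2", "leaf3"))
    observed.add(link("leaf1", "leaf2"))

    for kind, links in find_exceptions(plan, observed).items():
        for l in links:
            print(f"{kind}: {l.a} <-> {l.b}")
```

Everything that matches the plan produces no work at all; only the deviations need attention, which is what makes keeping a known state cheaper than continuous re-discovery.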

And, as always, thanks for reading.



