Network services, the problem statement (part 3)
If you’ve been following my last couple of blog posts, you know that I’ve been writing a three-part series on virtualizing network services, the problems they solve, and how to balance cost savings against multi-tenancy. The last two posts were about the problem and multi-tenancy. Finally, let's look at where Embrane fits in this picture.
A servant of many masters
Forrest Gump’s momma wisely predicted that “the cloud market is like a box of chocolates, you never know the next customer you’ll get”. And service providers have to be prepared to serve customers that need villas, customers that need a room in a college dorm, and customers that will just be happy with their little one-bedroom condo. The only trait all customers share is that over time, their situation will evolve and their needs will change. In the worst case, they will just go away on very short notice, but in any case, the service provider will be left with all the extra real estate to repurpose (be it in the form of physical boxes, “shares” of a box, or virtual appliance license keys).
Now, before you forget this is not a real estate blog, let’s get back to L4-7 network services. It’s tempting to try and share the same infrastructure among all classes of customers, regardless of their needs. After all, isn’t “cloud” about sharing an underlying physical infrastructure among a large number of users to take advantage of the efficiency gained by maximizing utilization of each single piece of hardware in a data center? If you want to build an undifferentiated, best-effort commodity cloud, the answer is “yes”. But if your goal is to offer multiple service levels tailored to different classes of use cases and customers, then the answer is probably more complex.
Sharing appliances is hard. Best practices use knowledge of usage patterns (input to the analysis) to identify the optimal appliance characteristics (output of the analysis) for each specific use case. Unfortunately, if you are offering cloud services, your ability to predict usage patterns is challenged by your inability to predict who’s going to use your services (let alone what for).
If you’re dealing with an L2/3 network, you use statistical analysis to define appropriate input metrics for packet size mixes and inter-arrival times, or rely on things like IMIX (http://en.wikipedia.org/wiki/Internet_Mix) and cross your fingers. The underlying assumption is that L2/3 processing of individual packets is relatively simple, and basic switching overhead can be defined as just a function of those two parameters. “Usage patterns” is synonymous with “traffic patterns”: if you get a bigger share of large packets your throughput goes one way, if you get more small packets your throughput goes another way; the longer the bursts, the heavier the impact on throughput and latency. Mastering predictions of packet size mixes and inter-arrival times enables you to perfect how efficiently your individual switches and routers are used. It doesn’t take that much to make everyone happy, does it? (Well, at least until the day someone enables ERSPAN on that switch, whoops.)
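To make the "throughput is a function of the packet mix" point concrete, here is a minimal Python sketch. It assumes a device whose only limit is a fixed packets-per-second budget and ignores inter-arrival times and burstiness entirely. The 7:4:1 mix of 40, 576 and 1500 byte packets is the commonly cited "simple IMIX" style profile; the function names and the 1 Mpps budget are made up for illustration.

```python
def mean_packet_bytes(mix):
    """Weighted average packet size for a mix of (size_bytes, weight) pairs."""
    total_weight = sum(weight for _, weight in mix)
    return sum(size * weight for size, weight in mix) / total_weight


def throughput_mbps(packets_per_second, mix):
    """Throughput in Mbps if the device forwards a fixed number of packets per second."""
    return packets_per_second * mean_packet_bytes(mix) * 8 / 1e6


# Illustrative traffic profiles (assumptions, not measurements).
SIMPLE_IMIX = [(40, 7), (576, 4), (1500, 1)]   # mostly small, a few large packets
SMALL_PACKETS = [(64, 1)]                       # everything is tiny
LARGE_PACKETS = [(1500, 1)]                     # everything is full-size

PPS_BUDGET = 1_000_000  # hypothetical forwarding budget of the device

for name, mix in [("imix", SIMPLE_IMIX), ("small", SMALL_PACKETS), ("large", LARGE_PACKETS)]:
    print(f"{name}: {mean_packet_bytes(mix):.1f} B avg, "
          f"{throughput_mbps(PPS_BUDGET, mix):.0f} Mbps")
```

Same box, same packet budget, very different throughput numbers depending on who shows up. That is the whole L2/3 sizing exercise in a nutshell.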
When it comes to L4-7, “usage patterns” are not defined by traffic patterns alone; feature configuration has a dramatic impact on the performance of your device. In other words, processing overhead for a packet (or a flow) is heavily influenced by how you’ve configured the device. While I’m not revealing any earth-shattering new truth here, people tend to forget what that implies when it comes to the exercise of sharing L4-7 network services across multiple tenants. RBAC is great, but it’s the data path that will cause all the headaches. Tenants won’t only impact the performance of the box by injecting irregular, unpredictable traffic patterns; they will also impact it by injecting widely different processing overheads on their flows (i.e., by configuring features). The idea that you can use L2/3 principles to share your L4-7 devices is flawed by blindness to this latter fact. The usual workaround: don’t let tenants configure the “expensive” features.
You must have heard about the nightmares of dealing with the traffic mix of gaming applications (many, many tiny little packets). Your roaring 4Gbps firewall is on its knees at 600Mbps for those darn 50-odd-byte packets (assuming it tops out at 1.5Mpps), and so much for the other tenants riding on the same firewall. It feels like a DoS; instead it’s just the joy of not knowing who’s coming for dinner. And just to make sure I set your expectations straight, this example belongs to the “simple problem” of L2/3 sharing (a basic bandwidth allocation problem), not the harder problem of L4-7 sharing.
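The arithmetic behind that 600Mbps figure is worth spelling out. The sketch below assumes the firewall is bounded by two numbers only: its rated 4Gbps throughput and a 1.5Mpps packet-processing ceiling. Real devices are messier than that, and the function name is made up for illustration.

```python
RATED_MBPS = 4000.0      # the "roaring 4Gbps" on the data sheet
PPS_CEILING = 1.5e6      # assumed packet-processing limit (1.5Mpps)


def effective_throughput_mbps(packet_bytes):
    """Throughput the box can sustain for a given packet size, in Mbps.

    Whichever limit is hit first wins: the rated bandwidth or the
    packets-per-second ceiling converted to bits per second.
    """
    pps_limited_mbps = PPS_CEILING * packet_bytes * 8 / 1e6
    return min(RATED_MBPS, pps_limited_mbps)


print(effective_throughput_mbps(50))    # gaming-style tiny packets: 600.0
print(effective_throughput_mbps(1500))  # full-size packets: 4000.0
```

With 50-byte packets the packet ceiling kicks in long before the bandwidth does, which is why one tenant with a small-packet workload can eat the capacity everyone else was counting on.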
Long story short, if you’re not content with the risks of best effort, we advocate the use of dedicated devices for your L4-7 needs. But, of course, there are catches.
If you take a “static” view of the infrastructure, dedicated physical appliances (the first quadrant) are not economical for most cloud use cases, while dedicated virtual appliances (the second quadrant) have limited applicability due to their inherent CPU resource constraints. The gap between the cost to tenants of a physical appliance and the performance ceiling of a virtual appliance is wide, and that’s one way to look at the discontinuity between the first and the second quadrant.
If you take the “dynamic” view of the infrastructure, supporting customers’ evolving needs is not pain-free with either physical or virtual appliances. While we’ve gotten used to the idea that to scale an application you just need to add more servers (and forget that the application must have been designed that way for this neat trick to work), scaling an L4-7 network service for a specific tenant is not as simple as throwing more appliances at the problem.
So, while Embrane recognizes the significant value of dedicated tenancy of appliances for L4-7 functions, it also recognizes the strict limits imposed by the existing form factors (physical and virtual appliances), which make dedicated tenancy practical for only a very small subset of use cases. Anything that doesn’t fit has to be pushed to shared tenancy. That’s an awful lot of use cases, a.k.a. missed opportunities: potential customers that won’t pick your cloud.
Closing the wide gap between first and second quadrant becomes an important call to open the road toward dedicated tenancy in the cloud. A call to look at the inflexibility of the existing form factors and to create an alternative that puts flexibility of deployment at the top of the agenda.
Where do we go from here?
Until yesterday, the 4 isolated quadrants represented the harsh reality of the static and discontinuous universe of L4-7 network services. If you’re a service provider, given a tenant, pick one form factor and one tenancy model, and you’ve confined the tenant to one quadrant. The tenant’s needs will change, while your ability to fulfill them with some continuity will be limited to the quadrant you initially picked. Over time you will end up with the wrong tenant in the wrong quadrant: if you choose to move her to another quadrant, that signifies discontinuity for the tenant; if you don’t, it signifies inappropriate service levels (and eventually dissatisfaction) for that tenant and many others in the same quadrant.
Sometimes service providers unknowingly pick a single quadrant for all their customers, and build the entire infrastructure confining all their network services to that quadrant. They can claim differentiation because they picked a different quadrant than their competitors. Unfortunately, offering network services that span multiple quadrants typically means building separate network services infrastructures, separate processes and different orchestration tools per quadrant.
Embrane heleos was built to enable flexible L4-7 network services deployments and remove the discontinuities across quadrants. The same physical infrastructure can host L4-7 network services for different classes of customers, and follow each customer’s requirements as they evolve. You’ll be able to offer your customers a college dorm when they study, a one-bedroom apartment after college, a three-bedroom when their first child is born, or a real villa if that’s the lifestyle they want. Instead of moving them from house to house, rooms will join their current homes when they need them. And, once you know you don’t have to worry about hot water, you’ll be able to let them enjoy all the features they want from the L4-7 network services vendors they trust, running on the heleos platform.