Hairy situation? Blade servers can reach where others can't

You down with OCP? Yeah you know me

Any follower of today's technology magazines will have heard a lot about Open Compute Project (OCP) servers. These are servers stripped down to the bare minimum, crammed into a single chassis and managed centrally through software that provides high automation and data centre-scale orchestration.

In an OCP world, cost reduction is paramount. The result is that everything really works well only at huge scale, and even then where each node is the same as the next.

With everything pared down to the minimum, making efficient use of these servers requires a dedicated team of operations nerds to script, automate and tweak every last item in-house.

At the opposite end of the spectrum is the legacy server: one server per chassis, designed to handle anything you can throw at it. These servers have video cards, sound cards, myriad USB ports, SATA, SAS and everything else under the sun.

Legacy servers are far more expensive per node than any other option on the market, but they are also far more versatile. The downside is that each typically requires individual management, and they are the least dense option available.

Somewhere between OCP and legacy we have the "unblades". These are systems such as SuperMicro's Twin servers. Like OCP servers, unblades aim for middling to high density and they share power and cooling.

Unblades are somewhat stripped down, but not nearly as much as OCP nodes. They don't share networking, and management is similar to that of legacy nodes: everything is done on a per-server basis.

It would seem that between these three categories, we have covered the bulk of the market, from pile 'em high, sell 'em cheap through to expensive but versatile. You could be excused for believing there is no room in there for blade servers, but I think you would be wrong.

The big easy

Blade servers are about ease of use, nothing more, nothing less.

Cast about and you get a bunch of marketing mumbo jumbo about how "blade servers offer a lower operating cost than running standalone servers", backed by every metric imaginable. Blades reduce data centre footprint, they reduce your overhead by allowing for a wire-once infrastructure… And on and on it goes.

Some of that may be valid, but the real benefit of blades is far simpler to understand. As a class of system, blades are far less of a pain in the ass than anything else on the market.

The secret is in the management software that is baked into the chassis. Though implementations vary, as a general rule blades can be counted on to do a few key things that make everyone's life easier.

The first is that they can assign things like MAC addresses and WWNs on a per-slot basis. They can also generally fail these configurations over to a hot spare node.

Think about what this means in practice. If your blades are configured to grab their operating system from a SAN instead of from local storage, then all that is needed to ensure that any given node boots up with the relevant operating system instance is for the MAC and/or WWN of the system to be correct.

If a physical blade dies for whatever reason, the chassis can pick a hot spare from the pool, assign the MAC or WWN to it, and the replacement will boot up with the relevant operating system image: basically VMware's high availability, but for physical servers.

If you don't have a hot spare node, that is okay. Blade administration software can be configured to assign MACs and WWNs to given slots in the chassis. Simply pull out the dead node, insert a fresh one and you are back in business.
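
To make that concrete, here is a minimal sketch of the idea in Python. The class, its field names and its failover logic are illustrative assumptions, not any particular vendor's chassis API.

```python
# Hypothetical sketch of per-slot identity profiles and hot-spare failover.
# Real chassis managers differ per vendor; this only models the concept.

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SlotProfile:
    """The identity lives with the slot, not with the physical blade."""
    mac: str        # MAC address the PXE/SAN-boot infrastructure expects
    wwn: str        # WWN the storage array's zoning and LUN masking expect
    boot_lun: str   # LUN holding the operating system image


@dataclass
class Chassis:
    profiles: dict = field(default_factory=dict)   # slot number -> SlotProfile
    healthy: set = field(default_factory=set)      # slots with working blades
    hot_spares: set = field(default_factory=set)   # idle spare slots

    def assign(self, slot: int, profile: SlotProfile) -> None:
        """Bind an identity profile to a slot."""
        self.profiles[slot] = profile
        self.healthy.add(slot)

    def fail_over(self, dead_slot: int) -> Optional[int]:
        """Move a dead slot's identity to a hot spare, if one exists."""
        if dead_slot not in self.profiles:
            return None
        self.healthy.discard(dead_slot)
        if not self.hot_spares:
            return None              # no spare: wait for a replacement blade
        spare = self.hot_spares.pop()
        self.profiles[spare] = self.profiles.pop(dead_slot)
        self.healthy.add(spare)
        # The spare now presents the old MAC and WWN, so it boots the same
        # operating system instance straight from the SAN.
        return spare

    def replace_blade(self, slot: int) -> None:
        """A fresh blade pushed into a profiled slot inherits its identity."""
        if slot in self.profiles:
            self.healthy.add(slot)
```

The identity follows the profile rather than the sheet metal, which is why neither the SAN zoning nor the DHCP reservations have to change when the hardware does.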

This ease of use is key to the widespread adoption of blades, and it increasingly underpins the ethos of modern designs.

Sharing the load

If there is a reason to use blades beyond the ease of use provided by the management software, it is that everything on a blade chassis is shared. If there is a reason not to use blades, it is that everything on a blade chassis is shared. This is the yin and yang of blades.

Similar to OCP servers and unblade chassis, blades have shared redundant elements such as PSUs and fans. Unlike those others, blade chassis have integrated shared networking and chassis-level management, all connected via a shared midplane that is, in theory, a single point of failure.

If you are properly paranoid about single points of failure, then you buy into blades with at least two half-filled chassis, and you always expand so that across all your chassis there is a minimum of one chassis' worth of free space (or hot spare nodes).

This seems expensive when you are talking about two chassis. It is irrelevant if your minimum purchase size is a rack.
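
As a back-of-the-envelope illustration of that spare-capacity rule, here is a quick Python sketch; the 16-slot chassis size and the node counts are assumptions for illustration, not a vendor spec.

```python
# Rough sketch of the "keep one chassis' worth of free slots" rule.
# The 16-slot chassis size is an assumption for illustration.
import math


def chassis_needed(active_nodes: int, slots_per_chassis: int = 16) -> int:
    """Smallest chassis count whose free slots cover one whole chassis."""
    chassis = max(2, math.ceil(active_nodes / slots_per_chassis))
    while chassis * slots_per_chassis - active_nodes < slots_per_chassis:
        chassis += 1
    return chassis


print(chassis_needed(16))   # 2: two half-filled chassis
print(chassis_needed(20))   # 3: two chassis would leave only 12 free slots
```

With two chassis the headroom is half of what you bought; across a rack of chassis it shrinks to a small fraction of the purchase.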

If you can live with a small amount of risk, you will probably find that the incidence of chassis failure on modern blade systems is statistically indistinguishable from zero.

Most blade designs have a completely passive midplane: basically copper traces and lacquer on a slab of circuit board, with no active components to wear out.

In the unlikely event that the midplane does die, modern blade designs have field replaceable midplanes. Not exactly hot swap, but not the end of the world either. As for the other bits – fans, networking, administration modules and so forth – you will find blade chassis to be remarkably redundant.

In the meantime, you can cram 16 servers into a single chassis, all connected by an east-west switch that ensures fast communication between nodes in the chassis, with fat north-south uplinks to the top-of-rack switch. They all share redundant power supplies, cooling and so forth.

Bare metal

Blades are well known for density applications: many CPUs crammed into a small space. If you have the data centre to handle that kind of density, then scale-out applications, such as high-traffic web services or high-intensity analytics, are good candidates for running on the metal of your blades.

Memory-hungry applications, or ones with virtualisation licensing that was designed by hellish ogres, will also find a home in blades. Typically we see a panoply of IT alphabet soup running bare on the metal: business intelligence, enterprise resource planning, service-oriented architecture, product data management, online transaction processing and so on.

Companies whose software still typically runs on the metal include Oracle, SAP, Siemens, IBM (WebSphere), and Microsoft (Exchange). In some cases (Oracle) the hellish licensing ogres are the reason. In others (Microsoft) the developers are caught in a time warp and have allowed fear of virtualisation to become a neurosis.

Whatever the reason, there are plenty of workloads that call for physical servers but need the sort of management ease of use, high availability and failover that you typically get only with virtualisation. These workloads shouldn't be run on anything except blades.

If you have a bunch of automated scale-out applications managed by a centralised orchestration system, go hard with OCP. Caringo can build you clusters of object-oriented multicasting storage madness by PXE booting bare-bones boxes. I have built Linux render farms along the same lines.

But if you are running workloads where the operating system itself matters – its configuration, data, patch level and so forth – then don't run that on stripped down gear. Either run those workloads on a well-managed virtualisation platform or run them on proper enterprise blades.

If you need local storage acceleration in the blades for your workloads, your vendors should be able to accommodate you. HP's PCI-E ioDrive2 Flash Mezzanine Cards are a great example.

If you have workloads that need lots of local storage, I will point you to a YouTube video of an HP Storageworks Blade. Sometimes seeing makes it all make sense.

If you really, really need high availability, HP also makes a fault-tolerant blade. Although expensive, it is pretty much the coolest thing ever.

Back to the backplane

The strength of the case for running bare-metal workloads on blades doesn't mean there is no case for virtualisation on them. It does mean the argument that blades provide high availability for workloads becomes largely irrelevant in a virtualised world.

What blade vendors do through chassis admin modules and configuration cloning, VMware does in the hypervisor. Your standard unblade system or even legacy server gets you every bit as far down that road as a full-on blade chassis does.

Blades have an advantage here that other setups don't: that shared networking backplane. A blade chassis is essentially a cluster in a box. When all the nodes of a cluster are in the chassis, they share a screaming-fast backplane for intra-cluster communication and a fat-ish trunk for off-cluster communications.

In the unblade or legacy server world, every node has a bunch of networking spaghetti to deal with, and the costs of servers plus networking can easily run beyond the costs of going to blades.

Plus there is still that sexy management of the physical nodes thing. It is a time saver, even when you have gone to virtualisation. This may not matter much when you have only eight nodes to worry about. It absolutely does when you have eight racks.

Clouds everywhere

The future will be increasingly about connectivity in the blade chassis. Webscale servers are increasing demand for 40GbE per blade and high-intensity workloads are coming off traditional SANs and moving to all-flash server SANs for storage. This calls for bigger trunk links off the chassis and better networking within the chassis.

Individual blade nodes (as well as the networking modules) need to worry about emerging standards such as RDMA over Converged Ethernet (RoCE), Virtual eXtensible LAN (VXLAN) and Network Virtualisation using Generic Routing Encapsulation (NVGRE).
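
To give a feel for what one of those standards adds to the wire, here is a minimal Python sketch of the eight-byte VXLAN header from RFC 7348. The inner frame below is a placeholder, and the outer UDP/IP encapsulation that a real VTEP would add is omitted.

```python
# Minimal sketch of VXLAN (RFC 7348) encapsulation. The inner frame is a
# placeholder; a real VTEP would also wrap the result in UDP/IP.
import struct

VXLAN_UDP_PORT = 4789          # IANA-assigned destination port
VXLAN_FLAG_VNI_VALID = 0x08    # "I" flag: the 24-bit VNI field is valid


def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prefix an inner Ethernet frame with the 8-byte VXLAN header."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI is a 24-bit value")
    # Byte 0: flags, bytes 1-3: reserved, bytes 4-6: VNI, byte 7: reserved.
    header = struct.pack("!B3s3sB",
                         VXLAN_FLAG_VNI_VALID,
                         b"\x00\x00\x00",
                         vni.to_bytes(3, "big"),
                         0)
    return header + inner_frame


# A 24-bit VNI gives about 16 million segments versus 4,094 usable VLAN IDs.
print(len(vxlan_encapsulate(b"\x00" * 60, vni=5000)))   # 68 bytes
```

NVGRE does much the same job with a 24-bit identifier carried in a GRE key rather than over UDP, while RoCE is about making the east-west fabric fast enough for the storage and analytics traffic riding on it.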

Data centres that were designed in the era of low-density legacy servers or mid-density unblade systems will have to be retooled to handle much higher-density blade systems with ever more powerful networking and integrated storage.

The ability to consolidate all your workloads into higher density means nothing if your data centre can't dish out the required power and cooling per rack.

Management of blade systems is also increasingly merging with hyperscale management components. Managing a single chassis is not enough; even a single domain of multiple chassis is too small.

Multi-domain, multi-chassis management that integrates directly with hyperscale systems such as OpenStack, vCloud and Azure on premises will allow blades to challenge OCP servers by making hyperscale clouds something everyone can build, not just those who can afford a room full of orchestration PhDs.

For the foreseeable future, there will be life in the blade concept. ®
