HAL in the Datacenter

Full Datacenter Automation – minus the AI (for now)

In Arthur C. Clarke’s 2001: A Space Odyssey, HAL 9000 was the AI. Everyone knows that. But more relevant to today’s automation efforts in the datacenter, HAL also controlled all the systems on the spaceship. That, in the long run, is where we’re headed with operations automation and DevOps. We want not just management tools that are automated, but orchestration tools that are too, and the more automation the better. In the end, it will still take a human to replace physical hardware, but everything software-related, from the beginning of installation to the end of life of the app, would ideally be automated with as little human intervention as possible.

This still makes some operations folks antsy. Frankly, it shouldn’t. While the ability to quickly deploy systems, upgrade systems, and fix some common problems – all automated – will cut the hours invested in deploying and configuring them, it will not cut total hours. The reason is simple: look at what else we’re doing right now. Cloud is being integrated into the datacenter, meaning we have double the networks and security systems to maintain; if the cloud is internal, we have another whole layer of applications to maintain – easier today with VMware than OpenStack, but neither is free in terms of man-hours – and on top of that we’re working on continuous integration. That’s all before you do anything specific to your business or market. But enough, I digress.

The thing about server provisioning, no matter how thorough it is, is that application provisioning is a separate step. If you’ve been reading along with my thoughts on this topic here and at DevOps.com, then you know this already.

By extension, the thing about Full Layered Application Provisioning – FLAP, as presented at DevOps.com – is that it too leaves us short. You have a server, fully configured. You have an application (or part of an application in clustered-services scenarios, or multiple applications), and it is ready to rock the world. Everything on that box, from the RAID card to the app GUI, is installed and configured… but the infrastructure around it hasn’t been touched.

This is a problem most of the marketplace recognizes. If you look closely at application provisioning tools like Puppet and Chef, you can see they are integrating networking infrastructure configuration into application provisioning through partnerships.

This is a good thing, but it is not at all clear that application provisioning is the right place in the operations automation stack for this type of configuration. While you could make a case that the application owner knows what they need in terms of security, network, and remote disk, you could also make the case that because these are limited resources placed on the corporate network for shared use, something higher in the stack than the application provisioning tool should be handling their configuration.

Interestingly, in many of these cases the real work is integrating the automation of the tool in question with your overall processes. One of the last projects I worked on at F5 Networks was calling their BIG-IQ product’s APIs to tell it to do what I needed, when I needed it, as part of a larger project. This is pretty standard for the orchestration piece of the automation puzzle, and the existence of these types of APIs explains the move by application provisioning vendors to put this control into their systems.
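
To make the pattern concrete, here is a minimal Python sketch of that kind of orchestration step using the requests library. The host, endpoint path, credentials, and payload are placeholders for illustration – they are not actual BIG-IQ API routes – but the shape of the call (authenticate, POST the desired configuration, check the response) is the same regardless of the product behind the API.

```python
# Minimal sketch of driving a management product's REST API as one step in a
# larger orchestration flow. The host, endpoint path, credentials, and payload
# below are placeholders for illustration, not actual BIG-IQ API routes.
import requests

MGMT_HOST = "https://bigiq.example.com"   # hypothetical management host
AUTH = ("admin", "secret")                # use vaulted credentials in practice

def deploy_app_service(app_name, pool_members):
    """Ask the management tool to stand up load balancing for a new app."""
    payload = {"name": app_name, "members": pool_members}
    resp = requests.post(
        f"{MGMT_HOST}/mgmt/example/app-services",  # placeholder endpoint
        json=payload,
        auth=AUTH,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

# Called by the orchestration layer once servers and the app are provisioned:
# deploy_app_service("billing-web", ["10.0.10.11:443", "10.0.10.12:443"])
```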

Let’s stop for a moment and talk about what we need to have in place to build a HAL-like control layer. There is a combination of pieces that can be divided numerous ways (and let me tell you, while writing this I worked through graphics and whiteboard drawings to reflect most of those ways). Assume in the following diagrams that we are not simply talking about deployment, but also about upgrades and redeployment to recover after a hardware error or software instability. That simplifies the drawings enough that I can fit them into a blog, and it is useful for our discussion.

Assume further that the diagram also has a “Public Cloud” section that contains only the top part of the private cloud stack – with no infrastructure on site and in the realm of operations’ responsibility, it begins with “Instance spin up” – but otherwise the required steps are the same.

In an attempt to keep this image consumable, you will notice that I ignored the differences in configuration between VM and container provisioning. There are differences, but there are more similarities from a spin-up perspective – server provisioning products like Cobbler and Stacki treat both as servers, for example. Truth be told, from an operations perspective containers lie somewhere between cloud (pick a pre-built image and spin it up) and VMs (install an image and the apps that run on it). Containers should have their own stack in the diagram, but as you can see it was getting rather tight, and since they share traits with the other two, I decided to lump them in with one of them.

Those familiar with both cloud and VMs will take issue with my use of “OS Provisioning” for both – they use entirely different mechanisms to achieve that step – but both do need configuration done on the OS, so I chose to include it. A cloud image needs its IP, connections, and storage all set up, and some pre-built cloud images actually take a lot of post-spin-up configuration, depending upon the purpose of the image and what technologies it incorporates. So while on the VM side provisioning includes OS install and configuration, on the cloud side it involves image spin-up and configuration.

And even this image doesn’t give us the full picture for datacenter automation. If we shrink this image down to a box, we can then use the following to depict the overall architecture of a total automation solution:

In this diagram, “Server Provisioning” is the entire first diagram of this blog, and the other boxes are external items that need configuration – NAS or SAN disk creation (or cloud storage allocation), application security policy and network security configuration, and the overall network (subnet config, VLAN config and inter-VLAN routing, etc.). These things could be kept in the realm of manual configuration, because they don’t generally change as often as the servers utilizing them, but they can be automated today… The question is whether it’s worth it in your environment, and I don’t have that answer – you do, of course.

We’re moving more and more in this direction, where you as an administrator, ops, or DevOps person will say, “New project. Give me X amount of disk, Y ports on a VLAN, apply these security policies, and allocate Z servers to it – two as web servers, the rest as application servers with this engine.” And that will be it; the environment will spin up. Long term, the environment will spin up in spite of errors, but short term, the error-correction facility will be that subset known in some other great sci-fi books as meatware.
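
As a thought experiment, here is a minimal Python sketch of what such a request could look like, and how an orchestration layer might fan it out to the separate automation layers. The field names and helper functions are purely illustrative stand-ins for real tool APIs (server provisioning, SAN/NAS allocation, network and security configuration), not any particular product’s schema.

```python
# Illustrative declarative request and a driver that fans it out. The helpers
# are placeholders standing in for real storage, network, security, and server
# provisioning APIs.
def allocate_storage(project, size_gb):
    print(f"[storage]  {size_gb} GB allocated for {project}")       # placeholder

def configure_network(project, vlan, ports):
    print(f"[network]  VLAN {vlan}, {ports} ports for {project}")   # placeholder

def apply_policies(project, policies):
    print(f"[security] applying {policies} to {project}")           # placeholder

def provision_servers(project, role, count, image):
    print(f"[servers]  {count} x {role} from image {image} for {project}")  # placeholder

new_project = {
    "name": "project-x",
    "storage_gb": 500,                         # "X amount of disk"
    "network": {"vlan": 220, "ports": 8},      # "Y ports on a VLAN"
    "security_policies": ["web-default"],
    "servers": {                               # "Z servers: two web, the rest app"
        "web": {"count": 2, "image": "web-base"},
        "app": {"count": 4, "image": "app-engine"},
    },
}

def spin_up(spec):
    """Hand each part of the request to the matching automation layer."""
    allocate_storage(spec["name"], spec["storage_gb"])
    configure_network(spec["name"], **spec["network"])
    apply_policies(spec["name"], spec["security_policies"])
    for role, cfg in spec["servers"].items():
        provision_servers(spec["name"], role, cfg["count"], cfg["image"])

spin_up(new_project)
```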

What can you do to prepare for this future? Well, the best first step is to get server provisioning down – first with hardware and VMs, because they’re basically the same and someone will always have to spin up the hardware – then get it down with cloud and Docker. Finally, become an expert in one of the application provisioning tools. In essence, the contents of that first diagram are very real today, while the bits added in the second are evolving rapidly as you read this, so work on what’s real today to increase understanding and speed adoption. It helps that doing so will (after implementation) free up some time.

Of course I have my preferences for what you should learn (I DO work for a hardware/server provisioning vendor, after all), but I would refer you to my DevOps.com articles linked above for a more balanced look at what might suit your needs if you haven’t already started down this path.

The other thing you can do is start looking at logging and monitoring facilities. They will be an integral part of any solution you consider – you cannot resolve problems on systems that just sprouted up on demand unless you can review the logs and see what went wrong. In an increasingly complex environment, that is truer than ever. I’ve seen minor hardware issues bury an entire cluster, and without log analysis that would have been hard to track down.
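
As a small illustration of that point, here is a sketch using only Python’s standard library: an instance that ships its logs to a central collector from the moment it starts, so the trail survives even if the instance itself is torn down and redeployed. The collector host and port are assumptions for the example.

```python
# Ship logs off-box from the start, so on-demand systems leave a trail that
# outlives them. The collector address below is an assumption for illustration.
import logging
import logging.handlers
import socket

collector = logging.handlers.SysLogHandler(address=("logs.example.com", 514))
collector.setFormatter(logging.Formatter(
    socket.gethostname() + " %(name)s: %(levelname)s %(message)s"
))

log = logging.getLogger("provisioning")
log.setLevel(logging.INFO)
log.addHandler(collector)

log.info("instance spin-up complete, starting application provisioning")
log.error("RAID controller reported degraded array during provisioning")
```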

It’s getting to be a fun time in the datacenter. Lots of change, thankfully much of it for the better!


More Stories By Don MacVittie

Don MacVittie is founder of Ingrained Technology, a technical advocacy and software development consultancy. He has experience in application development, architecture, infrastructure, technical writing, DevOps, and IT management. MacVittie holds a B.S. in Computer Science from Northern Michigan University, and an M.S. in Computer Science from Nova Southeastern University.
