Dear VMWare, please give us nice things to automate the things / by Matt Wrock

I’ve spent this week at VMWorld 2014 in San Francisco and have been exposed to a fair amount of VMWare news and products. One of my goals for the week was to talk to someone on the VMWare team about questions I have regarding their SDKs as well as to provide feedback regarding my own experiences working with their APIs. Yesterday I met with Brian Graf (@vTagion) during a “Meet the Experts” session. Brian has taken over Alan Renouf’s (@alanrenouf) previous role as technical marketing engineer focusing on automation. He was gracious enough to hear out some of my venting on this topic and asked that I follow up with an email so that he could direct these issues to the right folks in his organization. This blog post is intended to fill the role of that email in an internet indexible format.

I’m pretty new to VMWare. Having recently come from Microsoft, Hyper-V has been my primary virtualization tool. One of the attractions to my new job at CenturyLink Cloud was the opportunity to work with the VMWare APIs. I have many colleagues and acquaintances in the automation space who work almost exclusively with VMWare. VMWare has dominated the virtualization market since the beginning and I have been wanting for some time to get a better glimpse into their products and see what the hype was all about.

So for the last several months, I have been working closely with VMWare tools and APIs nearly all day every day. One of my key focus points has been developing the automation pipeline that CenturyLink Cloud uses to fire up new data centers and to bring existing datacenters under more highly automated management. This not only involves automating the build out of every server but automating the automation of those servers. That’s where me and VMWare hang out. We have been leveraging Chef and its new machine resource framework Chef Metal. So I have been doing a fair amount of Ruby development writing and refining a VSphere driver that serves as our bridge between VMWare VMs and Chef. This also includes code that ties into our testing framework and allows us to spin up VMs as new automation is committed to source control and then automatically tested to ensure what was meant to be automated is automated.

Not only am I new to VMWare, I’m also new to Ruby. For years and years I had been largely a C# developer and more recently with powershell over the last 5 years. So I may not be the most qualified voice to speak on behalf of the x-plat automation community, but I am a voice nonetheless and I have had the pleasure of interacting with several “Rubyists” and hearing their thoughts and observations on working with the VMWare APIs.

The VSphere SDK is wrought with friction

From what I can tell, there is absolutely no controversy here. Talk to any developer who has had to work with VMWare APIs around provisioning VMs and they are excited to not merely mention but pontificate upon the unfriendliness of these SDKs. I’m not talking about PowerCLI here -  more on that later but I am speaking of nearly all of the major programming language SDKs that sit on top of the VMWare SOAP service. One nice thing is that they all look nearly identical since they all sit on the same API therefore my criticism can be globally applicable. I have personally worked with the C# and mostly the Ruby based rbvmomi library.

One of the biggest pain points is the verbosity required to wire up a command. For example, to clone a VM there is quite a few configuration classes that  I have to instantiate and string together to feed into the CloneVM method. So CloneVM ends up being a very fat call but if anything goes wrong or I have not provided just the right value in one of the configuration classes, I may or may not get an actionable error message back and if not I may have to engage in quite a bit of trial and error to determine just where things went wrong. I think I understand the technical reasoning here and that this is attempting to keep the network chatter down, but frankly I don’t care. This is a solvable problem and it would be interesting to know if VMWare is looking to solve this.

Open Source solutions

I am not asking that VMWare feed me a better API. Especially in the Ruby community there are many quality developers more than willing to help. In fact a look over at github will reveal that there has been significant community effort and assistance here. 

Just use Fog

One option that many pursue in the Ruby space is using an API called Fog. This is an API that abstracts several of the popular (and not so popular) cloud and hypervisor solutions into a single API. In theory this means that I can code my infrastructure against VSphere but also leverage EC2. The API aggregates many of the underlying components that one expects to find in any of these environments like Machines, Networks, Storage, etc.

Of coarse the reality is that simply moving from one implementation to another never “just works.” Also the more you need to leverage the specific and unique strengths of one implementation, the more likely it is that you eventually need to “go native” and abandon fog. This was my fate and I also found the fog plugin model to be inherently flawed in that when I pull down the Fog ruby GEM, I had to pull down all plugin implementations built in making for a huge payload to download and install.

An apparent OSS cone of silence?

The core Ruby library, rbvmomi, the same library that Fog leverages has fairly recently been transferred to VMWare ownership. I’d be inclined to say that is a good thing. However it seems that VMWare is neither engaging with the developers trying to contribute to this library not are they releasing new bits to rubygems.org. Further, VMWare has been silent amidst requests to when a release can be expected. The last release was in December, 2013.

Unfortunately one pull request merged in January (8 months ago), has still not been released to RubyGems. This particular commit fixes a breaking change related to the popular Nokogiri library and this means that many need to pin their rbvmomi version to 1.5.5 released in 2012. Remember 2012? This not only shows a lack of desire to collaborate with and support their community but has an even more damaging effect of discouraging developers to contribute. Why would you want to contribute  to a dead repository?

I’m not saying that I believe VMWare has no interest in community collaboration and I fully appreciate the herculean effort sometimes needed to get a legal department in the company the size of VMWare to authorize this kind of collaboration but silence does not help and it is sending a bad message (well I guess no message really).

Engagement with the Ruby community is important

This community is likely small in comparison to VMWare’s bread and butter customers, but the world is changing. Ruby has an incredibly large stake in the configuration management space along with many other popular tools in the “devops” ecosystem. Puppet and Chef are both rooted in Ruby and as a Chef integrator, almost any integration between Chef and VSphere is written in Ruby. Another popular tool is Vagrant, any Vagrant plugin to support VSphere integration is going to leverage rbvmomi.

The industry is currently seeing a huge influx of involvement and interest with getting tools like these plugged into their infrastructures. I believe this will continue to become more popular and eventually those who use VMWare not because they “have to” may eventually opt for solutions that are more friendly to interact with.

Scant documentation

All of this is made worse by the fact that the documentation for the VSphere API is unacceptably sparse. VMWare does maintain a site that serves to document all of the objects, methods and properties exposed by their SDK. However these consist of one line descriptions with no in depth details or examples. Yes there is an active VMWare community but the resources they produce often do not suffice and may be difficult to find for more obscure issues.

PowerCLI is cool but it does not help me

The sense that I often get is that VMWare is trying to answer these shortcomings with its PowerCLI API – a powershell based API that I do think is awesome. The PowerCLI has succeeded in making many of the operations that take several lines of ruby or C# into one liners and it comes with great command line help. In stark contrast to the other language SDKs, almost everyone raves over PowerCLI who is able to use it in their automation pipeline. However, this is not an answer and especially so if you run either a linux or a mixed linux/windows shop (ie most people).

At Centurylink Cloud we run both windows and linux. We find that it is easiest to run all of our automation from linux as the control center to our pipeline. Its just not practical to provision a set of windows nodes to act as a gateway into VSphere. Therefore this means Power CLI is only available to me for one off scripts which is exactly what we need to automate away from.

While I’m at it, one small nit on PowerCLI. Its implementation as a powershell snapin as opposed to a module can make it more difficult to plug into existing powershell infrastructure. It’s a powershell v1 technology (we are coming up on v5) and while somewhat small, it is one of those things that can give the impression of an amateur effort. That said, PowerCLI is by no means amateur.

What am I suggesting?

First, let us know that you hear this and that you have a desire to move forward to help my community integrate with your technology. Respond to people commenting on your github repository. If you lack the legal approval to release contributions, designate one or more community members and transfer ownership to them allowing them to coordinate PRs and releases. This library is not contributing to VMWare IP its just a convenience layer sitting on top of your web services so this seems like a reasonable request. Let me also state what I don’t want. I do not want fancy GUIs or anything that waits for me to point and click. I’m working to automate every dark corner of our infrastructure so that we can survive.

Finally, I want to clarify this this post is not intended to insult anyone. My hope is that it serves as another data point to help you understand your customers and that you can consider as you plan future strategy. I’m sure there are many employees at VMWare that share my passion and want to make integration with other automation tools a better story. To those employees, I say fight fight fight and do not become complacent and you are really super cool and awesome for doing that.