Why TDD for PowerShell? Or why Pester? Or why unit test a "scripting" language? by Matt Wrock

I was asked a couple weeks ago by Adam Bertram (@abertram) on Twitter for any info on why one would want to use TDD with Pester. I have written a couple posts on HOW to use Pester, and I'm sure I mentioned TDD, but I really don't recall ever writing any posts on WHY one would use TDD. I think that's a fascinating question. I have not been writing much PowerShell at all these days, but these questions are just as applicable to the infrastructure code I have been writing in Ruby. I have a lot of thoughts on this subject, but I'd like to expand the question to an even broader scope. Why use Pester (or any unit testing framework) at all? Really? Unit tests for a "scripting" language?

We are living in very interesting times. As "infrastructure as code" grows in popularity, we have devs writing more systems code and admins/ops writing more and more sophisticated scripts. In some Windows circles, we see more administrators learning and writing code who have never scripted before. So you have talented devs who don't know layer 1 from layer 2 networking and think CIDR is just something you drink, and experienced admins who consider SharePoint a source control repository and have never considered writing tests for their automation.

I'm part of the "dev" group and have no right to judge here. I believe god placed CIDR calculators on the internet (thanks god!) for calculating IP ranges and Wikipedia as a place to look up the OSI model. However, I'm fairly competent in writing tests and believe the discovery of TDD was a turning point in my becoming a better developer.

So this post is a collection of observations and thoughts on testing "scripts". I'm intentionally surrounding scripts in quotes because I'm finding that one person's script quickly becomes a full blown application. I'll also touch on TDD, which I am passionate about but less dogmatic about than I once was.

Tests? I ran the code and it worked. There's your test!

Don't dump your friction-laden values on my devops rainbow. By the time your tests turn green, I've shipped already and am in the black. I've encountered these sentiments both in infrastructure and in more traditional coding circles. Sometimes it is a conscious inability to see value in adding tests, but many times the developers just are not considering tests or have never written test code. One may argue: why quibble over these implementation details? We are taking a huge, slow, manual process that took an army of operators hours to accomplish and boiling it down to an automated script that does the same in a few minutes.

Once the script works, why would it break? In the case of provisioning infrastructure, many may feel that if the VM comes up and runs its bootstrap installs without errors, extra validations are a luxury.

Until the script works, testing it is a pain

So we all choose our friction. The first time we run through our code changes, we think we'll manually run them, validate them and then move on. Sounds reasonable, until the manual validations prove the code is broken again and again and again. We catch ourselves rerunning cleanup, rerunning the setup, then rerunning our code and then checking the same conditions. This gets old fast and gets even worse when you have to revisit it a few days or weeks later. It's great to have a harness that will set up, run the exact same tests and then clean up - all by invoking a single command.
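
For example, a Pester harness might look something like the sketch below (Install-Widget and its config file are made up for illustration, and this assumes Pester 3-style syntax). A single Invoke-Pester call then does the setup, the assertions and the cleanup every time:

Describe "Install-Widget" {
  BeforeEach {
    # setup: give every test a clean scratch directory
    $configRoot = Join-Path $TestDrive 'etc'
    New-Item -Path $configRoot -ItemType Directory -Force | Out-Null
  }
  AfterEach {
    # cleanup: remove whatever the test left behind
    Remove-Item -Path $configRoot -Recurse -Force -ErrorAction SilentlyContinue
  }
  It "writes the expected config file" {
    Install-Widget -ConfigRoot $configRoot
    Join-Path $configRoot 'widget.conf' | Should Exist
  }
}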

No tests? Welcome to Fear Driven Development

Look, testing is really hard. At least I think so. I usually spend way more time getting tests right and factored than whipping out the actual implementation code. However, whenever I am making changes to a codebase, I am so relieved when there are tests. It's my safety net. If the tests were constructed skillfully, I should be able to rip things apart and know from all the failing tests that things are not deployable. I may need to add, change or remove some tests to account for my work, but overall, as those failing tests go green, it's like breadcrumbs leading me back home to safety.

But maybe there are no tests. Now I'm scared, and I should be, and if you are on my team then you should be too. So I have a Sophie's choice: write tests now, or practice yet another time-honored discipline - prayer driven development - sneaking in just this one change and hoping some manual testing gets me through it.

I'm not going to say that the former is always the right answer. Writing tests for existing code can be incredibly difficult and can turn a 5 minute bug fix into a multi-day yak hair detangling session, even when you focus on just adding tests for the code you are changing. Sometimes it is the right thing to invest this extra time. It really depends on context, but I assure you the more one takes the latter road, the more dangerous the code becomes to change. The last thing you want is a codebase you are afraid to change - unless it all works perfectly and its requirements are immutable.

Your POC will ship faster with no tests

Oh shoot, we shipped the POC. (You are likely saying something other than "shoot").

This may not always be the case, but I am pretty confident that an MVP (minimum viable product) can be completed more quickly without tests. However, v2 will be slower, v3 even slower. v4 and beyond will likely be akin to death marches, and you have probably hired a bunch of black box testers who report bugs well after the developer has mentally moved on to other features. As the cyclomatic complexity of your code grows, it becomes nearly impossible to test all conditions affected by recent changes, let alone remember them.

TIP: A POC should be no more than a POC. Prove the concept and then STOP and do it right! Side note: It's pretty awesome to blog about this and stand so principled...real life is often much more complicated...ugh...real life.

But infrastructure code is different

Ok. So far I don't think anything in this post varies with infrastructure code. As far as I am concerned, these are pretty universal rules of testing. However, infrastructure code IS different. I started the post (and titled it) referring to Pester - a test framework written in and for PowerShell. Chances are (though no guarantees) if you are writing PowerShell you are working on infrastructure. I have been focusing on infrastructure code for the past 3 to 4 years and I have really found it to be different. I remain passionate about testing, but have embraced different patterns, workflows and principles since working in this domain. And I am still learning.

If I mock the infrastructure, what's left?

So when writing more traditional style software projects (whatever the hell that is, but I don't know what else to call it), we often try to mock or stub out external "infrastructure-ish" systems. File systems, databases, network sockets - we have clever ways of faking these out, and that's a good thing. It allows us to focus on the code that actually needs testing.

However, if I am working on a PR for the winrm ruby gem that implements the WinRM protocol, or I am provisioning a VM, or I am leaning heavily on something that uses the Windows registry, then mocking away all of these layers may lead me into the trap of not really testing my logic.

More integration tests

One way in which my testing habits have changed when dealing with infrastructure code is that I am more willing to sacrifice unit tests for integration style tests. This is because there are likely to be big chunks of code that have little conditional logic and instead expend their effort just moving stuff around. If I mock everything out, I may just end up testing that I am calling the correct API endpoints with the expected parameters. This can be useful to some extent, but it can quickly start to smell like the tests just repeat the implementation.
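
As a rough sketch of that smell (Invoke-ProvisionNode and its endpoint are invented for this example, Pester 3-style syntax again): once the REST call is mocked, the only assertion left is that the implementation called what I already know it calls.

Describe "Invoke-ProvisionNode" {
  It "posts to the provisioning endpoint" {
    # fake out the only interesting thing the function does
    Mock Invoke-RestMethod { @{ status = 'ok' } }

    Invoke-ProvisionNode -Name 'web01'

    # this assertion mostly restates the implementation line for line
    Assert-MockCalled Invoke-RestMethod -Times 1 -Exactly -ParameterFilter {
      $Method -eq 'Post' -and $Uri -like '*/nodes/web01'
    }
  }
}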

Typically I like the testing pyramid approach of lots and lots of unit tests under a relatively thin layer of integration tests. I'll fight to keep that structure but find that often the integration layer needs to be a bit thicker in the infrastructure domain. This may mean that coverage slips a bit at the unit level but some unit tests just don't provide as much value and I'm gonna get more bang for my buck in integration tests.

Still - strive for unit tests

Having said that I'm more willing to trade unit tests for integration tests, I would still stress the importance of unit tests. Unit tests can be tricky, but there is more often than not a way to abstract out the code that surrounds your logic in a testable way. It may seem like you are testing some trivial aspect of the code, but if you can capture the logic in unit tests, the tests will run much faster and you can iterate on the problem more quickly. Also, bugs found by unit tests lie far closer to the source of the bug and are thus much easier to troubleshoot.
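
For instance, instead of exercising a whole provisioning run just to check one naming rule, the rule can live in its own small function (the names here are hypothetical) and get fast unit coverage on its own:

# pure logic, no infrastructure touched - trivial and fast to unit test
function Get-NodeName([string]$Role, [int]$Index) {
  "{0}-{1:D3}" -f $Role.ToLower(), $Index
}

Describe "Get-NodeName" {
  It "builds a zero padded, lower cased name" {
    Get-NodeName -Role 'Web' -Index 7 | Should Be 'web-007'
  }
}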

Virtues of thorough unit test coverage in interpreted languages

When working with compiled languages like C#, Go, C++, Java, etc., it is often said that the compiler acts as Unit Test #1. There is a lot to be said for code that compiles. There is also great value in using dynamic languages, but one downside in my opinion is the loss of this initial "unit test". I have run into situations both in PowerShell and Ruby where code was deployed that simply was not correct - a misspelled method name or a reference to an undeclared variable, to name a couple of possibilities. If anything, unit tests that do no more than merely walk all possible code paths can protect code from randomly blowing up.
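
As a hedged illustration (Backup-Config is invented for this example), the typo below is invisible until that line actually executes; a Pester test that merely walks the path goes red immediately:

function Backup-Config([string]$Path) {
  # typo: Copy-Itme instead of Copy-Item - nothing complains until this line runs
  Copy-Itme -Path $Path -Destination "$Path.bak"
}

Describe "Backup-Config" {
  It "merely walks the code path" {
    Set-Content -Path "$TestDrive\app.conf" -Value 'key=value'
    # fails on the misspelled cmdlet long before the script runs in production
    { Backup-Config -Path "$TestDrive\app.conf" } | Should Not Throw
  }
}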

How about TDD?

Regardless of whether I'm writing infrastructure code or not, I tend NOT to do TDD when I am trying to figure out how to do something, like determining which APIs to call and how to call them. How can I test for outcomes when I have no idea what they look like? I might not know what registry tree to scan, or even whether the point of automation is controlled by the registry at all.

Well, with infrastructure code I find myself in more scenarios where I start off having no idea how to do something, and the code is a journey of figuring this out. So I'm probably not writing unit tests until I figure this out. But when I can, I still love TDD. I've done lots of infrastructure TDD. It really does not matter what the domain is; I love the red, green, refactor workflow.
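
A minimal sketch of that loop with a made-up requirement: the spec is written first and fails because ConvertTo-Cidr does not exist yet (red); the function underneath is just enough to make it pass (green); refactoring comes after, with the test as the safety net.

# red: this spec is written before ConvertTo-Cidr exists and fails on the first run
Describe "ConvertTo-Cidr" {
  It "converts a subnet mask to a prefix length" {
    ConvertTo-Cidr -SubnetMask '255.255.255.0' | Should Be 24
  }
}

# green: the simplest implementation that satisfies the spec
function ConvertTo-Cidr([string]$SubnetMask) {
  $bits = ($SubnetMask.Split('.') | ForEach-Object { [Convert]::ToString([int]$_, 2) }) -join ''
  ($bits.ToCharArray() | Where-Object { $_ -eq '1' }).Count
}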

If you can't test first, test ASAP

So maybe writing tests first does not make sense in some cases as you hammer out just how things work. Once you do figure things out, either refactor what you have with tests first or fill in with tests after the initial implementation. Another law I have found to apply equally to all code is that the longer you delay writing the tests, the more difficult (or impossible) it is to write them. Some code is easy to test and some is not. When you are coding with the explicit intention of writing tests, you are motivated to make sure things are testable.

This tends to also have some nice side effects, like breaking down the code into smaller, decoupled components, because it's a pain in the butt to test monoliths.

When do we NOT write tests

I don't think the answer is never. However, more often than not, "throw away code" is not thrown away. Instead it grows and grows. What started as a personal utility script gets committed to source control, distributed with our app and depended on by customers. So I think we just need to be cautious and identify the inflection point as soon as possible when our "one-off" script becomes a core routine of our infrastructure.

How to tell if you are running in a remote process by Matt Wrock

A yak running remotely

I've encountered a few scenarios now where I need to perform different logic based on whether or not I am running locally or remotely. One scenario may be that I have just provisioned a new machine from a base image that has a generic bootstrap password and I need to change the root/admin password. If I do this in an SSH or WinRM/PowerShell session, it will likely kill the process I am running in (of course I could schedule it to happen later). Another scenario may be I'm about to install some Windows updates. I've blogged before about how this does not work over WinRM. So I know I will need to perform that action in a scheduled task.

Depending on your OS and the type of process you are living in (ruby, powershell, ssh, winrm, etc) there are different techniques to detect if you are remote or local and some are more friendly than others.

Detecting SSH

There are a couple ways I have done this. One is language independent. It's probably OS independent too, but I have only done this on Linux. There are a couple of environment variables set in most SSH sessions, so you may be able to simply check whether one of them exists:

if ENV['SSH_CLIENT']
  # your crazy remote code here
end
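
If you happen to be inside a PowerShell process in that SSH session, the same check looks something like this sketch (SSH_CLIENT and SSH_TTY are set by most sshd implementations):

if ($env:SSH_CLIENT -or $env:SSH_TTY) {
  # your crazy remote code here
}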

I've run into problems with this though. Let's say you start a service in an SSH session. Now the process in that service has inherited your environment variables. So even if you log out and the process continues to run outside of the initial SSH context, the above code will still trigger the remote logic.

Is the console associated with a terminal (tty)?

Most languages have a friendly way to determine if the console has a TTY. Check out this wiki that provides snippets for several different programming languages. I've done this in Ruby using:

if STDOUT.isatty
  puts 'I am remote...yo'
end

Detecting a powershell remoting session

Note that WinRM and PowerShell remoting do not associate a TTY with the console the way SSH does, so you need to take a different approach here. If you are lucky enough to be running in a PowerShell remoting session and NOT a vanilla WinRM session (these are similar but different), there is an easy way to tell if you are local or being invoked from a remote PowerShell session:

if($PSSenderInfo -ne $null) { Write-Output 'I am remote' }

$PSSenderInfo contains metadata about one's remote session, like the winrm connection string.

Detecting a non-powershell winrm session

There are many reasons to dislike WinRM and I regret to give you one more. I googled the "heck" out of this when I was trying to figure out how to do it, and I assure you words much harsher than "heck" flowed freely from my dry, parched lips. I found nothing. The solution I came up with was to search the process tree for an instance of winrshost.exe. All remote WinRM sessions run as a child process of winrshost (god bless it).

Here is how you might do this in powershell:

function Test-ChildOfWinrs($ID = $PID) {
  # guard against a circular process tree - bail out after 20 hops
  if(++$script:recursionLevel -gt 20) { return $false }

  # look up the parent process id of the given process
  $parent = (Get-WmiObject -Class Win32_Process -Filter "ProcessID=$ID").ParentProcessID
  if($parent -eq $null) {
    return $false
  }
  else
  {
    try {
      $parentProc = Get-Process -ID $parent -ErrorAction Stop
    }
    catch {
      # the parent process has already exited
      return $false
    }
    # winrshost.exe is an ancestor - we are in a remote winrm session
    if($parentProc.Name -eq "winrshost") {return $true}
    else {
      # keep climbing the process tree
      return Test-ChildOfWinrs $parent
    }
  }
}

This function takes a process id but defaults to the currently running process's id if none is provided. It traverses the ancestors of the process looking for winrshost.exe. One thing I found I had to do to make this "safe" is watch the recursion level so it does not become infinite. Your CPU will thank you. I wrote this a year ago and am trying to remember the exact situation, but I do know that I somehow hit a case where I got caught in a circular traversal of the tree.
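
Calling it is straightforward since it defaults to the current process:

if (Test-ChildOfWinrs) {
  Write-Output 'winrshost.exe is an ancestor - this must be a remote winrm session'
}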

Here is a ruby function I have used in chef to do roughly the same thing:

require 'win32ole'

# climbs the process tree starting at +parent+, looking for a process whose
# attribute +attr+ (:Name, :CommandLine, etc.) contains +match+
def check_process_tree(parent, attr, match)
  # reuse a single WMI connection - every new search root spawns a process
  wmi = ::WIN32OLE.connect("winmgmts://")
  check_process_tree_int(wmi, parent, attr, match)
end

def check_process_tree_int(wmi, parent, attr, match)
  if parent.nil?
    return nil
  end
  query = "Select * from Win32_Process where ProcessID=#{parent}"
  parent_proc = wmi.ExecQuery(query)
  if parent_proc.each.count == 0
    return nil
  end
  proc = parent_proc.each.next
  result = proc.send(attr)
  # found a match - return the matching process
  if !result.nil? && result.downcase.include?(match)
    return proc
  end
  # stop climbing once we reach services.exe
  if proc.Name == 'services.exe'
    return nil
  end
  return check_process_tree_int(wmi, proc.ParentProcessID, attr, match)
end

This will climb the process tree looking for any process you tell it to. So you would call it using:

is_remote = !check_process_tree(Process.ppid, :Name, 'winrshost.exe').nil?

Be careful with wmi queries and recursion. Every instance of the wmi search root creates a process. So I typically make sure to create one and reuse it.

I hope you have found this post helpful.