Pester Mocking, ParameterFilters and Write-Output

I just spend an annoyed 45 minutes or so puzzling over a Pester test I was writing. I needed to test an if/elseif/else branching block of PowerShell code, written by someone else, which did very similar things to install powershell modules, uninstall, reinstall, or not, depending on some evaluated conditions and based on a list of modules and their versions

The final else condition was, in essence, not doing anything other than writing to the output stream to say “I’m not doing anything at this point”, so obviously I figured I could use Assert-MockCalled on Write-Output and I’d know what path the code was going down and I’d be sorted.

Anything but. Continue reading →

CloudFormation’s Update-CfnStack and “No Updates Are To Be Performed”

If you use Cloudformation with Powershell – and I do – you’ve probably run into this error message more than once.

If you’re like me, you’ll have tried to suppress it in several ways. You’ve tried to check your own template for changes, avoiding the call if there are none. And you’ve probably fallen foul of tiny, insignificant differences that CloudFormation doesn’t care about. You’ve probably done the same with your parameters and hit similar problems.

You might even have taken to adding a timestamp to a harmless field of your template so that at least something has updated. At an unnamed previous company, we’d just add a comment to the end of a launchconfiguration’s powershell script, which was enough to force an update and suppress the error. You might even be fine with just ignoring the rain of red text.

In my current project, we can’t just ignore it, as we’re driving our templates out of Octopus Deploy. “No updates are to be performed” throws and aborts our pipeline, which is annoying. But we want to catch other, genuine error messages from Update-CfnStack.

So what to do?

Well, turns out this is an object lesson in correct use of try/catch.

Continue reading →

The Birthday Paradox and testing random functions

I had cause, recently, to write a little function that randomises a name for a resource from a list of roughly 90 words and a 3-digit postfix, in the form ‘happycat-267’*. Now, I don’t want to re-use a name that’s gone before, so the function checks already-deployed resources and if the name is taken, the function recurses to pick a new name from my ~90,000 possiblities.

Of course, when it came to testing this function, I wanted to be reasonably sure that the recursion worked and would continue to work in the future, so I decided to write a simple Pester test that loops a certain number of times and doesn’t return a duplicate value.

But this raised a question. How many iterations should this test go through to be reasonably sure of hitting a duplicate and thus triggering a recursion?

Obviously to fully test, I’d need to run the function 90,000+ times. But that’s computationally expensive and would slow down my testing. I don’t want to do that. But how many is the least number of tests I can do to be reasonably sure of a duplicate appearing?

Which brings me to the Birthday Problem, aka the Birthday Paradox.

Stated simply, the Birthday Problem tells us that in a group of only 23 people, it’s more likely than not to find two people who share a birthday. And if you have 70 people in a room, the probability of at least one birthday match is up at 99.9%.

The Wikipedia article goes into a lot of detail on why this is mathematically true, but the astute among us whill have noticed that this mathematical phenomenon has an application in unit testing my little “random-name-without-collisions” function.

There are Birthday Problem calculators online, such as this one, so I plugged in my possibility space of 90,000 and started playing with iterations. It turns out that if I call my function 1000 times, there’s a >99.5% chance that I’ll produce (and therefore handle) a duplicate.

At 500 iterations, the probability drops to around 75%, and at 300 iterations, it’s around 40% – so clearly 300 iterations is too low, that is it’s more likely that I won’t hit a duplicate.

So, I wrote my own function to calculate the probabilities for me

I can call this in a loop, and calculate a table of probabilities from 1 iteration to 10,000 (and graph that, if I feel like it).

(1..10000) | % {
    $probability = Get-CollisionChance 90000 $_ 
    [pscustomobject]@{Iterations = $_ ; Probability = $probability}
}

I can then use that table to zoom in on a probability I feel comfortable with. Let’s say I’m happy with a 99% probability of collision, it turns out that 911 iterations will get me there. If I’m happy with 90%, 644 iterations will do it.

Above 911 iterations, the curve plateaus out and the returns from adding more iterations become smaller. We hit a point where PowerShell rounds the probability up to 1 at 2519 iterations. It’s not mathematically certain at this point, but we’re up around a 99.9999999999999% chance of collision.

So we can see there’s really not much point iterating above 2500 repetitions. The increased probability just isn’t worth the extra processor cycles.

So anyway, with a little mocking, I can write a useful test that has a 99% chance of hitting at least one duplicate in the function, and test that it doesn’t actualy return any dupes, thus:

Describe "Getting a random name" {
    # Mock this so we don't exhaust API calls by repeatedly calling for a list
    $currentresources = Get-AlreadyDeployedResources
    Mock Get-AlreadyDeployedResources { return $currentresources }

    It "Never returns duplicates (as far as we can tell)" {   
        $reslist = @() 
        (1..911) | % { 
            # get a random resource name
            $reslist += Get-RandomResourceName | tee-object -Variable plusone 
            # add that back to the list we mocked
            $currentresources += [pscustomobject]@{ ResourceName = $plusone; ResourceID = 'fakeidentifier' }
        }
        # count how many names we have
        $all = ($reslist | measure-object | select-object -expand Count)
        #check how many unique names are in that list
        $unique = ($reslist | select-object -unique | measure-object | select-object -expand Count)

        # they should match
        $all | Should Be $unique
    }
}

Anyway, this is what I spent yesterday afternoon researching and playing around with. Hopefully someone other than me will find it useful. If not, well I had fun.

 

* not one of the actual values.

 

Introducing Takofukku

I’ve long held the opinion that Octopus Deploy is a fantastic tool for at-scale ops tasks. My motto for a couple of years now has been “If you have a deployment target, you also have a management target”. Octopus’s amazingly flexible and nicely secured Server/Tentacle model means you’ve got a distributed task runner par excellence right there in your organisation, ready to use.

What it’s tended to lack is an easy, off-the-shelf automated trigger.

Let me explain:

The classic mode of operation for octopus is like this

Source Control --triggers-> Build Server --triggers-> Octopus.

However for most infrastructure-as-code applications, which are in interpreted languages, that build server step isn’t really needed. You could argue that “Well tests should run there”, but let’s not forget, Octopus is a distrbuted task runner already. Why not run tests in Octopus? It understands environments, so it understands branches, so you can easily separate dev/test/prod, or whatever process you use. If you’re smart, you can even get it to spin up arbitrary environments.

So back when I was at Domain Group, I set about building a little webhook server using PoSHServer to take git commits and fire Octopus, so the ops team could drive things like Robot Army in Octopus from a simple commit/push. It mapped known branches to known Octopus environments. No build server in the middle needed. A little custom code, but not much.

That was all very well but it wasn’t very usable. You had to delve into some powershell code every time you wanted to trigger Octopus for a new project. For Bash guys, that’s no good. For C# guys that may be no good. It also wasn’t very generic or well-defined.

So I started building Takofukku in my head.

Takofukku is a simple anglo-japanese portmanteau word. Tako (たこ) is “Octopus”. You may have had Takoyaki at a japanese restaurant. “Fukku” (フック) is hook. So Takofukku is an Octopus Hook – which is what this little app does.

I dithered over this app on personal time while working at Domain, wrote some PowerShell for it, scrapped it, messed around with AWS Lambda for a bit, then moved to Azure functions, started a PowerShell project, scrapped it, tried again with C#, scrapped that and generally made no concrete progress. I struggled with free time and attention span, to be frank. When you’ve been coding all day, coding when you get home is far less attractive than XBox.

Then I joined Octopus Deploy as a Cloud Architect. And still dithered over it.

Then a few weeks ago I got a Saturday entirely to myself with nothing to divert my attention. I’d  been learning F# for a bit, and was inspired by the language, so scrapped all the previous Takofukku work and rewrote the whole thing in F#.

And so, Takofukku became a thing.

Takofukku is, in essence, a webhook endpoint that understands github push events. It defines a simple file format, a takofile, that lives in the root folder of a github repository, and defines how you trigger octopus when you push. It’s not unlike, say, appveyor or travis in that respect.

So when you commit, Takofukku is triggered, it goes and grabs your takofile, figures out how to invoke Octopus, and does it, using an API key you provide privately.

Of course, I wanted this app for Ops tasks, but what it actually enables is a lot more scenarios. using Takofukku, you could use Octopus as a hillbilly CI server, letting Octopus run your builds and tests, your packaging, and your deployment. Very good for a cheapskate shop such as, say, me. Sure, Octopus isn’t designed like a build server, but when you think about it… it sort of kind of is. You could easily roll an Octopus project that spins up, say, a Docker Container after a commit, then uses that container to run tests or builds, then drops a built artifact into the Octopus Nuget feed, which then triggers a deploy. This stuff is pretty much right there out of the box. Conversely, I’ve got Octopus projects that just send me a slack message when a push happens on a known branch.

So, long story short today I made hook.takofukku.io open to the public, and the instructions to use it are at the Github readme.

Please do feel free to play with it. It’s open source, it’s free to use – will ALWAYS be free for public, open repos – and I’m open to feedback.

Do hit me up on twitter – I’m @cloudyopspoet

Cheers all!

 

 

Introducing StatusCakeDSC

I recently published a new open source PowerShell project for creating StatusCake checks and contact groups, called – logically enough – StatusCakeDSC.

If you use StatusCake, and you do Infrastructure as Code, then you may be thankful for a nice declarative way to define and manage your checks and groups. For PowerShell users like myself, that declarative syntax is generally DSC.

I happen to use StatusCake for my personal projects, which I’ve been modernising a little of late, and I figured I should just build and publish a nice little DSC resource to do what I need done. The syntax of it is pretty simple, it’s like pretty much every other DSC resource you’ve ever seen. Look

configuration StatusCakeExample
{
    Import-DscResource -ModuleName StatusCakeDSC
    node(hostname)
    {
        StatusCakeTest MyTest
        {
            Name = "D.Evops.co"
            Ensure = "Present"
            URL = "http://d.evops.co/"
            Paused = $true
            CheckRate = 50
        }

        StatusCakeContactGroup MyGroup
        {
            Ensure = "Present"
            GroupName = "My simple Contact Group"
            Email = @("test@test.com", "test2@test.com")
            PingUrl = "http://d.evops.co/ping"
        }
    }
}

StatusCakeExample -outputpath $env:tmp\StatusCake

Start-DscConfiguration -Path $env:tmp\StatusCake -Verbose -Wait -Force

That example creates a test and a contact group. There are more examples over at the repo. There’s also a simple install script, and a little further down the line there’ll be an Octopus Deploy step template and maybe some other goodies.

It’s not production-perfect, but it certainly does the basics. I’ll probably declare it ready in a week or two, at which point I’ll publish it as a package and maybe even wrap it in Chocolatey or PowerShellGet to make it even easier to install. Do feel free to let me know if it’s useful, and drop in PRs if you spot bugs or gaps.

Turn your PowerShell scripts into a Web API – three options

So, you’ve got a load of PowerShell code that does useful stuff, and because you’re doing Modern Windows Ops now, you want to port that code into some kind of web-based API, so you can scale it out and make your epic PowerShell code accessible from more devices and environments.

You don’t want to rewrite it all in C# as an ASP.NET Web Api – for an ops engineer that seems like terrible overkill. Besides, who’s got time for that nonsense? Modern Ops teams are busy as hell, even though they’ve got automation everywhere. You could get devs to do it, but then you have to manage requirements and specs and Jira tickets oh god not Jira tickets. Please anything other than Jira tickets NOOOOOOOOO!

Ahem *cough* excuse me. Where was I?

Oh yes.

If only there was a way you could take your existing PowerShell code and turn it into an HTTP API quickly and easily.

Well, it turns out there are lots of options now. Check below the fold for three I’ve used recently.

Continue reading →

Blog Update 17 May 2017. Job stuff, Wannacry and MVA

So, what’s been going on? I’ve been a bit lax in blogging here of late, which I hope to fix in the near future. So what’s the news?

Well, new item number one if that I’m about to move on from Domain Group, where I’ve been Windows DevOps Wonk for the last three years, and head to Octopus Deploy, where I’ll be doing some really exciting Cloud Architecture work and generally trying to reach as many people as possible with the good news of Modern Windows Ops.

So that’s the first thing out of the way. What else has been going on?

Oh yeah, there’s that whole WannaCry thing which went by. At Domain we were entirely unaffected. Why?

Well, most of us are running Windows 10 on our laptops, which was immune to the specific vulnerability. That was a major factor. But I don’t manage the client OS fleet. I manage the servers.

Good solid firewall practice was a major factor. SMB would never be open to the internet, and we have periodic security scanning that checks our Cloud environments against an exhaustive set of rules. We absolutely don’t allow SMB shares on our fleet – that was common practice at one time, but rapidly deemed anticloud because it does nothing except enable bad deployment practice.

However, an interesting wrinkle on the subsequent community debate: At Domain, we turn off Windows Update on our Robot Army Windows Server fleet.

“WHAAAAAT?” you say. “WHYYYY?”

There’s a specific reason for these instances. We found quite early on that occasionally Windows Update would fire off on instances, and push CPU usage to 100%, triggering scale-up events. In some cases, we’d end up with alerts and minor outages as a result of this behaviour. It also skewed the metrics we collect on resource usage by causing spikes at weird times, and was known to delay deployments

So we made a considered, reasoned decision to disable Windows Update on the autoscaling fleet. That’s a few hundred boxes right there.

As threat mitigation, we don’t allow RDP access to those boxes, we run Windows Server Core Edition with a minimal set of features enabled, and we closely monitor and correct changes to things like Security Group rules, service state and installed Windows features. All boxes are disposable at a moment’s notice, and we renew our AMI images on a regular basis – sometimes several times a week. Those images are fully patched and tested – with a suite of Pester tests – at creation time. We also maintain a small number of more “traditional” servers, which do have updates enabled, but none of these run customer-facing workloads

Make no mistake, Troy Hunt is absolutely right that no client OS should have updates disabled. But a modern server workload may have a case for it, as long as other measures are taken to protect the OS from threats. Your mileage may vary. Treat all advice with caution. I am not a doctor.

Last, here’s a new bit from Microsoft Virtual Academy (published 15 May 2017) which I think did a decent job of explaining modern DevOps practices to the curious or confused. The video and I certainly differ on some specific points of dogma, but the big picture is good – automate, tighten your feedback loops, throw away the old stuff, treat servers as software objects, move fast, apply laziness early on, build often, deploy often etc. Worth a look even if you’re a grizzled veteran, I’d say.

You might be paying too much for S3

Actually… you almost definitely are. There’s almost a hoarder’s instinct with S3 and related cloud storage. You keep things in there that you might not need any more, because it can be hard to identify what’s needed. However that’s not what I’m talking about

I’m talking about Aborted multipart uploads.

S3 includes options for multipart uploads on large files. This makes uploads more reliable, by breaking them up into smaller chunks, and makes them resumable, by starting again from a failed chunk. A lot of S3 clients have multipart built-in, so you might well be doing a multipart upload without even knowing it.

However when a multipart upload aborts and does not complete, the slot can be held open – there is literally no timeout – and it’s an object in your account, for which you’ll be charged.

Luckily, AWS provide ways to deal with this. You just have to search them out

If you’re using PowerShell, as I am, you can use the Remove-S3MultipartUploads cmdlet. If you’re using, say, node.js, you can use s3-upload-cleaner. There’s a way to clean these up in your chosen SDK, you just need to know about it and act on it.

There’s even a way to do this with an S3 bucket lifecycle policy, as explained by Jeff Barr here.

Now go forth, and stop paying too much for S3. Also, clean out the attic while you’re in there. You don’t need most of those files, do you? Hoarder.

The PowerShell Pipeline, explained

So, my previous post on PowerShell has prompted some responses, internally and externally. Sufficient that I did actually re-word some parts of it, and sufficient that I feel a need to be positive and offer something to take away the burn.

So let’s have a go at explaining the pipeline, shall we?

To do this, I’m going to give an example of doing something without the pipeline. I hope that by the end of this post, the value of showing the other way first will be clear. But I’ll say up front, if you have written code like I’m about to show, don’t fret. It still works. There’s just a better way.

The example I’ve chosen is this:

You’re deploying an IIS Web Application using PowerShell, and as part of your deployment process, you want to delete the previous web site(s) from IIS.

So, let’s dig in. I’m going to be quite thorough, and it’s fine to follow along in the PowerShell prompt. You will, of course, need IIS installed if you do, but don’t worry, at the end there’s an example or two that should work for everyone.

Continue reading →

You don’t know PowerShell

We’ve been doing a lot of interviewing at work of late. You see, we’re looking for good Windows DevOps candidates, and the local market is… well… let’s just say that next-gen Windows guys are a little thin on the ground around here.

This is a problem. Because while next-gen windows candidates are thin on the ground, resumés claiming next-gen Windows skills are emphatically not thin on the ground. In fact, I have actually – literally – lost count of the number of resumés I’ve seen where PowerShell is touted as a key skill, only for the candidate to fail some very simple PowerShell questions in the initial screening. So let’s just run through a few basic principles so that we don’t all waste each other’s time.

Continue reading →