The Birthday Paradox and testing random functions

I had cause, recently, to write a little function that randomises a name for a resource from a list of roughly 90 words and a 3-digit postfix, in the form ‘happycat-267’*. Now, I don’t want to re-use a name that’s gone before, so the function checks already-deployed resources, and if the name is taken, the function recurses to pick a new name from my ~90,000 possibilities.
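
For illustration, here’s a minimal sketch of the shape of that function – $wordlist is a stand-in for my real ~90-word array, and Get-AlreadyDeployedResources is the helper that appears in the test further down:

function Get-RandomResourceName
{
    # 90 words x 1,000 three-digit suffixes = ~90,000 possibilities
    $name = '{0}-{1:d3}' -f (Get-Random -InputObject $wordlist), (Get-Random -Maximum 1000)

    # if the name is already deployed, recurse and pick again
    if (Get-AlreadyDeployedResources | Where-Object ResourceName -eq $name)
    {
        return Get-RandomResourceName
    }
    return $name
}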

Of course, when it came to testing this function, I wanted to be reasonably sure that the recursion worked and would continue to work in the future, so I decided to write a simple Pester test that calls the function a number of times and asserts that it never returns a duplicate value.

But this raised a question. How many iterations should this test go through to be reasonably sure of hitting a duplicate and thus triggering a recursion?

Obviously, to fully test, I’d need to run the function 90,000+ times. But that’s computationally expensive and would slow down my testing. I don’t want to do that. So what’s the smallest number of iterations I can run and still be reasonably sure of a duplicate appearing?

Which brings me to the Birthday Problem, aka the Birthday Paradox.

Stated simply, the Birthday Problem tells us that in a group of only 23 people, it’s more likely than not to find two people who share a birthday. And if you have 70 people in a room, the probability of at least one birthday match is up at 99.9%.

The Wikipedia article goes into a lot of detail on why this is mathematically true, but the astute among us will have noticed that this mathematical phenomenon has an application in unit testing my little “random-name-without-collisions” function.

There are Birthday Problem calculators online, such as this one, so I plugged in my possibility space of 90,000 and started playing with iterations. It turns out that if I call my function 1000 times, there’s a >99.5% chance that I’ll produce (and therefore handle) a duplicate.

At 500 iterations, the probability drops to around 75%, and at 300 iterations, it’s around 40% – so 300 iterations is clearly too low: it’s more likely than not that I won’t hit a duplicate.

So, I wrote my own function to calculate the probabilities for me.
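
A minimal sketch of it, using the standard Taylor-series approximation of the birthday problem – P(collision) ≈ 1 − e^(−n(n−1) / 2N) for n iterations over a space of N possibilities; the parameter names here are illustrative:

function Get-CollisionChance
{
    param(
        [long]$PossibilitySpace,
        [long]$Iterations
    )
    # birthday-problem approximation: P = 1 - e^(-n(n-1) / 2N)
    $exponent = -($Iterations * ($Iterations - 1)) / (2 * $PossibilitySpace)
    return 1 - [math]::Exp($exponent)
}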

I can call this in a loop, and calculate a table of probabilities from 1 iteration to 10,000 (and graph that, if I feel like it).

# build a table of collision probabilities for 1 to 10,000 iterations
$probabilityTable = (1..10000) | % {
    $probability = Get-CollisionChance 90000 $_
    [pscustomobject]@{Iterations = $_ ; Probability = $probability}
}

I can then use that table to zoom in on a probability I feel comfortable with. Let’s say I’m happy with a 99% probability of collision: it turns out that 911 iterations will get me there. If I’m happy with 90%, 644 iterations will do it.
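
That “zooming in” is a one-liner against the table built above:

# first iteration count with a >= 99% chance of collision
$probabilityTable | Where-Object { $_.Probability -ge 0.99 } | Select-Object -First 1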

Above 911 iterations, the curve plateaus and the returns from adding more iterations diminish. At 2519 iterations, we hit the point where PowerShell rounds the displayed probability up to 1. It’s not mathematically certain at that point, but we’re up around a 99.9999999999999% chance of collision.

So we can see there’s really not much point iterating above 2500 repetitions. The increased probability just isn’t worth the extra processor cycles.

So anyway, with a little mocking, I can write a useful test that has a 99% chance of hitting at least one duplicate in the function, and test that it doesn’t actually return any dupes, thus:

Describe "Getting a random name" {
    # Mock this so we don't exhaust API calls by repeatedly calling for a list
    $currentresources = Get-AlreadyDeployedResources
    Mock Get-AlreadyDeployedResources { return $currentresources }

    It "Never returns duplicates (as far as we can tell)" {   
        $reslist = @() 
        (1..911) | % { 
            # get a random resource name
            $reslist += Get-RandomResourceName | tee-object -Variable plusone 
            # add that back to the list we mocked
            $currentresources += [pscustomobject]@{ ResourceName = $plusone; ResourceID = 'fakeidentifier' }
        }
        # count how many names we have
        $all = ($reslist | measure-object | select-object -expand Count)
        #check how many unique names are in that list
        $unique = ($reslist | select-object -unique | measure-object | select-object -expand Count)

        # they should match
        $all | Should Be $unique
    }
}

Anyway, this is what I spent yesterday afternoon researching and playing around with. Hopefully someone other than me will find it useful. If not, well, I had fun.

* not one of the actual values.

Introducing Takofukku

I’ve long held the opinion that Octopus Deploy is a fantastic tool for at-scale ops tasks. My motto for a couple of years now has been “If you have a deployment target, you also have a management target”. Octopus’s amazingly flexible and nicely secured Server/Tentacle model means you’ve got a distributed task runner par excellence right there in your organisation, ready to use.

What it’s tended to lack is an easy, off-the-shelf automated trigger.

Let me explain:

The classic mode of operation for Octopus is like this:

Source Control --triggers-> Build Server --triggers-> Octopus.

However, for most infrastructure-as-code applications, which are written in interpreted languages, that build server step isn’t really needed. You could argue that “well, tests should run there”, but let’s not forget: Octopus is a distributed task runner already. Why not run tests in Octopus? It understands environments, so it understands branches, so you can easily separate dev/test/prod, or whatever process you use. If you’re smart, you can even get it to spin up arbitrary environments.

So back when I was at Domain Group, I set about building a little webhook server using PoSHServer to take git commits and fire Octopus, so the ops team could drive things like Robot Army in Octopus from a simple commit/push. It mapped known branches to known Octopus environments. No build server in the middle needed. A little custom code, but not much.

That was all very well, but it wasn’t very usable. You had to delve into some PowerShell code every time you wanted to trigger Octopus for a new project. For Bash folks, that’s no good. For C# folks, it may be no good either. It also wasn’t very generic or well-defined.

So I started building Takofukku in my head.

Takofukku is a simple Anglo-Japanese portmanteau word. Tako (たこ) is “octopus” – you may have had takoyaki at a Japanese restaurant. “Fukku” (フック) is “hook”. So Takofukku is an Octopus Hook – which is what this little app does.

I dithered over this app on personal time while working at Domain: wrote some PowerShell for it, scrapped it, messed around with AWS Lambda for a bit, then moved to Azure Functions, started a PowerShell project, scrapped it, tried again with C#, scrapped that, and generally made no concrete progress. I struggled with free time and attention span, to be frank. When you’ve been coding all day, coding when you get home is far less attractive than Xbox.

Then I joined Octopus Deploy as a Cloud Architect. And still dithered over it.

Then a few weeks ago I got a Saturday entirely to myself, with nothing to divert my attention. I’d been learning F# for a bit and was inspired by the language, so I scrapped all the previous Takofukku work and rewrote the whole thing in F#.

And so, Takofukku became a thing.

Takofukku is, in essence, a webhook endpoint that understands GitHub push events. It defines a simple file format, a takofile, which lives in the root folder of a GitHub repository and defines how to trigger Octopus when you push. It’s not unlike, say, AppVeyor or Travis in that respect.

So when you push, Takofukku is triggered; it goes and grabs your takofile, figures out how to invoke Octopus, and does it, using an API key you provide privately.

Of course, I wanted this app for ops tasks, but what it actually enables is a lot more scenarios. Using Takofukku, you could use Octopus as a hillbilly CI server, letting Octopus run your builds and tests, your packaging, and your deployment. Very good for a cheapskate shop such as, say, me. Sure, Octopus isn’t designed as a build server, but when you think about it… it sort of kind of is. You could easily roll an Octopus project that spins up, say, a Docker container after a commit, then uses that container to run tests or builds, then drops a built artifact into the Octopus NuGet feed, which then triggers a deploy. This stuff is pretty much right there out of the box. Conversely, I’ve got Octopus projects that just send me a Slack message when a push happens on a known branch.

So, long story short: today I made hook.takofukku.io open to the public, and the instructions for using it are in the GitHub readme.

Please do feel free to play with it. It’s open source, it’s free to use – will ALWAYS be free for public, open repos – and I’m open to feedback.

Do hit me up on Twitter – I’m @cloudyopspoet

Cheers all!


Turn your PowerShell scripts into a Web API – three options

So, you’ve got a load of PowerShell code that does useful stuff, and because you’re doing Modern Windows Ops now, you want to port that code into some kind of web-based API, so you can scale it out and make your epic PowerShell code accessible from more devices and environments.

You don’t want to rewrite it all in C# as an ASP.NET Web API – for an ops engineer that seems like terrible overkill. Besides, who’s got time for that nonsense? Modern ops teams are busy as hell, even though they’ve got automation everywhere. You could get devs to do it, but then you have to manage requirements and specs and Jira tickets oh god not Jira tickets. Please, anything other than Jira tickets NOOOOOOOOO!

Ahem *cough* excuse me. Where was I?

Oh yes.

If only there was a way you could take your existing PowerShell code and turn it into an HTTP API quickly and easily.

Well, it turns out there are lots of options now. Check below the fold for three I’ve used recently.

Continue reading →

Blog Update 17 May 2017: Job stuff, WannaCry and MVA

So, what’s been going on? I’ve been a bit lax in blogging here of late, which I hope to fix in the near future. So what’s the news?

Well, new item number one is that I’m about to move on from Domain Group, where I’ve been Windows DevOps Wonk for the last three years, and head to Octopus Deploy, where I’ll be doing some really exciting Cloud Architecture work and generally trying to reach as many people as possible with the good news of Modern Windows Ops.

So that’s the first thing out of the way. What else has been going on?

Oh yeah, there’s that whole WannaCry thing which went by. At Domain we were entirely unaffected. Why?

Well, most of us are running Windows 10 on our laptops, which was immune to the specific vulnerability. That was a major factor. But I don’t manage the client OS fleet. I manage the servers.

Good, solid firewall practice was a major factor. SMB would never be open to the internet, and we have periodic security scanning that checks our Cloud environments against an exhaustive set of rules. We absolutely don’t allow SMB shares on our fleet – that was common practice at one time, but it was rapidly deemed anti-cloud, because it does nothing except enable bad deployment practice.

However, an interesting wrinkle on the subsequent community debate: At Domain, we turn off Windows Update on our Robot Army Windows Server fleet.

“WHAAAAAT?” you say. “WHYYYY?”

There’s a specific reason for this on these instances. We found quite early on that occasionally Windows Update would fire off on instances and push CPU usage to 100%, triggering scale-up events. In some cases, we’d end up with alerts and minor outages as a result of this behaviour. It also skewed the metrics we collect on resource usage, by causing spikes at weird times, and was known to delay deployments.

So we made a considered, reasoned decision to disable Windows Update on the autoscaling fleet. That’s a few hundred boxes right there.
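
The mechanics of that are trivial – something like this at image-build time, as a sketch rather than our exact configuration:

# disable and stop the Windows Update service (wuauserv)
Set-Service -Name wuauserv -StartupType Disabled
Stop-Service -Name wuauserv -Force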

As threat mitigation, we don’t allow RDP access to those boxes, we run Windows Server Core Edition with a minimal set of features enabled, and we closely monitor and correct changes to things like Security Group rules, service state and installed Windows features. All boxes are disposable at a moment’s notice, and we renew our AMI images on a regular basis – sometimes several times a week. Those images are fully patched and tested – with a suite of Pester tests – at creation time. We also maintain a small number of more “traditional” servers, which do have updates enabled, but none of these run customer-facing workloads.

Make no mistake, Troy Hunt is absolutely right that no client OS should have updates disabled. But a modern server workload may have a case for it, as long as other measures are taken to protect the OS from threats. Your mileage may vary. Treat all advice with caution. I am not a doctor.

Last, here’s a new bit from Microsoft Virtual Academy (published 15 May 2017) which I think did a decent job of explaining modern DevOps practices to the curious or confused. The video and I certainly differ on some specific points of dogma, but the big picture is good – automate, tighten your feedback loops, throw away the old stuff, treat servers as software objects, move fast, apply laziness early on, build often, deploy often etc. Worth a look even if you’re a grizzled veteran, I’d say.

You might be paying too much for S3

Actually… you almost definitely are. There’s almost a hoarder’s instinct with S3 and related cloud storage. You keep things in there that you might not need any more, because it can be hard to identify what’s needed. However, that’s not what I’m talking about.

I’m talking about aborted multipart uploads.

S3 includes options for multipart uploads on large files. This makes uploads more reliable, by breaking them into smaller chunks, and resumable, by restarting from a failed chunk. A lot of S3 clients have multipart support built in, so you might well be doing multipart uploads without even knowing it.

However, when a multipart upload aborts and doesn’t complete, the upload can be held open – there is literally no timeout – and its parts remain objects in your account, for which you’ll be charged.

Luckily, AWS provide ways to deal with this. You just have to search them out.

If you’re using PowerShell, as I am, you can use the Remove-S3MultipartUploads cmdlet. If you’re using, say, Node.js, you can use s3-upload-cleaner. There’s a way to clean these up in your chosen SDK; you just need to know about it and act on it.
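
For example, a sketch with the AWS PowerShell module – the bucket name is a placeholder, and it’s worth checking the parameter names against your AWSPowerShell version:

Import-Module AWSPowerShell
# abort any multipart uploads in the bucket initiated more than 7 days ago
Remove-S3MultipartUploads -BucketName 'my-bucket' -InitiatedDate (Get-Date).AddDays(-7)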

There’s even a way to do this with an S3 bucket lifecycle policy, as explained by Jeff Barr here.

Now go forth, and stop paying too much for S3. Also, clean out the attic while you’re in there. You don’t need most of those files, do you? Hoarder.

The PowerShell Pipeline, explained

So, my previous post on PowerShell has prompted some responses, internally and externally – enough that I did actually re-word some parts of it, and enough that I feel the need to be positive and offer something to take away the burn.

So let’s have a go at explaining the pipeline, shall we?

To do this, I’m going to give an example of doing something without the pipeline. I hope that by the end of this post, the value of showing the other way first will be clear. But I’ll say up front, if you have written code like I’m about to show, don’t fret. It still works. There’s just a better way.

The example I’ve chosen is this:

You’re deploying an IIS Web Application using PowerShell, and as part of your deployment process, you want to delete the previous web site(s) from IIS.

So, let’s dig in. I’m going to be quite thorough, and it’s fine to follow along in the PowerShell prompt. You will, of course, need IIS installed if you do, but don’t worry, at the end there’s an example or two that should work for everyone.

Continue reading →

Extending Pester for fun and profit

Of late, I’ve been working on a little side project to test a rather complex Akamai Property. We wanted to be confident, after making changes, that the important bits were still working as we expected them to, and for some reason there was no easy, automated solution to test this.

Obviously I decided I’d write my testing project in Pester, and so it was that I began writing a whole bunch of tests to see what URLs returned what status code, which ones redirected, which ones were cache hits and cache misses and what headers were coming back.

First up, I wrote a generic function called “Invoke-AkamaiRequest”. This function would know whether we were testing against staging or production, would catch and correct PowerShell’s error behaviour – which I found undesirable – and would allow us to send optional Akamai pragma headers (I’ll share this function in a later post).

With that up and running, I could start writing my tests. Here’s a set of simple examples:

Describe "An example test, to establish things" {
    Context "Hit up the homepage." {
        It "Should return 200" {
            (Invoke-AkamaiRequest -uristem /).StatusCode | Should Be 200
        }
    }

    Context "Hit up a non-existent page" {
        It "Should return 404" {
            (Invoke-AkamaiRequest -uristem /nonexistent.html).StatusCode | Should be 404
        }
    }

    Context "A redirect works" {
        It "Should gimme 301" {
            (Invoke-AkamaiRequest -uristem /redirectedfolder/nonexist).StatusCode | Should Be 301
        }
    }
}

Now, that last one, testing a 301, is interesting. Not only do you need to test that a 301 or 302 status code is coming back, you also need to test where the redirect is sending you. So I started to write tests like this:

It "Should redirect /blog/ to /advice/" {
    $blog = Invoke-AkamaiRequest -path /blog/
    ($blog | select -expand statuscode | Should Be 301) -and 
    ($blog.headers.location | Should Be http://$tld/advice/)
}

And this worked fine. But it was a bit clunky. If only Pester had a RedirectTo assertion I could just throw in there, like so:

It "Should redirect /blog/ to /advice/ " {
    Invoke-AkamaiRequest -path /blog/ | Should RedirectTo http://$tld/advice/ 
}

If. Only.

Oh, but it can!

Yes, you can write custom assertions for Pester. They’re pretty easy to do, too. What you need is a trio of functions describing the logic of the test and what to return if it fails in some way. They are named PesterAssertion, PesterAssertionFailureMessage and NotPesterAssertionFailureMessage, where Assertion is the assertion name – in my case, “RedirectTo”.

For my particular case, the logic was to take in an HTTP response object, check that the status was 301 (or 302), and match the Location: header to a specified value. Pretty simple, really. Here’s the basic code:

function PesterRedirectTo($value, $expected)
{
    # $value is the HTTP response object; pass if it's a 301/302
    # and the Location header matches the expected URL
    return [bool](($value.statuscode -eq 301 -or $value.statuscode -eq 302) -and 
                    $value.headers.location -eq $expected)
}

function PesterRedirectToFailureMessage($value,$expected)
{
    return "Expected to redirect to {$expected}"
}

function NotPesterRedirectToFailureMessage($value,$expected)
{
    return "Expected not to redirect to {$expected}"
}

I put these into my supporting module (not into the Pester module) and ran my tests. Happy happy days, it worked perfectly. Throwing different URLs at it resulted in exactly the behaviour I wanted.

All that remained was to make the failure messages a little smarter and make the Not assertion more useful, but I figured I should write this little post with the nice clean code before the extra logic goes in and makes everything unreadable.

You can probably think of several ways you could streamline your tests with assertions right now. I’ve also written RedirectPermanently and ReturnStatus assertions, and I’m looking into HaveHeaders and BeCompressed. I may even release these as an add-on module at some point soon.
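
To give you the flavour, here’s roughly what the ReturnStatus trio might look like – a minimal sketch following the same pattern, not the production version:

function PesterReturnStatus($value, $expected)
{
    # pass if the response's status code matches the expected value
    return [bool]($value.statuscode -eq $expected)
}

function PesterReturnStatusFailureMessage($value, $expected)
{
    return "Expected status {$expected} but got {$($value.statuscode)}"
}

function NotPesterReturnStatusFailureMessage($value, $expected)
{
    return "Expected status not to be {$expected}"
}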

You can probably think of things that should go right back into the Pester codebase, too. And there are a number of other ways you can customise and extend Pester to fit your own use cases.

To summarise: Pester is not just a flexible and powerful BDD framework for PowerShell. It’s also easily extensible, adding even more power to your PowerShell toolbox.

Now get out there and test stuff.

Chickens not Cattle and definitely not Pets. Or maybe Bees

A few weeks ago I was ruminating on Twitter about the “Cattle not Pets” metaphor for Cloud instances.

I started a vague blog draft on the topic, got a little sidetracked, and never fully completed the thought. But it’s something that’s itched at me for some time now. Cattle do in fact get fairly individualised treatment, except on the largest of scales, so they’re not a great metaphor. But I didn’t really have a perfect replacement. So when I wandered into the office this morning and checked Twitter, I was gratified to see that Jeffrey Snover had tweeted out an article ruminating on basically the same topic, which did a lot of the agricultural thinking for me.

Continue reading →

Stop Thinking About Servers

Every so often I get a request, from one or more of our developers, for Remote Desktop access to the servers running their code – be it for troubleshooting, configuration or some other arcane purpose.

My answer is almost uniformly “no”.

WAT

“But surely,” says the cat, “you’re a super-futuristic DevOps shop, spoken of in breathless terms by national IT publications? Don’t you trust your developers??”

Of course we do. But…

Continue reading →

Blog Update 12/11/15

Sorry I haven’t been posting a lot lately. I’ve been moving house – well, moving two houses – and things have been rather hectic. Hopefully I’ll be properly set up soon and can get on to regular content creation, including some screencast material.

Upcoming talks from Me:

Sydney DevOps Meetup, Nov 19th 2015 – What DevOps Means To Domain. Well, it’s what DevOps means at Domain as well as what DevOps means to Domain. I’ll run through how we Define the DevOps Ethos and some of the results we’ve produced.

This is a short-form talk and will be kind-of ad-hoc, with an Ask-Me-Anything at the end

PowerShell Australia Meetup, 26th Nov 2015 – Unit Testing PowerShell with Pester. A rapid introduction to using Pester to automagically test your PowerShell code, and why you should be doing this, NOW.

This one will be accompanied by Ben Hodge talking about DSC, Kirk Brady telling us why we should be using git and how to do that, and then me blathering about Pester for probably far too long once everyone is tired. Beer and Pizza are, I believe, sponsored.