Tag Archives: pester

The Birthday Paradox and testing random functions

I had cause, recently, to write a little function that randomises a name for a resource from a list of roughly 90 words and a 3-digit postfix, in the form ‘happycat-267′*. Now, I don’t want to re-use a name that’s gone before, so the function checks already-deployed resources and if the name is taken, the function recurses to pick a new name from my ~90,000 possiblities.

Of course, when it came to testing this function, I wanted to be reasonably sure that the recursion worked and would continue to work in the future, so I decided to write a simple Pester test that loops a certain number of times and doesn’t return a duplicate value.

But this raised a question. How many iterations should this test go through to be reasonably sure of hitting a duplicate and thus triggering a recursion?

Obviously to fully test, I’d need to run the function 90,000+ times. But that’s computationally expensive and would slow down my testing. I don’t want to do that. But how many is the least number of tests I can do to be reasonably sure of a duplicate appearing?

Which brings me to the Birthday Problem, aka the Birthday Paradox.

Stated simply, the Birthday Problem tells us that in a group of only 23 people, it’s more likely than not to find two people who share a birthday. And if you have 70 people in a room, the probability of at least one birthday match is up at 99.9%.

The Wikipedia article goes into a lot of detail on why this is mathematically true, but the astute among us whill have noticed that this mathematical phenomenon has an application in unit testing my little “random-name-without-collisions” function.

There are Birthday Problem calculators online, such as this one, so I plugged in my possibility space of 90,000 and started playing with iterations. It turns out that if I call my function 1000 times, there’s a >99.5% chance that I’ll produce (and therefore handle) a duplicate.

At 500 iterations, the probability drops to around 75%, and at 300 iterations, it’s around 40% – so clearly 300 iterations is too low, that is it’s more likely that I won’t hit a duplicate.

So, I wrote my own function to calculate the probabilities for me

I can call this in a loop, and calculate a table of probabilities from 1 iteration to 10,000 (and graph that, if I feel like it).

I can then use that table to zoom in on a probability I feel comfortable with. Let’s say I’m happy with a 99% probability of collision, it turns out that 911 iterations will get me there. If I’m happy with 90%, 644 iterations will do it.

Above 911 iterations, the curve plateaus out and the returns from adding more iterations become smaller. We hit a point where PowerShell rounds the probability up to 1 at 2519 iterations. It’s not mathematically certain at this point, but we’re up around a 99.9999999999999% chance of collision.

So we can see there’s really not much point iterating above 2500 repetitions. The increased probability just isn’t worth the extra processor cycles.

So anyway, with a little mocking, I can write a useful test that has a 99% chance of hitting at least one duplicate in the function, and test that it doesn’t actualy return any dupes, thus:

Anyway, this is what I spent yesterday afternoon researching and playing around with. Hopefully someone other than me will find it useful. If not, well I had fun.

 

* not one of the actual values.

 

Extending Pester for fun and profit

Of late, I’ve been working on a little side project to test a rather complex Akamai Property. We wanted to be confident, after making changes, that the important bits were still working as we expected them to, and for some reason there was no easy, automated solution to test this.

Obviously I decided I’d write my testing project in Pester, and so it was that I began writing a whole bunch of tests to see what URLs returned what status code, which ones redirected, which ones were cache hits and cache misses and what headers were coming back.

First up, I wrote a generic function called “Invoke-AkamaiRequest”. This function would know whether we were testing against Staging or production, and would catch and correct PowerShell’s error behaviour – which I found undesirable – and allow us to send optional Akamai pragma headers (I’ll share this function in a later post).

With that up and running, I could start writing my tests. Here’s a set of simple examples

Now, that last one, testing a 301, is interesting. Not only do you need to test that a 301 or 302 status code is coming back, you also need to test where the redirect is sending you. So I started to write tests like this

And this worked fine. But it was a bit clunky. If only Pester had a RedirectTo assertion I could just throw in there, like so

If. Only.

Oh, but it can!

Yes, you can write custom assertions for Pester. They’re pretty easy to do, too. What you need is a trio of functions describing the logic of the test, and what to return if it fails in some way. They are named PesterAssertion, PesterAssertionFailureMessage and NotPesterAssertionFailureMessage, where Assertion is the assertion name, in my case “RedirectTo”

For my particular case, the logic was to take in an HTTP response object, and check that the status was 301 (or 302), and match the Location: header to a specified value. Pretty simple really. Here’s the basic code:

I put these into my supporting module (not into the Pester module) and ran my tests. Happy happy days, it worked perfectly. Throwing different URLs at it resulted in exactly the behaviour I wanted.

All that remained was to make the failure messages a little smarter and make the Not assertion more useful, but I figured before I did that I should write this little post with the nice clean code before the extra logic goes in and makes everything unreadable.

You can probably think of several ways you could streamline your tests with assertions right now. I’ve also written RedirectPermanently and ReturnStatus assertions, and I’m looking into HaveHeaders and BeCompressed. I may even release these as an add-on module at some point soon.

You can probably think of things that should go right back into the Pester codebase, too. And there are a number of other ways you can customise and extend pester to fit your own use cases.

To summarise: Pester is not just a flexible and powerful BDD framework for PowerShell. It’s also easily extensible, adding even more power to your PowerShell toolbox.

Now get out there and test stuff.

Blog Update 12/11/15

Sorry I haven’t been posting a lot lately. I’ve been moving house – well, moving two houses – and things have been rather hectic. Hopefully I’ll be properly set up soon and can get on to regular content creation, including some screencast material.

Upcoming talks from Me:

Sydney DevOps Meetup Nov 19th 2015What DevOps Means To Domain. Well, it’s what DevOps means at Domain as well as what DevOps means to Domain. I’ll run through how we Define the DevOps Ethos and some of the results we’ve produced.

This is a short-form talk and will be kind-of ad-hoc, with an Ask-Me-Anything at the end

PowerShell Australia Meetup 26th Nov 2015Unit Testing PowerShell with Pester. A rapid introduction to using Pester to automagically test your PowerShell code, and why you should be doing this, NOW.

This one will be accompanied by Ben Hodge talking about DSC, Kirk Brady telling us why we should be using git and how to do that, and then me blathering about Pester for probably far too long once everyone is tired. Beer and Pizza are, I believe, sponsored.

 

Unit Testing Functions that return random values with pester

Pester testing – and unit testing in general – is interesting. Take, for example, this scenario

Yep. Unit testing Functions which are designed to return a random value is most tricky. Take, for example, a Function I knocked up a little while ago that’s meant to return a random date and time during working hours in the following week.

Now, I am not 100% sure as I write this blog whether or not I’ve screwed up this function completely. Luckily, I’m using Pester, so I can test it. But because it returns a random value, this makes things a bit… tricky. You may be getting a regressed-to-the-mean middle result while your test runs, but out in the wild you may be returned an outlier and suddenly your function is causing all manner of screw-ups.

Continue reading →