The AWS Pricing ‘API’: A kafkaesque nightmare in JSON format

Today, I have been working with the AWS Pricing API.

Well, I say “working”. What I’ve actually been doing is mashing my face into the keyboard in between sobs of unrestrained depression and screams of literal horror.

You see, the AWS Pricing API is not very good. It’s not even, strictly speaking, an API. Allow me to introduce you to it.

The API – well, the collection of static JSON documents – can be found at the following URL

That document is, basically, a list of links to other JSON documents that describe the pricing structures of various AWS services. Now, if you crack that file open in, say, Notepad, you might spot a problem, but the problem is much clearer when we load it into PowerShell.

$indexJSON = irm 
$indexJSON.offers | measure | select Count


Errrrr… what? Only one offer? That’s a bit weird. I was expecting a bunch of offers. If I look in the file, there are thirteen of them OH WAIT… what if I inspect the properties of this object?

$indexJSON.offers | gm | select -property Name, MemberType

Name                MemberType
----                ----------
Equals                  Method
GetHashCode             Method
GetType                 Method
ToString                Method
AmazonCloudFront  NoteProperty
AmazonDynamoDB    NoteProperty
AmazonEC2         NoteProperty
AmazonElastiCache NoteProperty
AmazonGlacier     NoteProperty
AmazonRDS         NoteProperty
AmazonRedshift    NoteProperty
AmazonRoute53     NoteProperty
AmazonS3          NoteProperty
AmazonSES         NoteProperty
AmazonSimpleDB    NoteProperty
AmazonVPC         NoteProperty
awskms            NoteProperty

Oh no. No no no no no. No. Really? No. No way. No. I mean… fuck. No.

Each of the offers is exposed as a property. There’s no array there I can simply iterate over. This is not good. How am I going to do this?

Don’t panic. There’s a way. There’s totally a way. What if I… loop through the collection of properties? Yeah, that’ll work.

$indexJSON.offers | gm | ? {$_.MemberType -eq "NoteProperty"} |
    select -property Name, Definition | % {
        # get each Service JSON doc
        $name = $_.Name
        Write-Host "Found property $name"
    }

Found property AmazonCloudFront
Found property AmazonDynamoDB
Found property AmazonEC2
Found property AmazonElastiCache
Found property AmazonGlacier
Found property AmazonRDS
Found property AmazonRedshift
Found property AmazonRoute53
Found property AmazonS3
Found property AmazonSES
Found property AmazonSimpleDB
Found property AmazonVPC
Found property awskms

OK. That works. It’s ugly as hell, but it works. Now for each of those properties, I need to extract the endpoint.

$indexJSON.offers | gm | ? {$_.MemberType -eq "NoteProperty"} |
    select -property Name, Definition | % {
        # get each Service JSON doc
        $name = $_.Name
        Write-Host "Found property $name"
        # look the property up by name to reach that service's version index URL
        $endpoint = $indexJSON.offers.$name.versionIndexUrl
        Write-Host "Found endpoint $endpoint"
        iwr "$endpoint" -outfile ".\json\$name.json"
    }

It was at this point that I believe I started to dribble a bit out of one side of my mouth. That embedded variable right in the middle of the $endpoint = line broke something deep inside my brain. But I’d barely even started.
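For what it's worth, the same walk can be done with a little less face-mashing. This is only a sketch, assuming the index has the shape seen above (an offers object whose properties each carry a versionIndexUrl); the JSON is a made-up stand-in with hypothetical URLs, not the real index.

```powershell
# Stand-in for the real index document; the URLs are hypothetical.
$sample = @'
{
  "offers": {
    "AmazonEC2": { "versionIndexUrl": "https://example.invalid/AmazonEC2/index.json" },
    "AmazonS3":  { "versionIndexUrl": "https://example.invalid/AmazonS3/index.json" }
  }
}
'@
$index = $sample | ConvertFrom-Json

# PSObject.Properties exposes each NoteProperty as a name/value pair,
# so there is no need to go via Get-Member and then look the name up again.
$offers = foreach ($prop in $index.offers.PSObject.Properties) {
    [pscustomobject]@{
        Name     = $prop.Name
        Endpoint = $prop.Value.versionIndexUrl
    }
}

$offers | Format-Table -AutoSize
```

It is still iterating over properties rather than an array, so the underlying complaint stands, but at least the value comes along with the name.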

After getting the endpoints for each of these offers, I had to repeat the same procedure to extract the latest offer endpoint for each service, because there is more than one version of each offer archived in this ‘API’. Here’s the entire function I ended up with:

Function Load-Pricing
{
    # grab the index and cache it on disk
    iwr -outfile .\json\pricing.json 

    # load it
    $pricingindex = gc .\json\pricing.json | convertfrom-JSON

    $pricingindex.offers | gm | ? {$_.MemberType -eq "NoteProperty"} | select -property Name, Definition | % {
        # get each Service JSON doc
        $name = $_.Name
        Write-Host "Found property $name"
        $endpoint = $pricingindex.offers.$name.versionIndexUrl
        Write-Host "Found endpoint $endpoint"
        iwr "$endpoint" -outfile ".\json\$name.json"

        # now, crack that open and find the URL for the latest offer
        $currentOffers = gc ".\json\$name.json" | ConvertFrom-Json
        $c = $currentOffers.CurrentVersion
        $latestversionendpoint = $currentOffers.versions.$c.offerVersionUrl
        Write-Host "Downloading $latestversionendpoint"

        # download the actual offer file
        iwr "$latestversionendpoint" -outfile ".\json\current-$name.json"
    }
}

When I ran this, the drool started coming out of both sides of my mouth and my eyelid began to twitch uncontrollably.

But it wasn’t over yet. This just got me the latest pricing for each service and dumped it onto the disk.

What I had to do now was actually load up and parse the data I’d grabbed.

I think you can see where this is going.

$ec2data = gc .\json\current-AmazonEC2.json | ConvertFrom-Json

This file is about 44MB in size. The eyelid was by now vibrating at about 35kHz, my left leg was twitching uncontrollably and my mouth was spouting uncouth gibberish.

$ec2data.products | measure

Count    : 1
Average  : 
Sum      : 
Maximum  : 
Minimum  : 
Property :


$ec2data.products | Get-Member | Measure

Count    : 9703
Average  : 
Sum      : 
Maximum  : 
Minimum  : 
Property :


Yes, for the want of a couple of simple square brackets and some forethought, the Amazon Pricing ‘API’ has managed to create a PowerShell object with nine thousand seven hundred and three named properties.

In horror I crushed my mobile phone and yelled “IT’S OVER NINE THOUSAAAAAAAAAAND!!!!!”
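In fairness to the numbers: Get-Member also lists the four methods every object carries (the Equals, GetHashCode, GetType and ToString seen earlier), so a count taken through it overstates the properties by four. Here is a sketch of a steadier count, run against a made-up three-product stand-in; the SKUs and the label field are hypothetical.

```powershell
# Tiny stand-in for an offer file; SKUs and the "label" field are hypothetical.
$sample = @'
{
  "products": {
    "SKU0001": { "label": "first" },
    "SKU0002": { "label": "second" },
    "SKU0003": { "label": "third" }
  }
}
'@
$doc = $sample | ConvertFrom-Json

# One object goes down the pipeline, so Measure-Object sees one item.
$asObject = ($doc.products | Measure-Object).Count

# Get-Member counts the NoteProperties plus the inherited methods.
$viaGetMember = ($doc.products | Get-Member | Measure-Object).Count

# PSObject.Properties holds only the properties, i.e. the products themselves.
$products = @($doc.products.PSObject.Properties)

"As object: $asObject; via Get-Member: $viaGetMember; products: $($products.Count)"
```

On PowerShell 6 or later, ConvertFrom-Json also has an -AsHashtable switch, which might be a less painful way to hold a document like this, though I haven't measured it against the 44MB file.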

Now, I’m not sure how this would go in other languages. Perhaps Node.js or Python would have no problem with this. I doubt it, knowing what I do about how stuff works, but maybe. I have to hold out some hope.

But, the point is this: By formatting the ‘API’ in this way and somehow convincing me to access it via PowerShell, AWS have finally driven me off the high-dive board of madness and into the plunge-pool of stark-bollock insanity. I am fundamentally broken and it’s all AWS’s fault.

And I haven’t even started filtering this data to find the specific values I was looking for yet. There, it gets even worse. Because the document is fundamentally non-hierarchical. Product information lives under a property called $document.products, which has the now-recurrent cram-everything-into-a-property problem, making it impossible to filter cleanly. And then once you have that, you need to cross-reference off to another part of the tree, $document.terms (also with the insane everything-is-a-damn-property problem), to find your actual price. That information could have just been nested underneath the product at $document.products[x].terms[y], resulting in cleaner JSON and a document that could actually be queried in sane terms. It’s as if whoever came up with this was trying to make it an absolute nightmare to use.
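To make that cross-referencing concrete, here is a rough sketch of the join. The stand-in document is tiny and made up: the SKUs, the attributes and in particular the nesting under terms (OnDemand, then SKU, then a term code, then priceDimensions and pricePerUnit) are assumptions about the layout, so check them against the real file before trusting any of it.

```powershell
# Made-up stand-in; SKUs, attributes and the nesting under "terms" are assumptions.
$sample = @'
{
  "products": {
    "SKU0001": { "attributes": { "instanceType": "t2.micro" } },
    "SKU0002": { "attributes": { "instanceType": "m4.large" } }
  },
  "terms": {
    "OnDemand": {
      "SKU0001": { "SKU0001.TERM01": { "priceDimensions": {
        "SKU0001.TERM01.RATE01": { "unit": "Hrs", "pricePerUnit": { "USD": "0.0140" } } } } },
      "SKU0002": { "SKU0002.TERM01": { "priceDimensions": {
        "SKU0002.TERM01.RATE01": { "unit": "Hrs", "pricePerUnit": { "USD": "0.1200" } } } } }
    }
  }
}
'@
$document = $sample | ConvertFrom-Json

# Walk the products, then hop across to terms using the SKU as the key.
$rows = foreach ($product in $document.products.PSObject.Properties) {
    $sku = $product.Name
    $termsForSku = $document.terms.OnDemand.$sku
    if ($null -eq $termsForSku) { continue }   # product with no on-demand terms

    foreach ($term in $termsForSku.PSObject.Properties) {
        foreach ($dim in $term.Value.priceDimensions.PSObject.Properties) {
            # Flatten one product/price pair into a row that can be filtered.
            [pscustomobject]@{
                Sku          = $sku
                InstanceType = $product.Value.attributes.instanceType
                Unit         = $dim.Value.unit
                USD          = [decimal]$dim.Value.pricePerUnit.USD
            }
        }
    }
}

$rows | Where-Object InstanceType -eq 't2.micro' | Format-Table -AutoSize
```

Flattening everything into rows first means the filtering can happen with ordinary Where-Object calls, which is roughly what the nested layout would have allowed in the first place.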

If you need me, I’ll be in the padded cell. Just follow the screaming.


2 replies on “The AWS Pricing ‘API’: A kafkaesque nightmare in JSON format”

  1. Victor says:

    Maybe this can help you:
    Now it supports EC2 and RDS pricing, but will provide everyone pricing soon.

  2. Anthony says:

    Haha I just experienced this myself. If it makes you feel any better it’s now up to 10781 products!
