CS and the City Sean Lynch

Parsing JSON in Go

I got to hang out with the team at Sendwithus at their Battlesnake competition. While the actual competitors scrambled to build the mightiest of snake AIs, I took the opportunity to learn a bit more about Golang by attempting to write a snake in Go. Coming from a Python background, I never expected that JSON parsing would be where I spent most of my time. Today, I’ll cover the different approaches you can take when you start parsing JSON in Go.

All the code and various methods here are sampled from my Golang Json Experiment project if you want to cut to the chase.

There are really three general approaches to pick from:

  • Define structs for all the JSON you’ll need to parse. This is the most Go-idiomatic way to parse JSON in Go. It’s also the furthest away from using Python dicts if you’re coming from that world and frustrating if the JSON doesn’t have a clearly defined spec.
    • There are even a few tools that use the explicit definition to pre-generate parsers for better performance.
  • Use empty interfaces to parse the JSON, then convert each field to the type you need. This is promising on the surface but quickly becomes complex.
  • Use one of a few wrapper packages that use an empty interface underneath but make the type conversion more natural.

Parsing JSON using structs

This is the standard path for Go and, cutting to the chase, the option that felt most natural in the end. Though it may seem frustrating if you’re coming from a dynamically typed world, fighting against this will be frustrating in different ways.

My server needed to receive a JSON-encoded request and respond with more JSON. I used the standard library’s net/http server (walkthrough if you need it). There’s a bit of boilerplate required to make it behave properly (throw 400 and 500 errors, set the right content-type), so I’ll include that at the bottom of the post.

Your first step is to define your struct. You have a lot of options at your disposal and Eager wrote a great blog post on how to leverage structs to parse JSON. Here’s what a basic one might look like:

type MoveRequest struct {
    GameID string `json:"game_id"`
    Snakes []struct {
        Coords [][]int `json:"coords"`
        Name   string  `json:"name"`
        Taunt  *string `json:"taunt"`
    } `json:"snakes"`
    Turn int `json:"turn"`
}

A few things to note here:

  • The json:"game_id" strings on the right are called tags and are used by the standard JSON package to map JSON fields into the right field in a struct. They’re not always required (Go will try to match up names ignoring capitalization, for example), but they’re helpful documentation.
  • You can include arrays and nest structs in your definition.
  • One place I got caught: If your JSON can contain nulls then you need to do a bit more work. It turns out that a Go string can’t be nil (only pointers and a few other kinds of types, such as slices, maps, and interfaces, can be), so you need to use *string as the type for a field that might be null, and dereference it after a nil check in your code. You can see an example of this with taunt in the above struct.

Now, creating a structure for an arbitrary JSON API can be a daunting task, but thankfully there are a few projects that will help you do this automatically just by providing a sample of the JSON. I tested the four I found (listed in the repo) and preferred the output from gojson the most as it nailed the formatting and naming, even getting capitalization right. If you’re in a hurry, json-to-go will do the conversion in your browser. Keep in mind, it’s automated, so take a look at the output and make any changes you need.

With your structure, you can now parse your JSON with the built-in encoding/json package.

func MoveHandler(w http.ResponseWriter, r *http.Request) {
    request := MoveRequest{}
    err := json.NewDecoder(r.Body).Decode(&request)
    if err != nil { 
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }

    ...

You may see other examples that use Unmarshal. You really shouldn’t use that unless the data is already in memory (see Datadog’s love letter to io.Reader).

Once that’s done, you can access all the fields in a very natural way:

    fmt.Printf("%+v\n", request) // The + in %+v adds field names when printing structs
    fmt.Printf("Turn: %d\n", request.Turn)
    fmt.Printf("Snake name: %s\n", request.Snakes[0].Name)

Generating a JSON response follows pretty much the same model:

    ...

    // You'll define this struct beforehand
    response := MoveResponse{Move: "down"}

    jsonResponse, err := json.Marshal(response)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }

    w.Header().Set("Content-Type", "application/json")
    w.Write(jsonResponse)
}

Nothing revolutionary here, but you can start to add some nice things. Go lets you hang methods off any struct type. For example, I can add a String() method to MoveRequest (which makes it satisfy fmt.Stringer) that will automatically get used whenever I print it, making logging and debugging easier:

func (m MoveRequest) String() string {
    return fmt.Sprintf("Game [%s] is on turn %d", m.GameID, m.Turn)
}

You can use this feature with the built-in JSON decoder to change its behavior by implementing the json.Unmarshaler interface, that is, an UnmarshalJSON(b []byte) error method, on your type. Let’s change Coords from a two-dimensional array of ints to an array of X,Y Coord values.

type Coord struct {
    X int
    Y int
}

type MoveRequest struct {
    GameID string `json:"game_id"`
    Snakes []struct {
        Coords []Coord `json:"coords"`
        Name   *string `json:"name"`
    } `json:"snakes"`
    Turn int `json:"turn"`
}

func (c *Coord) UnmarshalJSON(b []byte) error {
    var tmp []int
    if err := json.Unmarshal(b, &tmp); err != nil {
        return err
    }

    if len(tmp) != 2 {
        return errors.New("Coord only accepts a length two array")
    }

    c.X, c.Y = tmp[0], tmp[1]
    return nil
}

Then accessing the values is straightforward:

    snake := request.Snakes[0].Coords[0]
    fmt.Printf("Snake position: %d,%d\n", snake.X, snake.Y)

Using empty interfaces instead

But what if you don’t want to do all that struct work ahead of time? I sure didn’t, so I kept looking. I quickly found the empty interface but it took me a lot longer to realize the problems.

Here’s the theory: Every variable in Go must be typed. But a type can be an interface, and an interface can have zero methods. This means that you can catch a value of any type by asking for interface{}, which every possible type satisfies. So instead of doing the hard work of defining a struct, you can just import your JSON into a map of strings -> empty interfaces like so.

var request map[string]interface{}
err := json.NewDecoder(r.Body).Decode(&request)

Well that sure looks easy! By default, Go looks at the input and unmarshals each value into a default type, then you just have to type assert each value to the type you need. Turns out, that’s much easier said than done.

turn := request["turn"] // returns an interface{} typed value

So I need to type assert it using the .(int) syntax. This immediately breaks. It turns out that Go unmarshals JSON numbers into float64 when the target is an empty interface, because JSON (like JavaScript) doesn’t distinguish ints from floats. There’s a long discussion about how the community thinks it’s a bug and the Go creators effectively telling them to Go &^#@ themselves. What this means is you actually need to do this:

turn := int(request["turn"].(float64))

Well that’s ugly. What about this UseNumber thing on json.Decoder? That seems to inspire some confidence. Nope, just as ugly:

turn, err := request["turn"].(json.Number).Int64()

And accessing nested values gets even crazier:

taunt := request["snakes"].([]interface{})[0].(map[string]interface{})["taunt"]

At this point, I quickly wrote this off. But I couldn’t be the only person looking for a more flexible way to parse JSON. Turns out I wasn’t.

Wrapping empty interface maps in syntactic sugar

There are at least three projects out there that wrap importing into an empty interface to make accessing the values easier. None are perfect, but they might be a good choice depending on your use case.

typed

This was my quick favorite because of the terse syntax. Even digging multiple levels is relatively straightforward:

request.Objects("snakes")[0].String("taunt")

Unfortunately, I ran into a few issues:

  • No support for multi-dimensional arrays (no .Arrays() import)
  • No io stream initializer
  • No ability to create JSON, though there is a sister project that offers this ability

go-simplejson

Its syntax is a bit more verbose, requiring you to switch between gets and type assertions:

taunt, _ := request.Get("snakes").GetIndex(0).Get("taunt").String()

On the plus side, it includes the ability to create JSON built in. Though it also struggles with multidimensional arrays, it doesn’t make it impossible.

jason

This syntax is also verbose, and because each step returns a possible error, it’s hard to chain. So the one liners above become:

snakes, _ := request.GetObjectArray("snakes")
snake := snakes[0]
taunt, _ := snake.GetString("taunt")

It also can’t handle multi-dimensional arrays or create JSON, and this one seems to be missing a lot of the sugar.

End of the day

Despite all my attempts to break away from the default, I couldn’t find a solution that felt right. I think typed might be the closest to handling arbitrary JSON, but it still falls short for this case. I’ll keep an eye on it.

For now, the best “feeling” solution for what is now a very familiar API is to stick to structs. I’m still concerned about having to do all that work upfront before testing any API, but maybe that’s just what idiomatic feels like.