Open Sourcing Jingo, a Faster JSON Encoder for Go

Today we’ve open sourced Jingo, a fast JSON encoder library for Go.

https://github.com/bet365/jingo

The interface of the Go standard library’s JSON encoder is really nice – you decorate your structs using tags and then pass them to Marshal, like so…

Example usage, encoding/json


import "encoding/json"

type MyPayload struct {
    Name string `json:"name"`
    Age int     `json:"age"`
    ID int      `json:"-"` // "-"" lets us ignore this field
}

func main(){
    p := MyPayload{
        Name: "Mr Payload",
        Age: 33,
    }
    serialized, err := json.Marshal(&payload) // serialized = {"name":"Mr Payload","age":33}
    // ...
}

This can be all you need in most cases, but in high-performance scenarios this default implementation can become a bottleneck. This has led the Go community to create quite a few alternative JSON implementations, each solving the performance problem with varying degrees of success and implementation cost. Generally speaking, the faster the library, the more laborious it becomes to adapt your structs to support it.

What we wanted, ideally, was the same (or a similar) interface to the standard library, but with the performance benefits that other libraries like gojay have achieved.

Jingo is what we ended up with, and here’s what it looks like in comparison.

Example usage, Jingo


import "github.com/bet365/jingo"

// we tag the structs using the same annotations
type MyPayload struct {
    Name string `json:"name"`
    Age int     `json:"age"`
    ID int      // anything we don't annotate doesn't get emitted. 
}

// Create an encoder, once, letting it know which type of struct we're going to be encoding. 
var enc = jingo.NewStructEncoder(MyPayload{})

func main() {
    p := MyPayload{
        Name: "Mr Payload",
        Age: 33,
    }
    // pull a buffer from the pool and pass it along with the struct to Marshal
    buf := jingo.NewBufferFromPool()
    enc.Marshal(&p, buf) // buf = {"name":"Mr Payload","age":33}
}

The main difference from our previous example is that we create an encoder instance up front, using a blank struct of the type we want to encode – it’s that instance we call Marshal on.

The next difference is that we enforce the use of buffers, via our own lightweight buffer type which has pooling built in.

When we’re done using the contents of the buffer, we call buf.ReturnToPool() so that it can be re-used, which drastically reduces our memory allocations overall.
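To illustrate the idea, here is a minimal sketch of how a pooled buffer type can be built on sync.Pool. This is not Jingo’s actual internals – the names are hypothetical, and Jingo ships its own Buffer type – but it shows why recycling buffers keeps steady-state allocations at zero.

import "sync"

// buffer is a hypothetical growable byte buffer, used here only to
// illustrate pooling – Jingo provides its own Buffer type.
type buffer struct {
    bytes []byte
}

var bufPool = sync.Pool{
    New: func() interface{} { return &buffer{bytes: make([]byte, 0, 512)} },
}

// newBufferFromPool hands out a recycled buffer where one is available,
// so in the steady state no new heap allocations are made.
func newBufferFromPool() *buffer {
    return bufPool.Get().(*buffer)
}

// returnToPool resets the length (keeping the capacity) and makes
// the buffer available for re-use.
func (b *buffer) returnToPool() {
    b.bytes = b.bytes[:0]
    bufPool.Put(b)
}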

How does it work?

When you create an instance of an encoder it recursively generates an instruction set which defines how to iteratively encode your structs.

This is what gives it the ability to provide a clear API but with the same benefits as a build-time optimized encoder.

Almost all of the type assertion and reflection work is done during this compile stage; the instruction-set execution (the Marshal call) then makes ample use of the unsafe package to make reading and writing very fast.

As part of the instruction set compilation it also generates static metadata – field names, brackets, braces etc. These are then chunked into instructions on demand.

Using the data from the example above, and to simplify the call stack somewhat, what we end up with is a set of lightweight instructions that look something like this:

write_buf           `{"name":"`
write_buf_from_ptr  `Mr Payload`
write_buf           `","age":`
write_buf_from_ptr  `33`
write_buf           `}`

As we already touched upon, all of the type information was inferred as part of generating the instruction set, so each instruction knows the exact struct offset and size to read.
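To make that concrete, here is a heavily simplified sketch of the compile-then-execute idea. This is not Jingo’s actual implementation – it ignores string escaping, nesting and most types – but it shows the shape of the technique: reflect over the struct once to capture tags and field offsets, then execute closures that read memory directly via unsafe.

import (
    "reflect"
    "strconv"
    "unsafe"
)

// instruction appends part of the output to buf, given a pointer to the struct.
type instruction func(p unsafe.Pointer, buf *[]byte)

// compile reflects over the struct type once, emitting one instruction per
// chunk of static metadata or tagged field. The marshal-time code below
// never touches the reflect package again.
func compile(t reflect.Type) []instruction {
    ins := []instruction{func(_ unsafe.Pointer, buf *[]byte) { *buf = append(*buf, '{') }}
    first := true
    for i := 0; i < t.NumField(); i++ {
        f := t.Field(i)
        tag := f.Tag.Get("json")
        if tag == "" || tag == "-" {
            continue // untagged fields don't get emitted
        }
        prefix := `"` + tag + `":` // static metadata, built once
        if !first {
            prefix = "," + prefix
        }
        first = false
        offset := f.Offset // captured now, used at marshal time
        switch f.Type.Kind() {
        case reflect.String:
            ins = append(ins, func(p unsafe.Pointer, buf *[]byte) {
                s := *(*string)(unsafe.Add(p, offset)) // read straight from the struct's memory
                *buf = append(*buf, prefix...)
                *buf = append(append(append(*buf, '"'), s...), '"')
            })
        case reflect.Int:
            ins = append(ins, func(p unsafe.Pointer, buf *[]byte) {
                n := *(*int)(unsafe.Add(p, offset))
                *buf = strconv.AppendInt(append(*buf, prefix...), int64(n), 10)
            })
        }
    }
    return append(ins, func(_ unsafe.Pointer, buf *[]byte) { *buf = append(*buf, '}') })
}

Marshalling then reduces to a tight loop over the compiled instructions:

p := MyPayload{Name: "Mr Payload", Age: 33}
var buf []byte
for _, fn := range compile(reflect.TypeOf(MyPayload{})) {
    fn(unsafe.Pointer(&p), &buf)
}
// string(buf) == `{"name":"Mr Payload","age":33}`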

The encoders available to you as of today are jingo.StructEncoder and jingo.SliceEncoder. These cover the vast majority of our own use cases, but an encoder for maps is coming soon.
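SliceEncoder follows the same pattern – create it once from a blank slice of the element type, then marshal into a pooled buffer. A short sketch (the encodeAll helper is hypothetical, and it assumes the buffer exposes its contents via its Bytes field):

// Create a slice encoder once, from a blank slice of the element type.
var sliceEnc = jingo.NewSliceEncoder([]MyPayload{})

func encodeAll(people []MyPayload) []byte {
    buf := jingo.NewBufferFromPool()
    defer buf.ReturnToPool()
    sliceEnc.Marshal(&people, buf)
    return append([]byte(nil), buf.Bytes...) // copy out before the buffer is recycled
}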

How well does it perform?

Very well, as it happens. Take a look at the figures below; they were generated using the perf data from gojay (which is itself fast), with the SmallPayload and LargePayload structs respectively.

Numbers were generated on a CentOS 7 machine with a quad-core Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz.
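Figures of this shape come from standard Go benchmarks with allocation reporting turned on. As a hypothetical example using the MyPayload struct from earlier (the published numbers below use gojay’s payload fixtures instead):

import (
    "testing"

    "github.com/bet365/jingo"
)

var benchEnc = jingo.NewStructEncoder(MyPayload{})

func BenchmarkJingoMyPayload(b *testing.B) {
    p := MyPayload{Name: "Mr Payload", Age: 33}
    b.ReportAllocs() // emits the B/op and allocs/op columns
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        buf := jingo.NewBufferFromPool()
        benchEnc.Marshal(&p, buf)
        buf.ReturnToPool() // keeps steady-state allocations at zero
    }
}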

SmallPayload:

Lib            ns/op   B/op  allocs/op  speedup vs encoding/json
jingo            208      0          0  4.8x
encoding/json   1008    160          1  1x
gojay            605    512          1  1.6x
json-iterator    825    168          2  1.2x

LargePayload:

Lib            ns/op   B/op  allocs/op  speedup vs encoding/json
jingo           9748      0          0  3x
encoding/json  29854   4866          1  1x
gojay          16884  18308          5  1.7x
json-iterator  21033   4873          2  1.4x

These results can be even more pronounced depending on the shape of the struct – the results below are based on a struct containing a lot of string data:

Lib            ns/op   B/op  allocs/op  speedup vs encoding/json
jingo            212      0          0  11.5x
encoding/json   2443    720          4  1x
gojay           1147    512          1  2.1x
json-iterator   2606    744          5  0.9x

Jingo is actually quicker than anything else we can find at the moment, so it seemed like a good candidate for us to open source and share with the wider community.

Community Contributions

Speaking of the community, contributions to Jingo are welcome. Please see the GitHub repo for a set of contribution guidelines.

Future

We’re interested to see how far we can take the set of ideas that form the basis of the encoder. So whilst a decoder is on the cards, there may be other applications and encoding formats that could benefit from this approach.

In the meantime, we’re eager to hear from you about your experience of using Jingo should you choose to do so!