Go Service Development, Our Key Principles

To get the best results from a large development team it is important to define a set of principles people can work to whilst writing software. The principles should be based around learnings and what has been proven to work for the specific problems an organisation faces. This is bet365’s version of those principles for golang, but a lot of the content is around sensibilities for any programming language.

In order to describe the shape of what we want our software to look like, it’s best to focus first on the fundamental goals we’d like to achieve.

Performance
Maintainability

These points should serve as a litmus test for our code: the software we write must satisfy each.

As we progress through the following sections, it is important to remember that any direction given is to satisfy at least one of the above goals and has been shaped by our experiences (and mistakes!) delivering successful codebases.

The document attempts to instil a thought process rather than serve as a canonical list of things to do and not do. The examples given are real-world examples but are not exhaustive.

Maintainability

Software is read far more than it is written.

Stating maintainability as a key principle is a nod to the fact that the software you write spends the majority of its life being looked after (the maintenance phase) rather than being written (the development phase).

Our software must be easy to change, release, test and delete.

To break down what maintainability means to us, we can use the following characteristics

Simplicity
Discoverability
Consistency

Simplicity

One of the hardest things in software development is distilling something into its simplest form.

Keep simplicity and readability in mind when structuring your application. Do the simplest thing.

Reading code is understanding code. Any technique you use which imposes an investigation on the reader, even scrolling to the top of a file, increases the time to understand.

Don’t add unnecessary levels of abstraction, hierarchy or indirection.

Put it all on the table. Don’t hide things away. Examples for this include (but are not limited to)

Using ‘middleware’ patterns like wrapping HTTP handlers. This hides part of the handler’s functionality from the reader. Put the required functionality into the handler as a function call so that it is visible.
Single-use constants at the top of the file or other packages provide no readability benefit. It’s ok to place string/int literals inline – use a comment if intent needs to be conveyed.
Passing a single struct in place of parameters to a function purely because there are a lot of parameters. Using a struct removes an explicit data-contract between two parts of the codebase and makes the input requirements of a function difficult to ascertain. For example, when initialising structs using named fields, any fields not declared are assigned the zero value for their type, meaning that adding a new field does not produce a compile-time error at call sites which fail to pass it. Likewise, structs hide when fields become unused by the function.
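The struct-versus-parameters point can be sketched as follows. The names here (searchParams, searchWithStruct, searchExplicit) are hypothetical and exist only to illustrate the difference in what the compiler checks.

```go
package main

import "fmt"

// searchParams is a hypothetical "options" struct used only for illustration.
type searchParams struct {
	CountryID int
	Limit     int
	Offset    int
}

// searchWithStruct hides its real input requirements: a caller can omit any
// field and the code still compiles, silently passing the zero value.
func searchWithStruct(p searchParams) string {
	return fmt.Sprintf("country=%d limit=%d offset=%d", p.CountryID, p.Limit, p.Offset)
}

// searchExplicit states its contract in the signature: adding or removing a
// parameter breaks every call site at compile time until it is updated.
func searchExplicit(countryID, limit, offset int) string {
	return fmt.Sprintf("country=%d limit=%d offset=%d", countryID, limit, offset)
}

func main() {
	// Limit was forgotten here; the compiler does not notice.
	fmt.Println(searchWithStruct(searchParams{CountryID: 44, Offset: 10}))

	// Every input has to be supplied, so nothing is omitted by accident.
	fmt.Println(searchExplicit(44, 25, 10))
}
```

The first call compiles and runs with Limit set to 0, which is exactly the kind of silent gap the explicit signature prevents.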

Get to the point quickly. Use top-level entry-points and handlers to start composing what happens as soon as possible. Practices which needlessly offload composition further down the call stack do not aid readability or understanding.

Go is our language of choice because its design is fundamentally based around simplicity and necessity.

Composition

To be more specific about composing at the top, try to imagine your code as a pipeline. Use the top level to pass any relevant return values along the chain until it completes. The goal is to convey the significant steps required to achieve the intended functionality of the code in one place in a way that can be quickly understood by the reader.

As a practical example, this chain may be

Parse & validate input parameters
Speak to session management
Speak to the database or backend
Perform business logic or modelling using the collected data
Serialise & write a response

Composing these steps at the top level aids readability and understanding of distinct parts of our code. By performing this composition at the point of entry, we’ve conveyed the purpose of a handler to the reader in the shortest amount of time possible, in the fewest leaps. The individual tasks themselves, such as speaking to the database or backend, are logical leaf points into other functions or packages.
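A minimal sketch of such a pipeline, assuming a hypothetical balance endpoint. The helper functions (parseUserID, checkSession, loadBalance, formatBalance) are stand-ins for the leaf calls a real service would make into other packages, and serve exists only so the handler can be exercised in-process.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// The helpers below are hypothetical stand-ins for the leaf calls a real
// service would make into its session, db and modelling packages.

func parseUserID(r *http.Request) (string, error) {
	id := r.URL.Query().Get("user")
	if id == "" {
		return "", errors.New("missing user parameter")
	}
	return id, nil
}

func checkSession(userID string) error { return nil }

func loadBalance(userID string) (int, error) { return 1250, nil }

func formatBalance(pence int) string {
	return fmt.Sprintf("%d.%02d", pence/100, pence%100)
}

// balance composes every significant step at the top level, so the reader
// sees the whole pipeline without leaving the handler.
func balance(w http.ResponseWriter, r *http.Request) {
	userID, err := parseUserID(r) // 1. parse & validate input
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	if err := checkSession(userID); err != nil { // 2. session management
		http.Error(w, "unauthorised", http.StatusUnauthorized)
		return
	}
	pence, err := loadBalance(userID) // 3. database or backend
	if err != nil {
		http.Error(w, "unavailable", http.StatusBadGateway)
		return
	}
	body := formatBalance(pence) // 4. business logic
	fmt.Fprint(w, body)          // 5. serialise & write the response
}

// serve runs the handler in-process and returns the status code and body.
func serve(target string) (int, string) {
	rec := httptest.NewRecorder()
	balance(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(serve("/balance?user=42"))
	fmt.Println(serve("/balance"))
}
```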

Try to keep functions as logical entities instead of using techniques which involve splitting code into different micro-functions purely for ‘readability’ purposes based on number of lines or similar. Moving code into another function does not guarantee more readable code and can have the opposite of the desired effect. There’s nothing wrong with a long function if it encapsulates the intent of a task.

Try to avoid APIs which rely on boolean parameters being passed where the values are known at compile time, e.g. remove(false). Instead, compose a separate function to allow the intended code to happen unconditionally at runtime.
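As an illustration, assuming a hypothetical in-memory store whose remove method takes a "soft delete" flag, the flag can be replaced by two functions that each say what they do:

```go
package main

import "fmt"

// store is a hypothetical in-memory collection used only for illustration.
type store struct {
	items   map[string]string
	deleted map[string]bool
}

// remove takes a flag whose value every caller knows at compile time, so the
// call site remove("a", false) tells the reader nothing about what happens.
func (s *store) remove(key string, soft bool) {
	if soft {
		s.deleted[key] = true
		return
	}
	delete(s.items, key)
}

// hide and purge name the behaviour and run it unconditionally.
func (s *store) hide(key string)  { s.deleted[key] = true }
func (s *store) purge(key string) { delete(s.items, key) }

func main() {
	s := &store{
		items:   map[string]string{"a": "1", "b": "2"},
		deleted: map[string]bool{},
	}
	s.remove("a", false) // which branch is this?
	s.purge("b")         // intent is visible at the call site
	fmt.Println(len(s.items))
}
```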

At a top level, to accommodate readability of the pipeline/composition, try to avoid excessive error handling by following the advice of this article – https://bet365techblog.com/better-error-handling-in-go. This details the defer error pattern and advice on avoiding needless ‘wrapping’ of errors. TL;DR – focus on putting error information as close to the error site as possible and not repeating yourself (wrapping) all the way up the stack to generate a poor man’s stack trace. This can remove a lot of easily-accumulated error handling noise from the code.
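The sketch below shows one way a deferred check can handle errors once at the top level. It is an assumption about the general shape of such a pattern, not a copy of the linked article, which remains the authority on the details. The functions lookup, render and greet are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is created at the error site and already says what went wrong,
// so callers have no need to wrap it on the way up.
var errNotFound = errors.New("db: customer not found")

// lookup and render are hypothetical leaf calls.
func lookup(id string) (string, error) {
	if id != "42" {
		return "", errNotFound
	}
	return "Ada", nil
}

func render(name string) (string, error) { return "hello " + name, nil }

// greet handles any failure exactly once, in a deferred function, so the
// pipeline itself stays free of logging and wrapping noise.
func greet(id string) (out string) {
	var err error
	defer func() {
		if err != nil {
			out = "error: " + err.Error() // single place to log or respond
		}
	}()

	name, err := lookup(id)
	if err != nil {
		return
	}
	out, err = render(name)
	return
}

func main() {
	fmt.Println(greet("42"))
	fmt.Println(greet("7"))
}
```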

Discoverability

Writing code which aids maintainability by being easy and predictable to navigate has become even more important given the recent and constant increase in number of services and volume of staff working across them.

Naming

Naming is hard, but worth the effort.

Try to find the right balance of descriptive and concise. Avoid names which are overly long or indulgent by using context to your advantage. Consider how things will read in practice: does it stutter? Are parts of the name implied via a package or struct you are a member of?

Single-letter variable names are fine when operating within proximity of the declaration, and can bring focus to the details of the operation being performed. The further away you are from the declaration, the more descriptive you’ll need to be.

When something does warrant more context, avoid being general or vague where you can be specific. Conversely, don’t call something a potato-based baked snack when you mean crisps…

Stick to common naming themes and patterns used throughout our services to make transitioning between codebases less of a cognitive burden. For example, you’ll find a db package in any service which makes database calls, packages named after the services they are responsible for calling, a data package to declare any structs which breach the service boundary and HTTP handler functions which are named directly after the specific URI they’re handling.

Structure & Encapsulation

Keep your application’s natural entry-points, such as func main, HTTP handlers et cetera, in a servicename.go file within the root directory of your project. Don’t immediately link your handlers off into other packages to do the work. You can branch off to other packages from your top-level as previously stated, but knowing where to start looking for a problem is half way to fixing it. The more predictable and consistent we make an application’s structure, the more we reduce the burden on fresh eyes understanding it.

When you create new packages, the default stance for each package should be a single file named after that package. You should expose the absolute bare minimum number of artefacts on that package’s public interface.

Go doesn’t provide many ways of achieving encapsulation of your code. In fact, there aren’t many examples in Go where you’re given a lot of options for achieving any one goal; it’s arguably one of its strongest selling points as it leads to consistency.

Packages provide encapsulation.
Files do not provide encapsulation on their own. All package-level items are accessible across all files within a package.
Nested packages do not ‘belong’ to their ‘parent’ package – anything they export is accessible to the whole application.
A struct with methods can provide a level of encapsulation.

With those four points in mind – and aiming to achieve our goals of code being maintainable, simple and discoverable – scattering code across arbitrarily named files inside a package doesn’t tick all the boxes.

Put your package level functions in your package file. If you think branching out into other files is needed, you can do so on a per-struct basis, where the file is named after the struct in lowercase, e.g. buffer.go would house type Buffer struct {}. The file should contain the struct, its methods and anything needed to create it, like a NewBuffer function. Method receivers for a struct should not live in a different file to where the struct is defined. Anything outside of this should be considered for a new package if separation is required.
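A sketch of what such a buffer.go might contain. The fields and methods are hypothetical; in a real service this file would sit inside its own package rather than package main, which is used here only so the example runs.

```go
// buffer.go: a hypothetical per-struct file, shown as package main so it runs.
package main

import "fmt"

// Buffer is the only type this file declares.
type Buffer struct {
	data []byte
}

// NewBuffer lives alongside the struct it creates.
func NewBuffer(size int) *Buffer {
	return &Buffer{data: make([]byte, 0, size)}
}

// WriteString and String are methods on Buffer, so they stay in buffer.go.
func (b *Buffer) WriteString(s string) { b.data = append(b.data, s...) }

func (b *Buffer) String() string { return string(b.data) }

func main() {
	b := NewBuffer(16)
	b.WriteString("ab")
	b.WriteString("cd")
	fmt.Println(b.String())
}
```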

To re-cap

Entry points in the top-level servicename.go file.
Branch into logically grouped and named packages where sensible.
Start a new package with a .go file named after the package.
Branch into additional files within a package for self-contained struct+methods instances where sensible.

Consistency

First of all … use go fmt!

Though a lot of what we’ve covered up to this point has implied the benefits of consistency, it is worth reiterating the point: consistency reduces the time it takes fresh eyes to understand new services. It makes processes and patterns easier to employ and become commonplace. It reduces friction in design and problem solving when switching between our many service codebases by becoming a common language.

By employing consistency in our approach it becomes easier to predict how our software will perform – which means we will spend less time optimising and refactoring our software as an afterthought to get the desired throughput.

Performance

Performance is a key part of our product offering.

We want our users to have the best experience whilst using any part of our website; whether they’re browsing, placing a bet, cashing out or changing their contact details.

This being said, it is not the only reason we want our code to run efficiently. Over the past few years we’ve reduced portions of our production hardware’s footprint to a fraction of its previous size, whilst also virtualising it.

Aside from the obvious cost benefit, running fewer servers has opened doors to in-memory caching techniques that would have previously not been effective over hundreds of servers. This has become a simple and effective staple of how we write our modern services.

What can we do to make go fast?

The Garbage Collector

A good chunk of making go fast is playing nice with the garbage collector.

What this means in practice is we should try to avoid “churn” on the garbage collector (GC), or in more literal terms, avoid allocating lots of objects on the heap that we know we will not need shortly after using them. The Go GC is efficient, even when compared to other languages’ implementations, but it is not free.

There are multiple costs to allocating. First we have the up-front cost we pay synchronously, which shows up as a call to mallocgc if you profile your code. The benchmark below is against the same code, but the second run allocates a copy of the data:

BenchmarkGet       7.88 ns/op    0 B/op    0 allocs/op
BenchmarkGetCopy   30.1 ns/op   16 B/op    1 allocs/op

Granted, for this one occurrence we’re talking about a difference of roughly 22 nanoseconds, and 22 nanoseconds isn’t a lot of time at all by itself, but this is one operation. This can and does happen hundreds, thousands or even hundreds of thousands of times during the execution of your handler and a nearly 4x increase can compound into a big problem.
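A comparison of this kind can be reproduced with a sketch like the one below. The get and getCopy functions are hypothetical stand-ins, not the code measured above, and the numbers will differ by machine. testing.Benchmark is used so the example runs as an ordinary program; in a service you would normally write Benchmark functions in a _test.go file instead.

```go
package main

import (
	"fmt"
	"testing"
)

// store holds a single 16-byte value for the lookups below.
var store = map[string][]byte{"k": []byte("sixteen bytes ok")}

// sink keeps results reachable so the compiler cannot discard the work.
var sink []byte

// get returns the stored slice directly; nothing is allocated.
func get(key string) []byte { return store[key] }

// getCopy returns a private copy, which costs one heap allocation per call.
func getCopy(key string) []byte {
	src := store[key]
	dst := make([]byte, len(src))
	copy(dst, src)
	return dst
}

func main() {
	for _, bm := range []struct {
		name string
		fn   func(string) []byte
	}{{"get", get}, {"getCopy", getCopy}} {
		res := testing.Benchmark(func(b *testing.B) {
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				sink = bm.fn("k")
			}
		})
		fmt.Println(bm.name, res.String(), res.MemString())
	}
}
```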

The next price we pay is the work the GC needs to do to figure out this chunk of memory is not needed by us anymore. The majority of this work happens concurrently but there is a small ‘stop the world’ event at the end of the process, at which point our code cannot perform its purpose and must wait. Even though the GC pass happens concurrently, it is not free and is using hardware resources which could’ve otherwise been in use by our code.

Enough of the guilt trip…

What can we do to be better GC citizens? A few things.

Firstly we can use the packages provided in the golang/pkg group. These packages perform pretty boring tasks, but have been designed to reduce or remove memory allocations when compared to their standard library counterparts. They’re focused on operations our services need to do all the time; by using them you’re saving yourself the job of trying to reclaim these same allocations they can save for you. These include the likes of jingo et cetera.

Next, we can assess what options we have available to us in order to avoid or reduce other allocations

Pre-allocate

One large alloc is better than 10 small ones. If the size you need is known at compile time, consider ‘capping’ your make calls or using a statically sized array to allocate on the stack. Even if the exact size is not known, an educated guess will likely reduce allocations.
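A minimal sketch of 'capping' a make call, using a hypothetical squares function. The pre-sized version allocates its backing array once; the other version leaves growth to append, which may reallocate several times as the slice fills.

```go
package main

import "fmt"

// squares pre-sizes its result, so append never has to grow the slice.
func squares(n int) []int {
	out := make([]int, 0, n) // one allocation, capacity known up front
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

// squaresGrow starts from nil and lets append reallocate as it grows.
func squaresGrow(n int) []int {
	var out []int
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

func main() {
	a, b := squares(10), squaresGrow(10)
	fmt.Println(len(a), cap(a))
	fmt.Println(len(b), cap(b)) // capacity here depends on append's growth policy
}
```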

A lot of the standard library provides alternative call paths which allow you to nominate pre-allocated memory (Buffers/io.Writers) to use instead of returning something that would otherwise be heap allocated. See strconv.FormatInt vs strconv.AppendInt for an example.
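The difference between the two strconv calls can be sketched like this. The keyFormat and keyAppend helpers and the "user:" prefix are hypothetical; strconv.FormatInt and strconv.AppendInt are the standard library functions named above.

```go
package main

import (
	"fmt"
	"strconv"
)

// keyFormat builds "user:<id>" via FormatInt, which returns a new string.
func keyFormat(id int64) string {
	return "user:" + strconv.FormatInt(id, 10)
}

// keyAppend writes the same key into memory the caller already owns.
func keyAppend(dst []byte, id int64) []byte {
	dst = append(dst, "user:"...)
	return strconv.AppendInt(dst, id, 10)
}

func main() {
	var scratch [32]byte // fixed-size backing array supplied by the caller
	key := keyAppend(scratch[:0], 1234)
	fmt.Println(keyFormat(1234), string(key))
}
```

Because keyAppend accepts the destination, the caller decides where the bytes live: a stack array, a pooled buffer, or a slice that is reset between uses.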

Use Pooled Buffers

Look out for the pooling capability built into a few of our packages.

Pooling allows us to spread the cost of allocating across many uses. The price is paid up-front, but then by resetting we can make use of the allocated memory again without paying the cost, be it in the same request or in a new one.

When we’re done with the buffer we pulled from a pool we place it back by calling .ReturnToPool() or similar. Note, it is not valid to use the buffer again after this point. It is best to allow a buffer to be passed in to allow the caller to manage when it should be placed back on the pool, rather than returning a pooled buffer and relying on another part of the system to place it back for you.
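Our internal pooled packages are not shown here, so the sketch below swaps in the standard library's sync.Pool and bytes.Buffer to illustrate the same idea; pool.Put plays the role of .ReturnToPool(). The writeGreeting and greet functions are hypothetical. Note that the function doing the work receives the buffer, while the caller owns taking it from and returning it to the pool.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// pool hands out reusable buffers; New runs only when the pool is empty.
var pool = sync.Pool{New: func() interface{} { return new(bytes.Buffer) }}

// writeGreeting fills a buffer supplied by the caller and never touches
// the pool itself.
func writeGreeting(buf *bytes.Buffer, name string) {
	buf.WriteString("hello ")
	buf.WriteString(name)
}

// greet owns the buffer's whole lifetime: take, use, copy out, put back.
func greet(name string) string {
	buf := pool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold a previous request's data
	writeGreeting(buf, name)
	out := buf.String() // copy the result before the buffer goes back
	pool.Put(buf)       // buf must not be used after this line
	return out
}

func main() {
	fmt.Println(greet("a"))
	fmt.Println(greet("b"))
}
```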

Resetting

Resetting a slice, like this…

mySlice = mySlice[:0]

…gives us the ability to append new values to the same underlying memory as if it were empty again, without paying the cost of allocation.

It even works with statically sized arrays

var buf [64]byte      // fixed size; can live on the stack if it does not escape
b := addSome(buf[:0]) // reslice the array; keep the slice addSome returns

// use b ...

b = addSomeMore(buf[:0]) // overwrite buf, use again
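A runnable version of the same idea, using a hypothetical appendDigits helper. The final comparison confirms that both fills used the same underlying memory.

```go
package main

import "fmt"

// appendDigits appends the digits 0..n-1 as ASCII and returns the new slice.
// Returning the slice matters: the caller's copy still has its old length.
func appendDigits(dst []byte, n int) []byte {
	for i := 0; i < n; i++ {
		dst = append(dst, byte('0'+i))
	}
	return dst
}

func main() {
	var buf [64]byte // fixed-size backing array

	a := appendDigits(buf[:0], 3)
	fmt.Println(string(a), len(a), cap(a))

	b := appendDigits(a[:0], 5) // reset: same memory, length zero
	fmt.Println(string(b), len(b), cap(b))
	fmt.Println(&a[0] == &b[0]) // both slices share buf's storage
}
```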

Avoid Work

One way of both reducing allocations and speeding up your code in general is to avoid doing work. Is the code necessary? Is it something which could be done once and re-used? If we step back and change something earlier in the execution, would this problem go away?

One example of this is converting between types multiple times.

A great deal of the input our services take consists of utf8 serialised values. Granted it does not take a lot of time to convert that country ID you took off a query string into an integer, and it can be stack allocated, but a large portion of the time we’re going to pass that value to a function which then serialises it back into utf8 in order to send it downstream to a database or another service. At this point we could’ve saved ourselves the effort and code-noise of two conversions by using the utf8 formatted bytes we received in the beginning.

In addition to this, using integer conversion as a validation tool can allow things which could be invalid, for example

    fmt.Println(strconv.Atoi("1"))    // 1 <nil>
    fmt.Println(strconv.Atoi("01"))   // 1 <nil>
    fmt.Println(strconv.Atoi("001"))  // 1 <nil>
    fmt.Println(strconv.Atoi("0001")) // 1 <nil>
    fmt.Println(strconv.Atoi("+1"))   // 1 <nil>
    fmt.Println(strconv.Atoi("+01"))  // 1 <nil>
    fmt.Println(strconv.Atoi("+001")) // 1 <nil>

It’s easy to see how this might be used for a cache-flood attack or similar.
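One alternative is to validate the bytes as received and pass them on unchanged. The sketch below assumes a rule of "digits only, no sign, no leading zeros"; that rule and the isCanonicalID name are assumptions for illustration, and a real service would apply whatever its own ID format requires.

```go
package main

import "fmt"

// isCanonicalID reports whether b is a plain decimal number: digits only,
// no sign, no leading zeros. The rule itself is an assumption for this sketch.
func isCanonicalID(b []byte) bool {
	if len(b) == 0 || (len(b) > 1 && b[0] == '0') {
		return false
	}
	for _, c := range b {
		if c < '0' || c > '9' {
			return false
		}
	}
	return true
}

func main() {
	for _, in := range []string{"1", "01", "+1", "10", ""} {
		fmt.Printf("%q %v\n", in, isCanonicalID([]byte(in)))
	}
}
```

With a check like this, each ID has exactly one accepted spelling, so the variants listed above cannot be used to create distinct cache keys for the same value.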

Data Formats

When passing data around services, be conscious of the time you spend serialising and deserialising data. It’s not the most obvious performance tax you think you’ll be paying, but in a lot of the services we write it can find its way near the top of performance profiles purely due to volume.

Avoid using JSON where a simpler data structure could be used instead. This is purely because of the amount of time it takes to parse JSON docs compared to simpler key/value formats.

If you can avoid passing structured data around, you’ll find you’re able to be much more efficient when it comes to encoding and decoding. Something simple like url variables or an nlv document is much quicker to parse and will likely be smaller to transfer over the wire.
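As a small illustration, the same two fields can be carried as a JSON document or as url variables. The payloads and the fromJSON and fromQuery helpers are hypothetical. The sketch only shows that both decode to the same data and compares the payload sizes; it does not measure parse time, so benchmark both against your own payloads before drawing conclusions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// fromJSON decodes a small JSON object into a map.
func fromJSON(doc string) (map[string]string, error) {
	out := map[string]string{}
	err := json.Unmarshal([]byte(doc), &out)
	return out, err
}

// fromQuery decodes the same data from URL-encoded key/value pairs.
func fromQuery(doc string) (map[string]string, error) {
	vals, err := url.ParseQuery(doc)
	if err != nil {
		return nil, err
	}
	out := make(map[string]string, len(vals))
	for k := range vals {
		out[k] = vals.Get(k)
	}
	return out, nil
}

func main() {
	jsonDoc := `{"user":"42","country":"GB"}`
	queryDoc := "user=42&country=GB"

	j, _ := fromJSON(jsonDoc)
	q, _ := fromQuery(queryDoc)
	fmt.Println(j["user"] == q["user"], j["country"] == q["country"])
	fmt.Println(len(jsonDoc), len(queryDoc)) // bytes on the wire for each form
}
```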

Network Calls / Amount of Data

Another item you’re likely to see on your profiles is Syscall read/write, which will be prevalent when making any network calls.

Reducing both the number of calls and bytes transferred will always yield a benefit. This should be a concern when designing the interface between services.

Do I need all columns returned by this stored proc?
Do I need the full session or do I need the user id?
Is the call a good candidate for an in-memory cache?
Am I making n calls to a service where I could be making one?
Is the calling profile considerate to varying degrees of latency between data centres?

Taking the time to trim down our interaction with the network layer as a matter of practice will improve how our software behaves long term.
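For the in-memory cache question above, a minimal sketch might look like this. The cache type, its fetch callback and the one-minute defaultTTL are all hypothetical; a production cache would also need to consider eviction, memory bounds and not holding the lock during the fetch.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultTTL is an arbitrary lifetime chosen for this sketch.
const defaultTTL = time.Minute

type entry struct {
	value   string
	expires time.Time
}

// cache is a minimal TTL cache in front of a slow lookup.
type cache struct {
	mu      sync.Mutex
	ttl     time.Duration
	fetch   func(key string) string // hypothetical network call
	entries map[string]entry
}

func newCache(ttl time.Duration, fetch func(string) string) *cache {
	return &cache{ttl: ttl, fetch: fetch, entries: map[string]entry{}}
}

// get returns a fresh cached value or calls fetch once and stores the result.
func (c *cache) get(key string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[key]; ok && time.Now().Before(e.expires) {
		return e.value
	}
	v := c.fetch(key) // holding the lock keeps the sketch short but serialises fetches
	c.entries[key] = entry{value: v, expires: time.Now().Add(c.ttl)}
	return v
}

func main() {
	calls := 0
	c := newCache(defaultTTL, func(k string) string { calls++; return "value-for-" + k })
	for i := 0; i < 1000; i++ {
		c.get("odds")
	}
	fmt.Println(calls) // number of times the network call was actually made
}
```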

Knowing which one of these techniques will be right for a specific use case requires visibility.

Benchmarking

go test -bench . -benchmem

The tools you get out of the box with go are excellent.

Benchmarks are useful for giving focussed visibility to a hot-path. They provide data on the average amount of time it takes to execute a particular routine and how many allocations that routine made on each pass. Once you’re set up with this information, you can dig deeper into the code by using pprof to visualise both memory and cpu profile output.

pprof

Capture cpu and memory profiles.

go test -bench . -benchmem -cpuprofile cpu.p -memprofile mem.p

View a CPU profile with pprof

go tool pprof -http :9393 cpu.p

View a memory profile

go tool pprof -http :9393 mem.p

From here you can view call graphs, flame charts or even see your source code annotated with the amount of time each line took on a run, or how many bytes it allocated.

With this information you can make better decisions on how to change your software.

Wrkbench

Wrkbench is an internal tool wrapper which combines the wrk HTTP benchmarking tool with the Go pprof tool.

Each request-based service we put into production should have a wrkbench test associated with it. This is how we get confidence the service or endpoint can meet the heavy traffic demands of our website and be a good citizen when co-hosting with other services on the same kit.

Wrkbench operates by assuming all other pieces of software your service is responsible for talking to, downstream services, databases, session management et cetera, are the fastest software ever written. It does this to shine a light on your code to highlight hot-paths and areas to focus improvements on.

Don’t use wrkbench solely to get the number it spits out, as it means little on its own. Look at the data it produces, and get used to seeing the patterns of the flame graph across the services you write and maintain. Soon you’ll be able to spot anomalies and inefficiencies by recognising their shape.

QA Guidelines

To encapsulate the principles of this document in a QA (or Code Review) process, the checks in the last section of this article have been put together.

The checks are written in a way which allows them to be applied to different problem domains and promote a thought process rather than a binary checklist. This means the checks are not exhaustive for all scenarios and should form the basis of a larger process which is inclusive of more domain-specific introspection.

If things can be spotted using static analysis, linting rules or other automated checks then they should be.

QA is the last opportunity we have to make something better. It’s rare you’ll be given code to look at which cannot be improved, however large or small the change, so take the opportunity to comb something out whilst you have it. In the same respect, encourage others to pick bones in your own code to continually raise the bar in the software we deliver.

That said, don’t raise a spelling correction for a change which has introduced a performance or quality regression. Make sure you’re focussed on the guts of the change.

Maintainability

Easy to change, release, test and delete.

Could the software be made simpler?

Is the code easy and quick to comprehend?

Does it impose investigations, no matter how small, in order to read it which could be avoided?
Does it add levels of abstraction/hierarchy/indirection which could be avoided?
Does it attempt to hide implementation details away for what appear to be aesthetic purposes?
Are top-level entry points used to sufficiently convey the purpose of the code?
Is naming employed in a concise but descriptive fashion and sympathetic to the above guidance?

Is error handling sufficient and not repetitive or over-bearing?

Are the files structured and named in a way that is consistent with the above?

Is there anything about how the code is written which feels inconsistent to how the same problem has been solved in other services? Given our focus on consistency, can it be justified?

Performance

Have wrkbench tests been provided?

Do the numbers look reasonable in context with the functionality being tested?

Have benchmarks been provided for hot paths?

Do the times and allocations feel reasonable in context with the hot-path being tested?

Have our internal libraries been utilised where possible?

Is the network being used efficiently?

Could caching be employed?
Are connection pools being used correctly?
Can all calls be justified?
Is structured data being used where a flat structure would suffice?

Closing

The practice of writing software with these principles in mind is about finding the best blend of both performance and maintainability, where each has parity in importance.

You’ll find iterating on “Functional, Fast, Simple” will help you get the right balance here.

Start purely by focusing on solving the problem from a functional sense – code that works. Next consider how the code will perform and how to make it fast. Now, armed with the knowledge of both how to make it functional and how to make it fast, express that code in the simplest way possible and optimise for the reader.

The three-step process above is not a three-step process of writing a service, or even a handler. Instead it is a constant loop whilst writing logical chunks of code. In practice, the physical act of writing code may occur during just one of these steps. Making these considerations at a macro level means avoiding these steps across large bodies of code as refactoring tasks later on, keeping the quality of our software consistently high.