What happens when your business starts to run at a scale where traditional tools and techniques no longer offer acceptable performance in terms of scalability and time to market?
bet365’s answer: create a specialist R&D team.
bet365 has been growing rapidly since our inception in 2001. A key driver of our phenomenal growth is our single-minded use of technology to delight our customers. With this in mind, we have invested heavily in the latest network, cloud and server technologies.
As a global company, we have systems spread throughout the world. We have our own data centres at key locations and run a private cloud over our own dark fibre network.
Until recently, the technologies we used to develop and operate our software stack were pragmatic choices. We used the .NET platform for content delivery and core systems, SQL Server for our main production databases, and Java for custom middleware applications.
For a while, these platforms proved very satisfactory. Technical teams delivered great functionality within timescales that satisfied the growing demands of the business. Customers loved the results, and both our user base and our concurrent user numbers just kept on growing (circa 2 million concurrent users on a Saturday afternoon).
However, we saw trouble looming on the horizon.
For years, we techies could rely on Moore’s Law to deliver the faster computers we needed to support our ever-expanding user bases using single-threaded software.
Scaling was relatively easy until 2004, when Intel shocked the world by canning its Tejas architecture and instead adopting a multi-core approach to improving CPU performance. Suddenly, to support more users, we had to learn to write multi-threaded code or shard our applications over ever-increasing numbers of servers.
For a number of years we used these techniques and relied on the scalability of our traditional relational databases to support our growth. As time progressed, we found these approaches to be a substantial cause of complexity, and even the biggest databases started to run out of steam.
More recently, the complexity of our systems increased further as we responded to regulatory pressure mandating that data and functionality be heavily customised for many of the territories in which we operate.
A perfect storm was brewing.
With the complexity of our systems growing on so many fronts, even our development teams feared that they would struggle to keep up with our pace of delivery and innovation.
This was clearly unacceptable, so bet365 invested in a specialist R&D team to explore new ways of delivering software, seek out relevant technology and boldly deliver where no one in the industry has delivered before.
Our R&D team is tasked with exploring the entire technical landscape to identify, prove and adopt the best techniques and tools to solve our very complex scalability and business problems.
As you will see in this blog over the coming months, this does not mean that we focus purely on the latest and greatest technology (OK, sometimes we do :-)). Instead, we prefer to repurpose existing, mature tech that has been proven in other areas and apply it to our domain.
The R&D team is made up of people from a wide variety of backgrounds. Some have been with the business for years; others are new to the company and bring a fresh perspective to our problems.
We’ve got hardcore programmers, debaters, freelancers who have turned from the dark side and ex-entrepreneurs. We’ve got C programmers, .NET programmers, Java programmers, and experts in operating systems, messaging and databases.
We value creative thinking and believe that the only way to truly learn is to make lots of mistakes. We don’t want to be precious about technology; we just want to find the best solutions for the problems at hand and continue to keep our customers very happy.
One major area of our research has been selecting a new programming platform for a number of our highly concurrent systems. Over the years we have found problems with the traditional Java and .NET platforms because they are slow to develop in for what we want to achieve.
As a consequence, we’ve been getting very excited about functional programming and in particular the Erlang programming language. Erlang is not new. It was developed at Ericsson, starting in the late 1980s, and went on to run their AXD range of telephone exchanges. It just turns out that the computer scientists at Ericsson were trying to solve problems back then that a lot of modern systems have today.
This is a great example of how you can take a mature technology from one domain and re-use it in another.
We love the simplicity and compactness of the language. Its actor-style concurrency, immutability, OTP libraries and reliability semantics make it far easier to construct massively parallel systems that, in our experience, just work. In a future post we’ll go into detail about our experiences with .NET and Java and why we are moving away from these for some of our use cases.
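For readers who have not met the actor style before, here is a rough sketch of the idea. It is written in Python rather than Erlang, purely for illustration, and it is not our production code: the names `OddsUpdate`, `SubscriberActor` and `publish` are made up for this example. Each actor keeps its own private state and is reached only through its mailbox, and the messages themselves are immutable. Python threads are far heavier than Erlang processes, and nothing here resembles OTP supervision, so treat it as a picture of the programming model only.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a message cannot be changed once created
class OddsUpdate:
    event: str
    price: float


STOP = object()  # sentinel message asking an actor to shut down


class SubscriberActor(threading.Thread):
    """Hypothetical actor: private state, reachable only via its mailbox."""

    def __init__(self, name):
        super().__init__(name=name)
        self.mailbox = queue.Queue()
        self.latest = {}  # only ever modified by this actor's own thread

    def send(self, message):
        # Other threads never touch self.latest; they just post a message.
        self.mailbox.put(message)

    def run(self):
        # Process one message at a time, in arrival order.
        while True:
            message = self.mailbox.get()
            if message is STOP:
                break
            self.latest[message.event] = message.price


def publish(actors, update):
    """Fan the same immutable update out to every actor's mailbox."""
    for actor in actors:
        actor.send(update)


actors = [SubscriberActor(f"subscriber-{i}") for i in range(3)]
for actor in actors:
    actor.start()

publish(actors, OddsUpdate("match-1", 2.5))
publish(actors, OddsUpdate("match-1", 2.75))

for actor in actors:
    actor.send(STOP)
    actor.join()

# Safe to read now: every actor has finished.
results = [actor.latest for actor in actors]
```

Because no state is shared and messages cannot be modified after they are sent, there are no locks to get wrong. That is the property we find so attractive, and Erlang gives it to you as part of the language rather than as a convention you have to police.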
Our first Erlang systems have recently entered production as part of our “InPlay” offering, where we push the live odds of hundreds of thousands of sporting events a year to millions of customers in near real time so that they can bet on events as they happen. We found Erlang to be a much better fit than the Java-based solution that we previously used.
We’re now able to handle more concurrent users with greater reliability and flexibility and lower latency, all with less code.
Everyone’s a winner!
Our first technical blog post will describe the process that we went through to select a new programming language for this use case, prove its worth and replace the incumbent Java solution. We’re planning to use this blog to share our more interesting findings and give you an insight into the technology behind the world’s busiest gaming website. We’ve learned a lot of valuable lessons and want to share them with you, so watch this space: it’s going to be fun!