Beyond 15,000 queries per second with geomarketing and Kuzzle

In this article, you will discover one of the features of our backend: real-time geofencing. First, we will look at what geofencing is and what it can be used for, then we will measure Kuzzle's performance on a concrete use case: geomarketing.

What is geofencing?

Geofencing is a technique that consists of registering geographical areas so that they can then trigger notifications when a document associated with geographical coordinates enters or leaves an area.
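To make the idea of entering and leaving an area concrete, here is a minimal Python sketch. It is only an illustration and not how Kuzzle implements geofencing: the Geofence class, its update method, and the rectangular areas are hypothetical names and simplifications chosen for brevity.

```python
from typing import Dict, List, Set, Tuple

# A rectangular area: (min_lon, min_lat, max_lon, max_lat).
# Rectangles are an assumption made to keep the sketch short.
Area = Tuple[float, float, float, float]


class Geofence:
    """Remembers which areas contain each document and reports changes."""

    def __init__(self, areas: Dict[str, Area]) -> None:
        self.areas = areas
        # document id -> names of the areas that currently contain it
        self.inside: Dict[str, Set[str]] = {}

    def update(self, doc_id: str, lon: float, lat: float) -> List[Tuple[str, str]]:
        """Record a new position and return the enter/leave events it causes."""
        now = {
            name
            for name, (x0, y0, x1, y1) in self.areas.items()
            if x0 <= lon <= x1 and y0 <= lat <= y1
        }
        before = self.inside.get(doc_id, set())
        events = [("enter", name) for name in sorted(now - before)]
        events += [("leave", name) for name in sorted(before - now)]
        self.inside[doc_id] = now
        return events


fence = Geofence({"downtown": (0.0, 0.0, 10.0, 10.0)})
print(fence.update("truck-1", 5.0, 5.0))   # [('enter', 'downtown')]
print(fence.update("truck-1", 6.0, 6.0))   # [] (still inside, nothing to notify)
print(fence.update("truck-1", 20.0, 5.0))  # [('leave', 'downtown')]
```

A notification is produced only when the set of containing areas changes, which is what distinguishes geofencing from a plain "where is this document" lookup.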

It has many uses, from managing fleets of vehicles such as trucks or taxis to locating stocks of goods or delivering geolocated advertising.

Real-time geomarketing, the holy grail of advertising

Geomarketing consists of delivering ads according to geographical location. A territory is divided into geographical areas, each corresponding to an ad or category of ads to show to consumers in the vicinity.

This can represent several hundred thousand different areas and tens of thousands of tests to be performed per second.

In addition, in real-time bidding it is essential to obtain very low latencies so that ads are delivered to the consumer quickly.

Today, although many publishers offer off-the-shelf solutions for geomarketing, none of these solutions can achieve an acceptable level of performance for real-time bidding.

Here at Kuzzle we are always curious to know how our product behaves in an environment involving large volumes of data and low response times.

The challenges of real-time geomarketing seemed like an interesting basis for some performance tests of our solution.

Implementation of real-time geomarketing with a Kuzzle plugin

Kuzzle is an open source backend offering advanced features including real-time geofencing. It provides the functionality needed for application development (authentication, persistence, API) while delivering very good performance at a reduced cost.

Using it makes a high level of application customization possible while reducing backend development costs.

It is built on proven modern technologies: Node.js, Elasticsearch and Redis.

We have therefore developed a real-time geomarketing plugin to run load tests and see how our product handles high-performance workloads.

This plugin extends the Kuzzle API by allowing documents to be associated with geofenced polygons. When a GPS coordinate is sent to Kuzzle, a test is performed to determine if it is included within one or more of these polygons. If this is the case, the corresponding documents are returned to the client.
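To illustrate the inclusion test, here is a minimal Python sketch of a point-in-polygon check using the classic ray-casting technique. It is an illustration only: this article does not describe the algorithm the plugin actually uses, the names point_in_polygon, matching_documents and fences are hypothetical, and coordinates are treated as points on a flat plane, which is an approximation for real GPS data.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (longitude, latitude)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count how many edges a ray going right from the point crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means "inside"
    return inside


def matching_documents(point: Point, fences: Dict[str, List[Point]]) -> List[str]:
    """Return the ids of all documents whose polygon contains the point."""
    return [doc_id for doc_id, poly in fences.items() if point_in_polygon(point, poly)]


# Two overlapping six-sided polygons, each associated with a document id.
fences = {
    "ad-A": [(2, 0), (1, 2), (-1, 2), (-2, 0), (-1, -2), (1, -2)],
    "ad-B": [(5, 0), (4, 2), (2, 2), (1, 0), (2, -2), (4, -2)],
}

print(matching_documents((1.5, 0.5), fences))   # ['ad-A', 'ad-B']
print(matching_documents((-1.5, 0.5), fences))  # ['ad-A']
print(matching_documents((10.0, 0.5), fences))  # []
```

This sketch scans every polygon for every point, which would be far too slow for hundreds of thousands of areas; a production system would normally narrow down the candidates with a spatial index before running the exact test.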

For our demonstration, we will register 300,000 six-sided polygons, which may overlap.

Kuzzle geofencing is accurate to a distance of less than one meter.

Of course each request will be authenticated through our permissions system.

Gatling, a load test framework

Performance tests are run with Gatling, a tool written in Scala that evaluates how applications perform in response to a large number of requests.


Gatling natively provides an implementation of the WebSocket protocol, which we will use for our tests.

Gatling's benchmarks consist of 3 parts:

Definition of the communication protocol and its options
Definition of the test case
Definition of the arrival rate of virtual users

A test scenario is a sequence of actions executed by a virtual user.

Our test scenario will consist of several steps defined as follows:

Opening a WebSocket connection
Authentication to Kuzzle
Sending a geolocated document
Closing the connection

The third step, sending a geolocated document, will be replayed 2,000 times to simulate a large volume of data.

The arrival of new virtual users will be done gradually in order to be able to study the scalability curves and response times according to the number of users and requests per second.
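As a rough illustration of what a gradual arrival means in numbers, the following Python sketch computes an evenly spaced arrival schedule and the number of geolocated documents a run sends. It assumes a linear ramp-up, and the helper names arrival_times and total_documents are hypothetical; the 200 users over 20 seconds are the figures of the first scenario described later in this article.

```python
from typing import List


def arrival_times(users: int, ramp_seconds: float) -> List[float]:
    """Evenly spaced start times, in seconds, for a linear ramp-up."""
    interval = ramp_seconds / users
    return [i * interval for i in range(users)]


def total_documents(users: int, repeats: int = 2000) -> int:
    """Each virtual user replays the 'send a geolocated document' step `repeats` times."""
    return users * repeats


starts = arrival_times(users=200, ramp_seconds=20)
print(f"{len(starts)} users, one new user every {20 / 200} s")
print(f"{total_documents(200)} geolocated documents sent in total")
```

With these figures a new virtual user connects every tenth of a second, and the whole run sends 400,000 documents, so the load keeps growing until the last user has arrived.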

Now that we have our plugin and our benchmark scenario, we're going to need a Kuzzle cluster!

Kuzzle cluster mode

When we developed Kuzzle, we of course paid particular attention to the overall performance of our backend.

However, in the current context of an ever-increasing number of Internet users and the arrival of massive volumes of data, particularly through the IoT, it is necessary to be able to ensure horizontal scalability of infrastructures.

It is with this in mind that we have developed a plugin that activates a masterless cluster mode for Kuzzle. This plugin distributes the load over several instances of Kuzzle to allow it to support several tens of thousands of requests per second.

For their part, Elasticsearch and Redis are products known for their native cluster modes and their ability to manage very large volumes of data.


In a Kuzzle cluster, nodes communicate with each other and discover new nodes automatically. Thanks to this system, it is possible to add as many nodes as necessary to support the load without service interruption.

For our tests, the cluster will be deployed in an AWS environment, but Kuzzle can be hosted on any infrastructure, whether a private cloud, a public cloud, or your own servers.

Reaching 15,000 requests per second

For this performance test, we will use a cluster of 2 to 4 Kuzzle nodes.

The selected instances are the smallest of Amazon EC2's m5 family, m5.large instances with 2 vCPUs and 8 GB of RAM.

When performing performance tests, it is very important that the server hosting Gatling is as powerful as possible so that it is not the limiting factor of the test.

We therefore chose a compute-optimized instance from the Amazon EC2 c5 family, a c5.2xlarge instance with 8 vCPUs and 16 GB of RAM.

The first scenario consists of 200 users arriving in 20 seconds on a 2-node cluster.

At peak load, Kuzzle was able to respond to 7,000 requests per second with a response time of less than 19 ms for 75% of the requests.

The second scenario simulates the arrival of 200 users in 10 seconds on a 3-node cluster.

This time, Kuzzle was able to respond to a maximum of 10,000 requests per second with a response time of less than 16 ms for 75% of the requests.

The last scenario simulates the arrival of 400 users in 20 seconds on a 4-node cluster.

Kuzzle was able to support more than 15,000 requests per second with a response time of less than 15 ms for 50% of the requests.

We observe that Kuzzle is able to hold a high load on a use case as complex as real-time geofencing on large volumes.

In addition, the results show that the number of requests per second Kuzzle can handle increases roughly linearly with the number of nodes in the cluster. So to manage larger volumes of data, just add more nodes!
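A quick way to check the scaling claim is to divide the peak throughput of each scenario by the number of nodes. The short Python sketch below does just that with the figures reported above; note that the 4-node figure is a lower bound ("more than 15,000"), so the last value is approximate.

```python
# (nodes, peak requests per second) as reported in the three scenarios above
results = [(2, 7_000), (3, 10_000), (4, 15_000)]

per_node = {nodes: rps / nodes for nodes, rps in results}

for nodes, rps in results:
    print(f"{nodes} nodes: {rps} req/s total, about {per_node[nodes]:.0f} req/s per node")
# 2 nodes: 7000 req/s total, about 3500 req/s per node
# 3 nodes: 10000 req/s total, about 3333 req/s per node
# 4 nodes: 15000 req/s total, about 3750 req/s per node
```

Throughput per node stays within a narrow band of roughly 3,300 to 3,750 requests per second, which is what we would expect if adding a node adds a near-constant amount of capacity.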

All the code used in this benchmark, as well as the detailed reports, is of course available on GitHub.

Writing a custom protocol and modifying authentication to improve performance

In the team, even though we are quite proud of these results, we can't help but wonder how we can do even better.

As we know our product very well, we have already thought about some parts of our infrastructure that could be improved to achieve even better performance.

First of all, the authentication performed on each request is unnecessary once the WebSocket connection has been authenticated a first time.

Expect to see this improvement in one of our next releases ;)

The biggest performance bottleneck in this test is the use of the WebSocket protocol.

In Kuzzle, this protocol is used in text mode. Not only does text take up more space than binary data, but JSON serialization is also an expensive operation.

Using a binary protocol directly on top of TCP or UDP, for example one based on Protocol Buffers or CoAP, would allow radically better performance.
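To give a feel for the size difference between a text and a binary encoding, here is a small Python sketch that encodes the same coordinate pair both ways. It is only an illustration: the payload shape and field names are hypothetical, this is not Kuzzle's wire format, and a real request carries more than two numbers.

```python
import json
import struct

# Hypothetical payload: the field names are assumptions for illustration.
position = {"lat": 43.6108, "lon": 3.8767}

# Text encoding: JSON serialized to UTF-8, as sent in a text-mode frame.
text_frame = json.dumps(position).encode("utf-8")

# Binary encoding: two little-endian 64-bit floats, with no field names.
binary_frame = struct.pack("<dd", position["lat"], position["lon"])

print(len(text_frame), "bytes as JSON text")
print(len(binary_frame), "bytes as two packed doubles")  # 16 bytes

# Decoding the binary frame is a fixed-size unpack rather than a text parse.
lat, lon = struct.unpack("<dd", binary_frame)
```

The binary frame is smaller because the field names and the decimal digits are not sent, and both sides skip text parsing. The trade-off is that the layout must be agreed on in advance, which is exactly what a schema-based format such as Protocol Buffers formalizes.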

The creation and use of such a protocol is possible thanks to the multiprotocol management offered by Kuzzle.

I hope you enjoyed this article and that the performance achieved by Kuzzle impressed you as much as it impressed us!

Don't forget that we remain available for any questions, via the chat on our website as well as by email ;-)