Ilya Volodarsky

Co-founder @ Segment.io

Batching REST APIs

At Segment.io, we’re currently building REST clients for popular languages, so that our users can consume our ingestion and query APIs server-side.

Our import API also doubles as a batching endpoint, so you can send us more than one event per HTTP request.

Batching is incredibly important in server-side environments. This is because high-performance applications, like web servers, do not have the time to make one or more outbound HTTP requests during every operation.

Putting it into context

Let’s try an example. Say you’re a music site with 30,000,000 registered users. You’re popular, so about 2,500,000 people come to listen to your music every day. If these 2,500,000 people listen to an average of 5 songs during their visit, that means your song streaming API endpoint will have to process about 12,500,000 requests today. That’s about 145 requests per second if spread evenly during the day, but realistically it’s probably 250 RPS at peak hours (if not more).

There is also probably a load balancer distributing the load across N computers. Let’s say there are 10 computers, each serving 25 RPS at peak load. Each request has to respond with the audio stream in a reasonable amount of time, let’s say 300 ms. During this time, the server will be doing a bunch of network IO, including talking to a session store, retrieving the audio stream from the song database, and framing the response.

Now, let’s say you want to discover why people are cancelling your paid subscription. To figure this out, you’ll definitely want to compare the listening behavior of the people who stayed versus those who didn’t. You might find that paying users who listen to fewer than 10 songs a month have a much higher churn rate than those who listen to 10 or more.

To find this information, you might want to report an event to an analytics provider every time a song is played. A fast API will be able to return a response in somewhere between 50 ms and 300 ms, depending on the RTT between data centers. And if we’re both in us-east-1a (I wouldn’t be surprised), then life is looking up.

Variables:

Peak RPS / instance: 25 requests/sec
Maximum Allowed Request Duration: 300 ms
Average Analytics Request Duration: 150 ms

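With only 300 ms allowed per request, a blocking 150 ms analytics call would eat half of the budget on every single request. It becomes clear that the HTTP controllers can’t afford to make a blocking outbound HTTP request during every operation. The same holds for back-end ingestion processes.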
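To make the arithmetic above concrete, here is a small Python sketch that recomputes the numbers from the example. The variable names are my own, and the "blocked seconds" figure assumes every request makes exactly one blocking analytics call of average duration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Traffic figures from the music site example.
daily_visitors = 2_500_000
songs_per_visit = 5
requests_per_day = daily_visitors * songs_per_visit

# Average load if requests were spread evenly across the day.
average_rps = requests_per_day / SECONDS_PER_DAY

# The variables listed above.
peak_rps_per_instance = 25
max_request_ms = 300
analytics_request_ms = 150

# Share of the per-request latency budget used by one blocking analytics call.
budget_fraction = analytics_request_ms / max_request_ms

# Time each instance would spend waiting on the analytics API per second of
# wall-clock time at peak (assumes one blocking call per request).
blocked_seconds_per_second = peak_rps_per_instance * analytics_request_ms / 1000

print(f"requests per day:       {requests_per_day:,}")
print(f"average RPS:            {average_rps:.0f}")
print(f"latency budget used:    {budget_fraction:.0%}")
print(f"blocked seconds/second: {blocked_seconds_per_second}")
```

In other words, at peak each instance would need nearly four request threads doing nothing but waiting on the analytics provider.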

It’s surprising to discover so many REST services out there that don’t offer batching, either in their API or in their clients.

What to Do

  1. Enqueue the task in RabbitMQ / Kestrel / DelayedJob / Resque, and have a separate instance process the requests.

  2. Have the REST client batch the tasks in memory, using another thread to flush them.

Both of these options are decent. Here are some of the tradeoffs:


RabbitMQ / Kestrel / DelayedJob / Resque

Pros:

  • off-load processing to another instance altogether

Cons:

  • requires additional cluster resources:
      • an external queue service like RMQ, Kestrel, Kafka, or ActiveMQ
      • another processing instance
  • requires developer time to implement
  • requires maintaining additional computers

REST Request Batching

Pros:

  • does not require any additional cluster resources
  • plug-and-play, maintained by REST client developer
  • does not require network IO to queue message

Cons:

  • requires memory for queue
  • implementation must be thread-safe
  • implementation must be resource-constrained, so that traffic spikes don’t exhaust memory

At Segment.io, we’re taking option 2 for our official REST clients. Our reasoning is that we want integration to be as easy as possible. Part of that is not requiring our customers to write and support extra infrastructure to consume our service.

In doing so, we’re going to be careful to make it: 1) thread-safe, 2) resource-constrained, and 3) performant.
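To give a feel for what option 2 can look like, here is a minimal Python sketch of an in-memory batching client. This is not our actual client code: the class name `BatchingClient`, the `send_batch` callable, and all of the defaults are hypothetical, picked only for illustration. The bounded `queue.Queue` covers thread safety and the memory cap, and a single background thread does the flushing so that callers never wait on the network.

```python
import queue
import threading


class BatchingClient:
    """Illustrative in-memory batcher (hypothetical, not a real client API)."""

    def __init__(self, send_batch, max_queue_size=10_000, batch_size=50,
                 flush_interval=1.0):
        # send_batch is an assumed callable that takes a list of events and
        # performs the single outbound HTTP request for that batch.
        self._send_batch = send_batch
        # A bounded queue caps memory use during traffic spikes.
        self._queue = queue.Queue(maxsize=max_queue_size)
        self._batch_size = batch_size
        self._flush_interval = flush_interval
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def track(self, event):
        """Enqueue an event without blocking. Returns False if it was dropped."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            return False  # queue is full: drop rather than grow without bound

    def _next_batch(self):
        """Wait up to flush_interval for one event, then take what is ready."""
        batch = []
        try:
            batch.append(self._queue.get(timeout=self._flush_interval))
            while len(batch) < self._batch_size:
                batch.append(self._queue.get_nowait())
        except queue.Empty:
            pass
        return batch

    def _run(self):
        # Keep flushing until close() was called and the queue is drained.
        while not (self._stop.is_set() and self._queue.empty()):
            batch = self._next_batch()
            if batch:
                try:
                    self._send_batch(batch)
                except Exception:
                    pass  # a real client would retry or log here

    def close(self):
        """Flush what is already queued and stop the background thread."""
        self._stop.set()
        self._worker.join()
```

A request handler would call `client.track({"event": "song_played"})` and return immediately, while `close()` is called once at shutdown. Dropping events when the queue is full is just one possible policy. It trades completeness of the analytics data for keeping the web server healthy under load.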

Takeaways

  1. High-bandwidth REST APIs should support batch operations.
  2. REST clients should use those batch operations.
  3. High-performance systems should not make blocking HTTP calls.

In Part 2, I discuss the requirements we came up with while designing our REST clients.