Batch vs single APIs

The question is whether one should default to offering a batch/bulk API (scatter-gather) instead of asking clients to make parallel single-item calls.

Batch API pros:

  • Performance gains from batching: reduced network bandwidth and latency.
    • Records requested together may live in the same block of data the OS fetches, letting you exploit locality.
    • In a typical RPC processing path, many technical layers run before the actual application code: protocol parsing, authentication, authorization, metrics initialization/tracking, logging, configuration, experimentation, etc. Each of these has setup, processing, and teardown within every request-processing lifecycle, so if you invoke hundreds of single-record APIs to service one end-user request, these layers consume a significant share of your CPU cycles while the business logic itself stays tiny (see the back-of-envelope model after this list).
  • Gives the caller the choice of batching or sending items one at a time.
  • Micro-batching is a standard, well-accepted technique in both user-path APIs and non-user-path (data processing) APIs; a minimal sketch follows this list. The objections about instrumentation for monitoring, error handling, etc. can be addressed once in shared infrastructure rather than per feature API implementation.
  • In any kind of content-serving application (product listings, search results, friends/contacts, merchants, orders/transactions) where there are many records to fetch/format/serve and the UX can tolerate a degraded result set (partial results, fewer attributes per result, etc.), this type of batch API makes a lot of sense; a possible response shape is sketched after this list.
  • Most use cases are batch fetches; far fewer call for single-item fetch APIs. Single-item fetches usually appear in self-profile scenarios (my own profile page, my settings page), which tend to get far less traffic than pages that serve many records at once.
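
A back-of-envelope model of the per-request overhead point above. This is a minimal sketch with made-up numbers (OVERHEAD_MS and WORK_MS are illustrative assumptions, not measurements) just to show how the fixed per-RPC cost dominates when the per-record work is small.

```python
# Fixed per-RPC overhead (parsing, auth, metrics, logging setup/teardown)
# vs. actual per-record business logic. Numbers are illustrative only.
OVERHEAD_MS = 2.0
WORK_MS = 0.1

def single_calls_cost(n_records: int) -> float:
    """n separate RPCs: the fixed overhead is paid once per record."""
    return n_records * (OVERHEAD_MS + WORK_MS)

def batch_call_cost(n_records: int) -> float:
    """One batch RPC: the fixed overhead is paid once for the whole batch."""
    return OVERHEAD_MS + n_records * WORK_MS

for n in (1, 10, 100):
    print(f"n={n}: single={single_calls_cost(n):.1f} ms, batch={batch_call_cost(n):.1f} ms")
# At n=100: 210.0 ms vs 12.0 ms -- overhead dominates the single-call path.
```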
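
For the micro-batching point, here is a minimal asyncio sketch, assuming a hypothetical async `fetch_batch(keys) -> dict` stub (error propagation elided): callers keep a single-item interface while the batcher coalesces concurrent requests into one batch RPC per small time window.

```python
import asyncio

class MicroBatcher:
    """Coalesce single-item get() calls into one batch RPC per window."""

    def __init__(self, fetch_batch, window_ms: float = 5.0):
        self._fetch_batch = fetch_batch  # hypothetical: async (list[str]) -> dict[str, dict]
        self._window = window_ms / 1000.0
        self._pending: dict[str, asyncio.Future] = {}
        self._flusher: asyncio.Task | None = None

    async def get(self, key: str) -> dict:
        # Callers see a single-item API; batching happens underneath.
        loop = asyncio.get_running_loop()
        future = self._pending.setdefault(key, loop.create_future())
        if self._flusher is None:
            self._flusher = asyncio.create_task(self._flush_later())
        return await future

    async def _flush_later(self) -> None:
        await asyncio.sleep(self._window)  # accumulate callers for one window
        pending, self._pending, self._flusher = self._pending, {}, None
        results = await self._fetch_batch(list(pending))
        for key, future in pending.items():
            future.set_result(results.get(key))

# Usage: batcher = MicroBatcher(fetch_batch); profile = await batcher.get("user:123")
```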
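
And for the partial-results point, a sketch of one possible batch-fetch response shape (`BatchGetResponse` and its fields are hypothetical names, not any particular framework's types): the server reports what it found, what doesn't exist, and what failed, and the UX renders the subset it got.

```python
from dataclasses import dataclass, field

@dataclass
class BatchGetResponse:
    items: dict[str, dict]                            # key -> successfully fetched record
    missing: list[str] = field(default_factory=list)  # keys that don't exist
    failed: list[str] = field(default_factory=list)   # keys that errored; candidates for retry

def render_listing(response: BatchGetResponse) -> list[dict]:
    # UX tolerates degradation: render the records we got, drop the rest.
    return list(response.items.values())
```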

Batch API cons:

  • Batching complicates reporting partial failures, and the solutions tend to be ad hoc and to play poorly with general-purpose mechanisms like automatic retries (see the retry sketch after this list).
  • Batching complicates monitoring, since you can’t interpret most built-in RPC metrics without also considering the batch size. This requires you to introduce a number of custom metrics where otherwise the defaults would have sufficed.
  • Variable batch sizing introduces randomness into memory/CPU use, again complicating monitoring/provisioning.
  • In the limit of large data, two random keys will probably not map to the same underlying partition anyway, so the locality benefit above may be overstated. Non-random access patterns (like range scans) could probably be better served with purpose-built APIs.
  • Batch APIs introduce synchronization points, which can hurt overall latency. Consider this example: you need to fetch n items from service A and run them through service B. Each service has a mean per-key latency of 50 ms and a tail latency of 200 ms, with the tail driven by slow IO for specific keys regardless of batching. For moderate n, the batched path takes ~400 ms, since each batch call is gated by its slowest key (~200 ms) and the two calls run back to back; the unbatched path takes ~250 ms, since any given key is unlikely to be slow at both A and B (see the simulation after this list).
  • Puts more burden on clients to handle partial failures correctly, which can be error-prone.
  • Marginal gains in efficiency do not justify complicating the API surface.
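
A small Monte Carlo sketch of the synchronization-point example above. The constants are illustrative assumptions (5% of keys hit a 200 ms slow path, 50 ms otherwise); with them, the batched mean comes out around 1.4x the unbatched mean, in line with the ~400 ms vs ~250 ms estimate.

```python
import random

TRIALS, N_KEYS, P_SLOW = 10_000, 50, 0.05

def key_latency() -> float:
    """Per-key latency at one service: usually fast, occasionally slow IO."""
    return 200.0 if random.random() < P_SLOW else 50.0

batched_total, unbatched_total = 0.0, 0.0
for _ in range(TRIALS):
    a = [key_latency() for _ in range(N_KEYS)]  # per-key latency at service A
    b = [key_latency() for _ in range(N_KEYS)]  # per-key latency at service B
    # Batched: one call to A for all keys, wait, then one call to B.
    # Each batch call is gated by its slowest key.
    batched_total += max(a) + max(b)
    # Unbatched: each key flows A -> B independently, in parallel across
    # keys; the response waits only for the slowest single chain.
    unbatched_total += max(ai + bi for ai, bi in zip(a, b))

print(f"batched   mean ~{batched_total / TRIALS:.0f} ms")
print(f"unbatched mean ~{unbatched_total / TRIALS:.0f} ms")
```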
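
Finally, a sketch of the client-side retry burden: generic "retry the whole RPC" machinery would re-fetch keys that already succeeded, so clients end up writing bespoke per-key logic like the hypothetical helper below (`fetch_batch` is an assumed client stub returning the `BatchGetResponse` shape sketched earlier).

```python
def fetch_with_retries(keys: list[str], fetch_batch, max_attempts: int = 3) -> dict:
    """Retry only the failed subset of a batch fetch, up to max_attempts."""
    results: dict[str, dict] = {}
    pending = list(keys)
    for _ in range(max_attempts):
        if not pending:
            break
        response = fetch_batch(pending)  # hypothetical stub -> BatchGetResponse
        results.update(response.items)
        pending = list(response.failed)  # only the failed keys are worth retrying
    # The caller must still decide what a partially filled result set means.
    return results
```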