How to maintain control over API rate limits

Relying on third-party services for a website can often be a risk. Take, for example, the number of websites impacted when Cloudflare has a wobble. But no matter the benefits of an entirely self-sufficient system, sometimes using a third-party service is the only sensible choice.

On a recent project we utilised a third-party API to determine the country of users based on their IP address. The point was to offer the user a choice to visit a translated version of the website if their location didn’t match the language of the page they landed on.
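As a rough sketch of how that check might look in TypeScript (the lookup endpoint, response shape and country-to-language mapping below are placeholders rather than the actual service we used):

```typescript
// Look up the visitor's country by IP and suggest a translated page when it
// doesn't match the language of the page they landed on.
// NOTE: geo.example.com and the response shape are hypothetical placeholders.
type GeoResponse = { country_code: string };

// Which site language we'd expect for a given country (illustrative only).
const countryToLanguage: Record<string, string> = {
  GB: "en",
  FR: "fr",
  DE: "de",
};

async function suggestTranslation(
  ip: string,
  pageLanguage: string
): Promise<string | null> {
  const res = await fetch(`https://geo.example.com/lookup?ip=${encodeURIComponent(ip)}`);
  if (!res.ok) return null; // fail quietly: the feature is a nicety, not essential

  const geo = (await res.json()) as GeoResponse;
  const expected = countryToLanguage[geo.country_code];

  // Only suggest a switch when the visitor's location implies a different language.
  if (expected && expected !== pageLanguage) return expected;
  return null;
}
```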

Why an API?

The reason we chose an API over the available alternatives was the accuracy of the data, and that the implementation would be the least disruptive to the user.

Downloading a database of IP locations is often the cheaper alternative, but it quickly becomes dated and inaccurate. And while the HTML5 Geolocation API offers accuracy, it relies on the user granting permission through a browser prompt. Neither is ideal.
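For comparison, here's what the Geolocation API route looks like; the call itself is simple, but nothing happens until the user accepts the browser's permission prompt:

```typescript
// The HTML5 Geolocation API: accurate, but the call below triggers a browser
// permission prompt that the user must accept before we get any coordinates.
function requestBrowserLocation(): void {
  if (!("geolocation" in navigator)) return;

  navigator.geolocation.getCurrentPosition(
    (position) => {
      // Only reached once the user has granted permission.
      console.log(position.coords.latitude, position.coords.longitude);
    },
    (error) => {
      // A denied or dismissed prompt lands here, leaving us with nothing.
      console.warn("Geolocation unavailable:", error.message);
    }
  );
}
```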

The downside of an API is normally that the provider charges for the service, caps the number of times you can access the API in a given period (known as rate limiting), and charges more as your usage grows.

Rate Limiting

In this particular instance we decided to utilise a service that provided free access for up to 1000 API calls per day. With a little cookie caching of the user’s location, we expected that we’d fall well within this limit; at least until the business grew further.
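The caching itself is straightforward: only call the API when we don't already have the visitor's country stored in a cookie. A sketch of the idea (the cookie name, expiry and endpoint are illustrative):

```typescript
// Only call the geolocation API when the visitor's country isn't already
// cached in a cookie. Cookie name, expiry and endpoint are illustrative.
const COUNTRY_COOKIE = "visitor_country";

function readCookie(name: string): string | null {
  const match = document.cookie.match(new RegExp(`(?:^|; )${name}=([^;]*)`));
  return match ? decodeURIComponent(match[1]) : null;
}

async function getVisitorCountry(): Promise<string | null> {
  const cached = readCookie(COUNTRY_COOKIE);
  if (cached) return cached; // no API call, no dent in the daily limit

  const res = await fetch("https://geo.example.com/lookup"); // hypothetical endpoint
  if (!res.ok) return null;

  const { country_code } = (await res.json()) as { country_code: string };

  // Cache for 30 days so repeat visits don't consume further API calls.
  const maxAge = 60 * 60 * 24 * 30;
  document.cookie = `${COUNTRY_COOKIE}=${encodeURIComponent(country_code)}; max-age=${maxAge}; path=/`;
  return country_code;
}
```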

Yet within days we were being warned that we were approaching, and then exceeding, our rate limit; we had clearly underestimated our usage.
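Providers signal this in different ways, but a common pattern is an HTTP 429 response once you've hit the limit, sometimes alongside quota headers. A rough sketch (the header name varies by provider, so treat it as an example rather than a standard):

```typescript
// Many providers return HTTP 429 once you've exceeded your rate limit, and
// some expose remaining-quota headers. The header name below is an example;
// check your provider's documentation for the exact one it uses.
async function lookupWithQuotaCheck(url: string): Promise<unknown | null> {
  const res = await fetch(url);

  if (res.status === 429) {
    console.warn("Rate limit exceeded; skipping lookups until the quota resets.");
    return null;
  }

  const remaining = res.headers.get("X-RateLimit-Remaining");
  if (remaining !== null && Number(remaining) < 50) {
    console.warn(`Only ${remaining} API calls left in the current window.`);
  }

  return res.ok ? res.json() : null;
}
```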

The easy choice would then be to spend the money and upgrade to a paid tier; increasing our rate limit. But as Yorkshire folk we’re not about to pay for something that we might not need.

Tracking

Before we did anything that would cost more money, even our client's money, we wanted to see for ourselves how many times we actually accessed the API, to make sure we were using our rate limit efficiently.

We were aware that websites often get visited by various bots, spiders, crawlers and scrapers, and that these might account for some of the API usage. To best identify these, we determined that storing the user agent, IP address and date/time of each request was all that was required.
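Something along these lines is all the tracking needs to be; the log file and CSV layout below are illustrative, and any datastore would do:

```typescript
// Record the user agent, IP address and timestamp of every request that
// would trigger an API call. The log file and CSV layout are illustrative.
import { appendFileSync } from "node:fs";

function trackApiCall(userAgent: string, ip: string): void {
  const timestamp = new Date().toISOString();

  // Quote the user agent, as it usually contains commas of its own.
  const safeAgent = `"${userAgent.replace(/"/g, '""')}"`;

  appendFileSync("api-usage.log", `${timestamp},${ip},${safeAgent}\n`);
}
```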

What we hoped to determine from this data was how much of our API usage came from genuine visitors, and how much from the bots, spiders, crawlers and scrapers mentioned above.

The Results

Unfortunately, our data is a little incomplete. As soon as we activated the tracking, we set to work excluding specific user agents as these instantly appeared to generate more traffic than normal users.

As interesting as a full, unfiltered day of traffic data would have been, gathering it would have meant yet another day on which the rate limit was exceeded and a useful feature wasn't available to users.

Based on what data we gathered or observed, we know this:

Disclaimer

If this wasn’t already clear, when we’re matching the user-agent and excluding specific bots, we’re only stopping them accessing the API, and the feature which relies on it.

They can still navigate the rest of the website perfectly well; entirely unencumbered. The feature we use the API for is only ever useful to human users and exists in a slightly different form elsewhere on the website.
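To illustrate, here's roughly how the exclusion sits in front of the API call; the bot patterns are common crawler identifiers rather than our exact list, and the lookup endpoint is a placeholder:

```typescript
// Check the request against known bot signatures before spending an API call.
// The patterns are common crawler identifiers, not the exact list we used,
// and geo.example.com is a placeholder endpoint.
const BOT_PATTERNS: RegExp[] = [
  /googlebot/i,
  /bingbot/i,
  /ahrefsbot/i,
  /semrushbot/i,
  /crawler|spider|scraper/i,
];

function isLikelyBot(userAgent: string): boolean {
  return BOT_PATTERNS.some((pattern) => pattern.test(userAgent));
}

async function maybeLookupCountry(userAgent: string, ip: string): Promise<string | null> {
  // Bots simply skip the lookup; the rest of the site behaves as normal for them.
  if (isLikelyBot(userAgent)) return null;

  const res = await fetch(`https://geo.example.com/lookup?ip=${encodeURIComponent(ip)}`);
  if (!res.ok) return null;

  return ((await res.json()) as { country_code: string }).country_code;
}
```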

Why is this important?

In this instance the difference between the free and paid API isn't all that much per year. Given the figures above, we'd expect that even without exclusions we'd run well within the limit of the first paid tier; that is, unless we're far underestimating the traffic generated by these excluded bots.

But at a much larger scale, the costs could be far more significant. Not all companies have deep pockets and having an awareness of how bots might be affecting your finances might just save you a bob or two.

But what if you scale things up to a much larger website, with more users? Consider if Amazon were to use a similar service. Of course, they would have far more users and so would have to pay for their usage. But with greater market share comes greater attention from scrapers, crawlers and bots; with far more traffic from these than a small website might see.

Is any business willing to waste profits paying for a service where a chunk of their allowance is taken by bots? Just how much could be saved by being mindful of the different types of traffic received, and reflecting on whether any changes are worthwhile as a result?

