How HTTP caching of static assets works out of the box
When someone visits a page on your website with plenty of images, fetching all the resources over the network can be expensive, especially if you do not serve resized images or the network is slow. When an asset is displayed on several pages, or the visitor returns to your website later, everyone benefits if the browser can reuse a previously fetched file instead of requesting a new one:
- Decrease the loading time experienced by your client
- Reduce server load
- Save bandwidth (for both client and server)
How does my Laravel application provide caching?
These days, I still develop Laravel applications the old way: hosted on a VPS and served by a pretty standard nginx setup. I suspected my visitors benefited from some caching of static assets, but I had no clue how it worked. As the Laravel framework handles a lot of things for me, I first suspected it handled this too. It does have some caching capabilities, but after some research I found nothing related to static files that is set up by default.
Nginx default behavior
I identified the headers involved with the help of the browser dev tools. Unfortunately, there isn’t a single API for HTTP caching. It’s a collection of different mechanisms, and you can mix them together in multiple ways. Thanks to some great articles, I quickly found out which ones were involved, and that everything was working out of the box with the default nginx configuration and a modern browser.
Here is a breakdown of what I see in the network tab for a JPEG file requested twice with nginx and Firefox 84. At first it looked pretty complicated, but it turns out to be quite simple.
1. The client visits a page on my website that needs an image asset. The browser sends a request for the file with just one specific header, Cache-Control: no-cache, telling the server it has no cache for this file yet.
2. The server sends a response with a status 200 and the whole file. It also adds two headers:
- etag: a unique identifier generated by nginx for a specific version of the file. If the file changes, a new value is generated.
- last-modified: a timestamp in HTTP-date format (not ISO 8601), taken by nginx from the file’s “last modified” information in the Linux filesystem. For example:
last-modified: Sun, 20 Dec 2020 23:00:02 GMT
The browser detects those response headers and stores the received file in its cache along with the given information (etag and timestamp).
It is important to notice that the server doesn’t send any
3. The client visits a new page on the website that needs the same asset. The browser now finds it in its cache. It still sends a request to the server, but this time it adds these headers:
- Cache-Control is now max-age=0 instead of no-cache. It is a way for the client to ask the server for an end-to-end revalidation (to do so it needs at least one of the other headers).
- If-Modified-Since, set to the previously received value of last-modified.
- If-None-Match, set to the previously received value of etag.
If-Modified-Since: Sun, 20 Dec 2020 23:00:02 GMT
4. Nginx checks the file against the given information. As the file did not change (it still has the same etag and last-modified timestamp), it does not send a status 200 response with the whole file this time. Instead, it sends a status 304 response with no file in the payload. It adds these two headers to the response to confirm the state of the file:
last-modified: Sun, 20 Dec 2020 23:00:02 GMT
5. The browser can now load the file from its cache.
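To convince myself I understood the five steps above, I find it helpful to replay them in a few lines of Python. This is only a sketch of the general HTTP revalidation logic, not nginx's actual implementation: `StoredFile`, `serve` and the etag value `"abc123"` are hypothetical names and values made up for the illustration, and the date check is simplified to a plain string comparison.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class StoredFile:
    """Hypothetical stand-in for a static file as the server sees it."""
    body: bytes
    etag: str
    last_modified: str  # HTTP-date string


def serve(file: StoredFile, request_headers: Dict[str, str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, response headers, body) for a GET on a static file."""
    validators = {"ETag": file.etag, "Last-Modified": file.last_modified}
    if_none_match = request_headers.get("If-None-Match")
    if_modified_since = request_headers.get("If-Modified-Since")

    if if_none_match is not None:
        # In HTTP, If-None-Match takes precedence over If-Modified-Since.
        unchanged = if_none_match == file.etag
    elif if_modified_since is not None:
        # Simplification: exact string match instead of parsing and comparing dates.
        unchanged = if_modified_since == file.last_modified
    else:
        unchanged = False

    if unchanged:
        return 304, validators, b""  # nothing to resend
    return 200, validators, file.body


# Made-up file; the etag value is purely illustrative.
image = StoredFile(b"jpeg bytes", '"abc123"', "Sun, 20 Dec 2020 23:00:02 GMT")

# Steps 1 and 2: first request, the browser has nothing cached yet.
status_first, headers_first, body_first = serve(image, {"Cache-Control": "no-cache"})
browser_cache = {
    "body": body_first,
    "etag": headers_first["ETag"],
    "last_modified": headers_first["Last-Modified"],
}

# Steps 3 to 5: second request, the browser revalidates its cached copy.
status_second, _, body_second = serve(image, {
    "Cache-Control": "max-age=0",
    "If-None-Match": browser_cache["etag"],
    "If-Modified-Since": browser_cache["last_modified"],
})
displayed = browser_cache["body"] if status_second == 304 else body_second
```

The second call returns a 304 with an empty body, and the browser falls back to the bytes it stored after the first response.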
Last-Modified serves the same purpose as etag, but uses a time-based strategy to determine whether a resource has changed, as opposed to a content-based one. In theory, either one of them alone would be enough.
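The difference between the two strategies is easier to see side by side. Here is a minimal sketch, assuming a content-based validator built from a truncated SHA-256 digest. That scheme, and the function names, are my own illustration and not how nginx computes its etag.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime


def time_based_validator(mtime: datetime) -> str:
    """Last-Modified style: an HTTP-date built from the modification time."""
    return format_datetime(mtime.astimezone(timezone.utc), usegmt=True)


def content_based_validator(body: bytes) -> str:
    """ETag style: a quoted digest of the file bytes (hypothetical scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


mtime = datetime(2020, 12, 20, 23, 0, 2, tzinfo=timezone.utc)
last_modified = time_based_validator(mtime)  # "Sun, 20 Dec 2020 23:00:02 GMT"

etag_v1 = content_based_validator(b"first version")
etag_v2 = content_based_validator(b"second version")
```

With this sketch, rewriting a file with identical bytes would change the time-based value but leave the content-based one untouched, while any change in the bytes produces a new digest.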
In this case, the goal seems to have been reached, because the network traffic has been reduced. But a problem remains: the browser always sends a request to the server asking whether it can reuse its cached file. Even though the server responds with a 304 instead of sending the file again, it still takes time to make the request and receive the response. If you have a hundred images on your page, it still needs a hundred requests.
It is possible to implement more sophisticated solutions based on an expiration date given by the server, which in some cases spare the client from making any call at all. With such a setup, nginx tells the browser that the requested file can be kept locally for a certain amount of time without requesting it again. Unfortunately, this does not work out of the box. It needs some configuration and has obvious drawbacks, like cache invalidation.
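To give an idea of what the browser does with such an expiration, here is a rough sketch of the freshness check, assuming the server sends a Cache-Control response header with a max-age value. The header value and function names are hypothetical, and the sketch deliberately ignores other directives (no-cache, no-store), the Expires and Age headers, and heuristic freshness.

```python
import re
from typing import Optional


def parse_max_age(cache_control: str) -> Optional[int]:
    """Extract max-age (in seconds) from a Cache-Control header, if present."""
    match = re.search(r"max-age=(\d+)", cache_control)
    return int(match.group(1)) if match else None


def can_reuse_without_request(cache_control: str, stored_at: float, now: float) -> bool:
    """True if the cached copy is still fresh, so no request is needed."""
    max_age = parse_max_age(cache_control)
    if max_age is None:
        return False  # no explicit lifetime in this sketch: go revalidate
    return (now - stored_at) < max_age


# Hypothetical response header allowing 30 days of reuse.
header = "public, max-age=2592000"
fresh_after_one_hour = can_reuse_without_request(header, stored_at=0, now=3600)
fresh_after_31_days = can_reuse_without_request(header, stored_at=0, now=31 * 86400)
```

While the copy is fresh, the browser skips the network entirely. Once the lifetime has elapsed, it goes back to the revalidation flow with the 304 described earlier, which is also why a file changed on the server before expiry may not be picked up.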
What surprised me is that the technique described here is pretty old. I think some implementation details have evolved, but etag has been handled by default in nginx since 2012. I might be late on this one, but I’m glad I learnt something :)