Caching Strategy for a Gatsby Static Site in the Cloud

Browser Caching Strategy for HTML, CSS, JS, and Image Files

Vinayak Hegde
Jan 26, 2024

Context

Consider a static site built with Gatsby, hosted on Azure, and served through Azure Front Door. Effective caching is crucial for improving page load times and optimising the user experience.

Deciding on a caching strategy for HTML, CSS, JS, and image files can be very straightforward with the modern tooling available in cloud platforms such as Azure, AWS, GCP, and Cloudflare.

Caching Strategy

For HTML files, a <meta> tag can be used to specify caching directives within the HTML document itself (see References).
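In a Gatsby project, such a tag could be injected from gatsby-ssr.js using the onRenderBody API. The snippet below is a minimal sketch only; most modern browsers and CDNs ignore http-equiv caching hints in favour of real HTTP headers, which is exactly why the header-based approach described next is preferred.

```javascript
// gatsby-ssr.js — sketch of injecting a Cache-Control <meta> tag into every page's <head>
const React = require("react");

exports.onRenderBody = ({ setHeadComponents }) => {
  setHeadComponents([
    React.createElement("meta", {
      key: "cache-control-meta",
      httpEquiv: "Cache-Control",
      content: "no-cache, must-revalidate",
    }),
  ]);
};
```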

For CSS, JS, and image files, we need a different caching strategy. We can use the Cache-Control header to specify caching directives for these resources.

Maintaining multiple approaches to caching can be a bit of a handful. In the context of Gatsby, setting caching headers in the server response is the more robust solution: the instructions are communicated at the network level and remain effective even if the HTML is served from a cache.

Therefore, I would recommend setting the Cache-Control header in the server response rather than relying solely on <meta> tags within the HTML document. This gives us better control over caching behaviour, a single approach for all static assets/files, and it aligns well with standard web practices.

Here are the caching recommendations for Gatsby static sites; a small code sketch of this mapping follows the list:

No Caching (must revalidate)

  • HTML files — public/**/*.html
  • JSON files in the public/page-data/ directory (including public/page-data/app-data.json)
  • the service worker — /sw.js
cache-control: public, max-age=0, must-revalidate

Cache for 1 year

  • content-hashed assets — public/static/*.{js,css,png,jpg,jpeg,gif,svg,webp,ico} and the hashed JS/CSS bundles in public/, excluding /sw.js
cache-control: public, max-age=31536000, immutable
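The two rules above can be captured in a small path-to-header mapping. The helper below is an illustrative sketch (the function name and patterns are mine, not part of Gatsby) and is reused in the later examples:

```javascript
// cacheControl.js — an illustrative mapping of the recommendations above
const NO_CACHE = "public, max-age=0, must-revalidate";
const ONE_YEAR_IMMUTABLE = "public, max-age=31536000, immutable";

function cacheControlFor(pathname) {
  // HTML pages, page-data JSON and the service worker must always be revalidated
  if (
    pathname === "/" ||
    pathname.endsWith(".html") ||
    pathname.startsWith("/page-data/") ||
    pathname === "/sw.js"
  ) {
    return NO_CACHE;
  }

  // Content-hashed assets (JS/CSS bundles and everything under /static/) never change
  if (/\.(js|css|png|jpe?g|gif|svg|webp|ico)$/.test(pathname)) {
    return ONE_YEAR_IMMUTABLE;
  }

  // Anything else: play it safe and revalidate
  return NO_CACHE;
}

module.exports = { cacheControlFor };
```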

Cache-Control in HTTP Headers

We will set the Cache-Control header in the HTTP response for HTML, CSS, JS, and image requests at the cloud platform. This approach provides several advantages:

Pros

  1. Global Control: Setting caching instructions in headers allows us to enforce consistent cache behaviour globally across all requests.
  2. Network Efficiency: Caching headers are communicated at the network level, reducing unnecessary requests and improving performance.
  3. Granular Control: Fine-tune cache directives based on file types, allowing us to optimise caching for specific resources.

Cons

  1. Header Overhead: Headers add a small amount of overhead to each HTTP response, slightly increasing bandwidth usage. While generally negligible, this can accumulate in high-traffic scenarios, so it is worth monitoring.

Setting `Cache-Control` headers through Azure Front Door Rules Engine

Using the Azure Front Door Rules Engine, we can set caching rules for specific file types. This approach is more straightforward to implement and more cost-effective as it does not require the use of serverless functions.
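As an illustration, a rule of this kind could also be created with the Azure CLI via az afd rule create. The resource names below are placeholders, and the flag names should be verified against the current CLI reference, so treat this as a sketch rather than a copy-paste recipe:

```bash
# Sketch: overwrite Cache-Control for HTML responses at the edge
az afd rule create \
  --resource-group my-rg \
  --profile-name my-frontdoor-profile \
  --rule-set-name cachingrules \
  --rule-name htmlnocache \
  --order 1 \
  --match-variable UrlFileExtension --operator Equal --match-values html \
  --action-name ModifyResponseHeader \
  --header-action Overwrite \
  --header-name "Cache-Control" \
  --header-value "public, max-age=0, must-revalidate"
```

A second rule matching the /static/ path (or the js/css file extensions) would set the one-year immutable value in the same way.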

Pros

  1. Global Edge Processing: Azure Front Door operates at the edge locations globally, allowing you to set response headers at the network edge. This reduces the load on your backend servers.
  2. Centralised Configuration: You can configure response headers in a centralised manner using the Azure Front Door portal, and the changes take effect globally.
  3. Performance: Setting headers at the edge can improve response times for clients.
  4. Cost: The Rules Engine is a built-in feature of Azure Front Door, which means you don’t need to pay for additional serverless functions.
  5. Ease of Use: The Rules Engine is easy to use and does not require any additional code.

Cons

  1. Limited to Edge Logic: The Rules Engine is well-suited for edge processing tasks, but if you need more complex business logic, you might need to combine it with other solutions.
  2. Vendor Lock-in: The Rules Engine is specific to Azure Front Door, which means you are locked into the Azure ecosystem.
  3. Flexibility: Compared with a code-based solution, the Rules Engine is less flexible and less portable.

Setting `Cache-Control` headers through Azure Functions

On Azure, we can use Azure Functions written in well-known scripting languages (JavaScript/Python) to intercept requests and set appropriate caching headers based on the file type, as sketched below. This serverless approach aligns with the broader serverless architecture trend and integrates well with other Azure services.
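The sketch below uses the Node.js (v3) programming model for an HTTP-triggered function and reuses the illustrative cacheControlFor helper from earlier; the query-parameter convention is hypothetical, and a real deployment would serve or proxy the static asset itself:

```javascript
// Illustrative HTTP-triggered Azure Function (Node.js v3 model) that answers with
// the appropriate caching header for the requested path.
const { cacheControlFor } = require("./cacheControl"); // helper sketched earlier

module.exports = async function (context, req) {
  // Hypothetical convention: the requested asset path is passed as a query parameter
  const pathname = (req.query && req.query.path) || "/index.html";

  context.res = {
    status: 200,
    headers: {
      "Cache-Control": cacheControlFor(pathname),
      "Content-Type": "text/plain",
    },
    body: `Cache-Control for ${pathname}: ${cacheControlFor(pathname)}`,
  };
};
```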

Pros

  1. Dynamic Business Logic: If setting headers involves dynamic or complex business logic, Azure Functions allows you to execute code in response to specific events, including HTTP requests.
  2. Flexibility: You have more flexibility in terms of the logic you can execute to determine the headers to be set.
  3. Portability: The header-setting logic is ordinary code, which makes it relatively easy to port to another provider’s serverless platform.
  4. Cost: Azure Functions are billed based on usage, which can be more cost-effective.

Cons

  1. Serverless Cold Start: Azure Functions may experience a “cold start” latency if not frequently triggered, which might impact the response time for the first request.
  2. Complexity: Azure Functions add some complexity to the system, which needs to be managed effectively.

Cloud-Agnostic Approach

To ensure flexibility and cloud-provider agnosticism, we will abstract the caching logic into serverless functions. This allows us to switch more easily between serverless computing platforms such as Azure Functions, AWS Lambda, Google Cloud Functions, and Cloudflare Workers.

For a cloud-agnostic approach, consider designing the serverless functions in a way that is easily portable between cloud providers.
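One way to achieve this, sketched below, is to keep the provider-neutral mapping (the cacheControlFor helper from earlier) in its own module and wrap it in a thin, provider-specific adapter, so that only the wrapper changes when switching platforms. The Cloudflare Workers adapter shown here is purely illustrative and assumes the helper is packaged as an ES module:

```javascript
// cloudflare-adapter.js — the same provider-neutral logic behind a Cloudflare Workers
// entry point; only this thin wrapper changes when switching providers.
import { cacheControlFor } from "./cacheControl.js"; // helper sketched earlier

export default {
  async fetch(request) {
    const pathname = new URL(request.url).pathname;

    // Fetch the asset from the origin, then overwrite its caching header at the edge
    const originResponse = await fetch(request);
    const response = new Response(originResponse.body, originResponse);
    response.headers.set("Cache-Control", cacheControlFor(pathname));
    return response;
  },
};
```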

You may need to consider the following when designing a cloud-agnostic caching strategy:

  1. Avoiding Vendor Lock-in: By abstracting provider-specific code, we prevent dependence on a specific cloud provider, avoiding vendor lock-in.
  2. Increased Adaptability: Separating configuration details from the core logic increases adaptability to changing cloud providers, making the system future-proof.
  3. Flexibility: The cloud-agnostic approach provides flexibility in choosing the cloud provider based on current and future needs.
  4. Performance: The HTTP header-based caching strategy ensures efficient network-level communication of caching directives, optimising performance.
  5. Scalability: Serverless functions allow for automatic scaling, ensuring the system can handle varying levels of traffic efficiently.
  6. Cost: Serverless functions are billed based on usage, which can be more cost-effective.
  7. Complexity: The cloud-agnostic approach adds some complexity to the system, which needs to be managed effectively.
  8. Education: Explore relevant documentation for best practices in cloud-agnostic design.
  9. Review: Regularly review and update the caching strategy based on changing requirements and advancements in cloud technologies.
  10. Alignment: Ensure that the cloud agnostic caching strategy is aligned with the overall architecture of your organisation.

Conclusion

In conclusion, implementing a robust caching strategy is essential for optimising the performance and user experience of any site. By carefully considering the caching directives for HTML, CSS, JS, and image files, we can significantly reduce load times and improve overall efficiency.

The decision to use the `Cache-Control` header in the server response, rather than relying solely on `<meta>` tags or other methods, ensures consistency and reliability across all static assets. This approach provides granular control over caching behaviour and aligns well with industry best practices.

While the Azure Front Door Rules Engine offers a straightforward solution for setting caching rules, it is worth looking at alternatives such as Azure Functions. Exploring a cloud-agnostic approach ensures flexibility and portability, allowing us to adapt to changing cloud provider environments in the future.

Regular reviews and updates to the caching strategy will be necessary to accommodate evolving requirements and advancements in cloud technologies. By staying informed about caching best practices and leveraging relevant documentation and resources, you can continue to refine and optimise your caching implementation for maximum effectiveness.

References

Cache-Control Meta Tag: Pros, Cons, and FAQs

