How We Built a Scalable Screenshot Service

Introduction

Building a reliable screenshot service is harder than it looks. When we started SiteSnapshot, we thought it would be as simple as spinning up a few Puppeteer instances. We were wrong.

The Challenge

Rendering modern web pages is resource-intensive. The main problems we ran into:

  • Memory leaks in headless browsers.
  • Timeouts and zombie processes.
  • Horizontal scaling across multiple regions.

In this post, we'll dive into how we solved these problems.
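
To make the timeout and zombie-process problem concrete, here is a minimal sketch of one common approach: race each job against a deadline and always close the browser afterwards. The helper names `withTimeout` and `runWithDeadline` are hypothetical and not taken from the SiteSnapshot codebase. The only browser method assumed is `close()`, which Puppeteer's `Browser` provides.

```javascript
// Hypothetical helper: reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, label = "operation") {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Clear the timer on either outcome so it does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical wrapper: run `task(browser)` under a deadline.
// Assumes `browser` exposes an async close() method.
async function runWithDeadline(browser, task, ms) {
  try {
    return await withTimeout(task(browser), ms, "screenshot job");
  } finally {
    // Closing in `finally` means a hung render does not leave a browser behind.
    await browser.close();
  }
}
```

Note that the deadline only stops waiting for the task; it does not cancel the work. That is why the browser is closed in `finally`. A renderer that ignores `close()` would still need to be killed at the process level, which this sketch does not cover.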

Our Architecture

We moved from a monolithic, cron-based system to a Distributed Worker Model: a queue hands out jobs, and independent workers execute them.

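// Example of our worker queue consumption.
// `job` carries the target URL (job.url) and the output path (job.path).
async function processQueue(job) {
  const browser = await getBrowserInstance();
  try {
    const page = await browser.newPage();
    await page.goto(job.url);
    await page.screenshot({ path: job.path });
  } finally {
    // Always release the browser, even if navigation or capture fails.
    await browser.close();
  }
}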
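
A worker usually handles more than one job at a time, so a per-job handler like the one above is typically driven by a loop that caps how many renders run at once. The sketch below is an assumption about how such a loop could look, not a description of our production worker. `drainQueue` and its parameters are hypothetical names, and the queue is a plain in-memory array for illustration.

```javascript
// Hypothetical sketch: drain `jobs` with at most `concurrency` handlers in flight.
async function drainQueue(jobs, handler, concurrency = 2) {
  const pending = [...jobs]; // local copy so the caller's array is untouched
  const results = [];

  // Each lane pulls the next job as soon as it finishes the previous one.
  async function lane() {
    while (pending.length > 0) {
      const job = pending.shift();
      try {
        await handler(job);
        results.push({ id: job.id, ok: true });
      } catch (err) {
        // Record the failure and keep going; one bad URL should not stop the lane.
        results.push({ id: job.id, ok: false, error: err.message });
      }
    }
  }

  const laneCount = Math.min(concurrency, pending.length);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

Capping concurrency per worker is one way to keep memory use predictable, since each open page in a headless browser holds memory until it is closed.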

Key Components

  1. Queue Manager: Distributes jobs fairly.
  2. Worker Nodes: Stateless containers that execute scans.
  3. Storage Layer: Supabase for metadata, S3 for images.
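
"Fairly" can mean several things. One plausible reading, assumed here purely for illustration, is that no single customer can starve the others: each tenant gets its own FIFO, and tenants are served round-robin. The `FairQueue` class and the `tenantId` key below are hypothetical and are not our actual Queue Manager.

```javascript
// Hypothetical fair queue: one FIFO per tenant, served round-robin.
class FairQueue {
  constructor() {
    // tenantId -> array of jobs. A Map iterates in insertion order,
    // which doubles as the rotation order.
    this.queues = new Map();
  }

  enqueue(tenantId, job) {
    if (!this.queues.has(tenantId)) this.queues.set(tenantId, []);
    this.queues.get(tenantId).push(job);
  }

  // Take one job from the tenant at the front of the rotation,
  // then move that tenant to the back if it still has work.
  dequeue() {
    const next = this.queues.entries().next();
    if (next.done) return undefined; // nothing queued
    const [tenantId, jobs] = next.value;
    const job = jobs.shift();
    this.queues.delete(tenantId);
    if (jobs.length > 0) this.queues.set(tenantId, jobs); // re-insert at the end
    return job;
  }
}

// Usage: tenant "a" submits three jobs, tenant "b" submits one.
const fairQueue = new FairQueue();
["a1", "a2", "a3"].forEach((job) => fairQueue.enqueue("a", job));
fairQueue.enqueue("b", "b1");
const served = [];
for (let job = fairQueue.dequeue(); job !== undefined; job = fairQueue.dequeue()) {
  served.push(job);
}
console.log(served.join(", ")); // prints "a1, b1, a2, a3"
```

With a single shared FIFO, tenant "b" would wait behind all three of tenant "a"'s jobs. With the rotation, it waits behind only one.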

Conclusion

By decoupling the scheduler from the execution layer, we achieved 99.9% reliability.
