Faster JavaScript Builds with Metro

 1 year ago
source link: https://medium.com/airbnb-engineering/faster-javascript-builds-with-metro-cfc46d617a1f
How Airbnb migrated from Webpack to Metro and made the development feedback loop nearly instantaneous and the largest production build ~55% faster, with marginal end-user runtime improvements.

By: Rae Liu

Introduction

In 2018, Airbnb’s frontend infrastructure relied on Webpack for JavaScript bundling, which had served us well up until then. However, with our codebase having almost quadrupled in the previous year, the frontend team was noticing a significant impact on the development experience. Not only was build performance slow, but the average page refresh time for a trivial one-line code change was anywhere between 30 seconds and 2 minutes, depending on the project size. To mitigate this, the team decided to migrate to Metro.

Thanks to the switch to Metro, we’ve improved our build performance. In development, the time it takes for a simple UI change to be reflected and loaded (the Time to Interactive, or TTI, metric) is 80% faster. The slowest production build, which compiles ~48k modules (JavaScript files), is 55% faster (down from 30.5 minutes to 13.8 minutes). As an added bonus, we’ve observed improvements in the Airbnb Page Performance Scores of ~1% for pages built with Metro.

Scaling issues with JavaScript bundlers certainly aren’t a problem unique to Airbnb. In this blog post, we want to highlight the key architectural differences between Webpack and Metro as well as some of the migration challenges we faced in both development and production builds. If you anticipate that one of your own projects will scale up significantly in the future, we hope this post can provide useful insights on solving this problem.

What is Metro?

Metro is the open source JavaScript bundler for React Native. While Airbnb no longer uses React Native, we believed the infrastructure could be leveraged for the web as well. After numerous consultations with the Metro folks at Meta as well as some of our own modifications, we managed to build a flavor of Metro that now powers both development and production bundling for all Airbnb websites.

Conceptually, Metro breaks bundling down into three steps, in the following order: resolution, transformation, and serialization.

  • Resolution deals with how to resolve import/require statements.
  • Transformation is responsible for transpiling code: a source-to-source compilation step that converts modern TypeScript/JavaScript source code into functionally equivalent JavaScript that is more optimized and backwards compatible with older browsers. Babel is an example of such a tool.
  • Serialization combines the transformed files into JavaScript bundles.
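To make the three steps concrete, here is a minimal sketch of a toy bundler. It is not Metro's implementation: the in-memory `files` map, the regex-based `resolveDependencies`, the `const`-to-`var` `transform`, and the `bundle` function are all hypothetical stand-ins chosen for illustration.

```typescript
// Toy in-memory "file system": module path -> source text (hypothetical data).
const files: Record<string, string> = {
  "entry.js": 'const a = require("./a.js"); console.log("entry");',
  "a.js": 'const b = require("./b.js"); console.log("a");',
  "b.js": 'console.log("b");',
};

// Resolution: find require("./x") calls and map each one to a module path.
function resolveDependencies(source: string): string[] {
  const deps: string[] = [];
  const pattern = /require\("\.\/([^"]+)"\)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const specifier = match[1];
    if (specifier !== undefined) deps.push(specifier);
  }
  return deps;
}

// Transformation: a trivial stand-in for a real transpiler such as Babel.
function transform(source: string): string {
  return source.replace(/\bconst\b/g, "var");
}

// Serialization: walk the graph from the entry point and concatenate the
// transformed modules so that dependencies come before their dependents.
function bundle(entry: string): string {
  const seen = new Set<string>();
  const output: string[] = [];
  const visit = (path: string): void => {
    if (seen.has(path)) return;
    seen.add(path);
    const source = files[path];
    if (source === undefined) throw new Error(`Cannot resolve ${path}`);
    for (const dep of resolveDependencies(source)) visit(dep);
    output.push(`// ${path}\n${transform(source)}`);
  };
  visit(entry);
  return output.join("\n");
}

console.log(bundle("entry.js"));
```

The output is only a concatenation of transformed sources, with no module wrappers or runtime, so it illustrates the ordering of the three steps rather than producing an executable bundle.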

These three concepts are the fundamental building blocks to understand how Metro works. In the following sections, we highlight the key architectural differences between Metro and Webpack to provide deeper context into Metro’s strengths.

Key architectural differences between Metro and Webpack

Process JS bundles on demand in development

When we talk about bundles, a JavaScript bundle is technically just a serialized dependency graph, where an entry point is the root of the graph. At Airbnb, a web page maps to a single entry point. In development, Webpack (even the latest v5 version) requires knowing the entry points for all pages before it can start bundling. On the other hand, the Metro development server processes the requested JavaScript bundles on the fly.

More specifically, at Airbnb, every frontend project has a Node server which matches a route to a specific entry point. When a web page is requested, the DOM includes script tags with the development JavaScript URLs. The browser loads the page, and makes requests to the Metro development server for the JavaScript bundles. In Figure 1, we illustrate the difference between our Metro & Webpack development setup:

Figure 1: Differences between the JS bundle development setups for Metro and Webpack

In this example, there is a web project with three entry points: entryPageA.js, entryPageB.js, and entryPageC.js. A developer makes changes to Page A, which includes only the entryPageA.js bundle. As you can see in Figure 1, in both scenarios, the browser loads Page A (1), then requests the entryPageA.js file from the bundler (2), and finally the bundler responds to the browser with the appropriate bundles (4). With the Webpack bundler (Figure 1a), even though the browser only requests entryPageA.js, Webpack compiles all entry points on start-up before it can respond to the entryPageA.js request from the browser. With the Metro bundler (Figure 1b), on the other hand, the development server does not spend any time compiling entryPageB.js or entryPageC.js; it compiles only entryPageA.js before responding to the browser request.

One of the biggest frontend projects at Airbnb has ~26k unique modules, with the median number of modules per page being ~7.2k. Because we also do server side rendering, the number of modules we ultimately have to process roughly doubles to ~48k. With Metro’s development model, we saved ~70% of the work by compiling JavaScript on demand.

This key architectural difference improves the developer experience, as Metro only compiles what is needed (JavaScript bundles on the pages requested), whereas Webpack pre-compiles the entire project on start-up.
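The difference between the two models can be sketched as a graph traversal: eager compilation walks the dependency graph from every entry point, while on-demand compilation walks it only from the requested one. The module names other than the three entry points, and the `collectModules` helper, are hypothetical and chosen for illustration.

```typescript
// Module path -> the modules it imports (hypothetical project layout).
type DependencyGraph = Record<string, string[]>;

const projectGraph: DependencyGraph = {
  "entryPageA.js": ["shared.js", "pageA/widget.js"],
  "entryPageB.js": ["shared.js", "pageB/widget.js"],
  "entryPageC.js": ["pageC/widget.js"],
  "shared.js": [],
  "pageA/widget.js": [],
  "pageB/widget.js": [],
  "pageC/widget.js": [],
};

// Collect every module reachable from the given entry points.
function collectModules(graph: DependencyGraph, entries: string[]): Set<string> {
  const reached = new Set<string>();
  const stack = [...entries];
  while (stack.length > 0) {
    const path = stack.pop() as string;
    if (reached.has(path)) continue;
    reached.add(path);
    stack.push(...(graph[path] ?? []));
  }
  return reached;
}

// Eager model: every entry point is compiled on start-up.
const eager = collectModules(projectGraph, [
  "entryPageA.js",
  "entryPageB.js",
  "entryPageC.js",
]);

// On-demand model: only the entry point the browser requested is compiled.
const onDemand = collectModules(projectGraph, ["entryPageA.js"]);

console.log(`eager: ${eager.size} modules, on demand: ${onDemand.size} modules`);
```

With the figures quoted above, the same comparison gives 1 − 7.2k / 26k ≈ 72% of modules skipped for a median page, which is in line with the ~70% saving.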

Multi-layered cache

Another powerful Metro feature we leverage is its multi-layered cache, which makes setting up both persistent and non-persistent caches straightforward. While Webpack 5 also comes with a persistent disk cache, it isn’t as flexible as Metro’s multi-layered cache. Webpack offers two cache types, “filesystem” and “memory”, so it is limited to a memory or disk cache with no remote cache capability. In comparison, Metro provides more flexibility by allowing us to define the cache implementation, including mixing different types of cache layers. If a layer has a cache miss, Metro attempts to retrieve the entry from the next layer, and so on.

Figure 2: How Airbnb configures the multi-cache layers with Metro

The ordering of the caches determines the cache priority. When retrieving a cache, the first cache layer with a result will be used. In the setup illustrated in Figure 2, the fastest in-memory cache layer is prioritized at the top, followed by the file/disk cache, and lastly the remote read-only cache. Compared with the default Metro implementation without a cache, hitting a remote read-only cache resulted in a 56% faster server build in a project compiling 22k files.
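The lookup order described above can be sketched as follows. This is an illustration of the idea only, not Metro's cache API: `CacheLayer`, `MapLayer`, and `LayeredCache` are hypothetical names, and each layer is backed by an in-memory `Map` for simplicity.

```typescript
// Hypothetical cache layer interface; a read-only layer ignores writes.
interface CacheLayer {
  readonly name: string;
  readonly readOnly: boolean;
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

class MapLayer implements CacheLayer {
  readonly name: string;
  readonly readOnly: boolean;
  private readonly entries = new Map<string, string>();

  constructor(name: string, readOnly: boolean, seed: Array<[string, string]> = []) {
    this.name = name;
    this.readOnly = readOnly;
    for (const [key, value] of seed) this.entries.set(key, value);
  }

  get(key: string): string | undefined {
    return this.entries.get(key);
  }

  set(key: string, value: string): void {
    if (!this.readOnly) this.entries.set(key, value);
  }
}

class LayeredCache {
  private readonly layers: CacheLayer[];

  constructor(layers: CacheLayer[]) {
    this.layers = layers;
  }

  // The first layer with a result wins, so array order is the priority order.
  get(key: string): { value: string; layer: string } | undefined {
    for (const layer of this.layers) {
      const value = layer.get(key);
      if (value !== undefined) return { value, layer: layer.name };
    }
    return undefined;
  }

  // Writes go to writable layers only; the read-only layer is skipped.
  set(key: string, value: string): void {
    for (const layer of this.layers) {
      if (!layer.readOnly) layer.set(key, value);
    }
  }
}

const memoryLayer = new MapLayer("memory", false);
const diskLayer = new MapLayer("disk", false);
// Stand-in for a remote cache that was populated ahead of time.
const remoteLayer = new MapLayer("remote", true, [["pageA.js", "/* transformed pageA */"]]);
const layeredCache = new LayeredCache([memoryLayer, diskLayer, remoteLayer]);

console.log(layeredCache.get("pageA.js")?.layer);
```

In this sketch, a miss in every layer leaves it to the caller to run the transform and call `set`, which never touches the read-only remote layer.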

One contributing factor to Metro’s performance is its built-in worker support which amplifies the effect of the multi-layer cache. While Webpack requires careful configuration to leverage workers via a third-party plugin, Metro by default spins up workers to offload expensive transforms, allowing for increased parallelization without configuration.

But why use a remote read-only cache instead of a regular remote cache (read & write)? We discovered that not writing to the remote cache saved an additional 17% of build time in development for the same project with 22k files. Writing to the remote cache incurs network calls that can be costly, especially on a slower network. To populate the cache without remote cache writes, we introduced a CI job that runs periodically on the latest default branch commit.

Serialization

In the bundler context, serialization means combining the transformed source files into one or multiple bundles. In Webpack, the concept of serialization is encapsulated in the compilation hooks (Webpack’s public APIs). In Metro, a serializer function is responsible for combining source files into bundles.

As one example of the importance of serialization, let’s take a look at internationalization support. We currently support Airbnb websites in around 70 locales, and in 2020, our internationalization platform served more than 1 million pieces of content. To support internationalization with JS bundles, we need to implement specific logic in the serialization step. Although we had to implement similar internationalization logic when serializing bundles for both Metro and Webpack, Webpack required a lot of source code reading to find the appropriate compilation hooks for implementing the support. On top of that, it also required understanding the intricacies of concepts such as dependency templates and how to write our own. Comparatively, implementing the same internationalization support with Metro is a breath of fresh air. We only have to focus on how to serialize JS bundles with translation content, and all the tasks are done in a single serializer function. The simplicity of Metro’s bundling concepts makes implementing any bespoke feature straightforward.
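As a rough illustration of what doing everything in a single serializer function can look like, the sketch below prepends translation content to the concatenated modules. It is an assumption-laden toy: the `TransformedModule` shape, the `serializeWithTranslations` signature, and the `__translations` global are hypothetical, and neither Metro's actual serializer interface nor Airbnb's implementation is shown here.

```typescript
// Hypothetical shape of a module after the transformation step.
interface TransformedModule {
  path: string;
  code: string;
}

// Combine transformed modules into one bundle string, with the phrases for
// a single locale emitted first so module code can look them up at runtime.
function serializeWithTranslations(
  modules: TransformedModule[],
  locale: string,
  phrases: Record<string, string>,
): string {
  const prelude = `globalThis.__translations = ${JSON.stringify({ locale, phrases })};`;
  const body = modules.map((mod) => `// ${mod.path}\n${mod.code}`);
  return [prelude, ...body].join("\n");
}

const frenchBundle = serializeWithTranslations(
  [{ path: "greeting.js", code: "console.log(globalThis.__translations.phrases.hello);" }],
  "fr",
  { hello: "Bonjour" },
);

console.log(frenchBundle);
```

Because the locale is a plain argument, producing one bundle per locale in this sketch is a matter of calling the function once per locale with the matching phrases.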

Challenges of Adopting Metro at Airbnb

Even though Metro has the architectural advantages described above, it also brought challenges that we had to overcome in order to leverage it fully for the web. Because Metro is designed for use in a React Native environment, we needed to write more code to achieve feature parity with Webpack, so the decision to switch to Metro came at the expense of reinventing some wheels and learning the inner workings of a JavaScript bundler that are usually abstracted away from us.

In development, we had to create a Metro server with custom endpoints to handle building dependency graphs, translation, bundling JS & CSS files, and building source maps. For production builds, we ran Metro as a Node API to handle resolution, transformation, and serialization.

The surface area of the full migration was substantial, so we broke it down into two phases. Because the slow iteration speed of our Webpack setup incurred significant costs around developer productivity, we addressed the slow Webpack development experience with the Metro development server as our first priority. In the second phase, we brought Metro to feature parity with Webpack and ran an A/B test between Metro and Webpack in production. The two biggest challenges we faced along the way are outlined below.

Bundle Splitting

The out-of-the-box Metro setup for development produced giant ~5MiB bundles per entry point, since a single bundle is the intended use case for React Native. For the Web, this bundle size was taxing on browser resources and network latency. Every code change resulted in a 5MiB bundle being processed and downloaded, which was inefficient and could not be HTTP-cached. Even if the changed code recompiled instantly, we still needed to reduce the size and improve browser cacheability.

To improve the performance of Metro in the Web environment, we split the bundles by dynamic import boundaries, a technique also known as code splitting. The code splitting boundaries enabled us to leverage HTTP caching effectively.

In Figure 3, import('./file') represents the dynamic import boundaries. The bundle on the left hand side (3a) is broken down into three smaller bundles on the right (3b). The additional bundles are requested when the import('./file') statements are executed.

In Figure 3a, if fileA.js changes, the entire bundle needs to be re-downloaded for the browser to pick up the change. With bundles split by dynamic import, as illustrated in Figure 3b, a change in fileA.js only results in re-downloading the fileA.js bundle. The rest of the bundles can be served from the browser cache.

Figure 3: Splitting bundles by dynamic import boundaries. A bundle is represented by the rectangular boxes with a pink background.
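Splitting by dynamic import boundaries can be sketched as a traversal in which static imports stay in the current bundle and every dynamic import target becomes the root of a new bundle. The `ModuleNode` shape, the `splitByDynamicImports` function, and the example graph (including header.js and fileB.js) are hypothetical, and the sketch makes no claim to match Airbnb's implementation.

```typescript
// Hypothetical module graph: static and dynamic imports are tracked separately.
interface ModuleNode {
  staticImports: string[];
  dynamicImports: string[];
}
type ModuleGraph = Record<string, ModuleNode>;

// Returns a map from bundle root to the modules contained in that bundle.
function splitByDynamicImports(graph: ModuleGraph, entry: string): Map<string, string[]> {
  const bundles = new Map<string, string[]>();
  const pendingRoots: string[] = [entry];
  while (pendingRoots.length > 0) {
    const root = pendingRoots.shift() as string;
    if (bundles.has(root)) continue;
    const members: string[] = [];
    const seen = new Set<string>();
    const stack: string[] = [root];
    while (stack.length > 0) {
      const path = stack.pop() as string;
      if (seen.has(path)) continue;
      seen.add(path);
      members.push(path);
      const node = graph[path];
      if (node === undefined) throw new Error(`Unknown module ${path}`);
      // Static imports are bundled together with the importing module.
      stack.push(...node.staticImports);
      // Each import('./file') target starts its own bundle.
      pendingRoots.push(...node.dynamicImports);
    }
    bundles.set(root, members);
  }
  return bundles;
}

const pageGraph: ModuleGraph = {
  "entry.js": { staticImports: ["header.js"], dynamicImports: ["fileA.js", "fileB.js"] },
  "header.js": { staticImports: [], dynamicImports: [] },
  "fileA.js": { staticImports: [], dynamicImports: [] },
  "fileB.js": { staticImports: [], dynamicImports: [] },
};

const splitBundles = splitByDynamicImports(pageGraph, "entry.js");
console.log(splitBundles);
```

One simplification to note: a module that is statically imported from two different bundles ends up duplicated in both, which a production splitting algorithm would need to address.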

When we began to think about production bundles, we wanted to optimize a bit differently than in development. It takes time to run the bundle splitting algorithm, and we didn’t want to spend that time optimizing bundle sizes in development. There, we prioritized page load performance over minimizing bundle sizes.

In production, we wanted to ship fewer and smaller JavaScript bundles to the end user so the page loads faster and the user experience is performant. There is no Metro development server in production, so all the bundles are pre-built. This made bundle splitting the biggest blocking feature needed to make our Metro build production ready. With some inspiration from Webpack’s bundle splitting algorithm, we implemented a similar mechanism to split the Metro dependency graphs. The resulting bundle sizes decreased by ~20% (1549 KB to 1226 KB) on airbnb.com compared with the development approach of splitting by dynamic import boundaries.

When we compared the bundle splitting results of the Metro and Webpack implementations, we found that both produced bundles of comparable sizes, with a few pages shipping a slightly higher number of JavaScript bundles with Metro. Despite the slightly heavier page weight, the Time to First Contentful Paint (TTFCP), Largest Contentful Paint, and Total Blocking Time metrics are comparable between Metro and Webpack.

Tree-shaking

Bundle splitting alone decreased bundle sizes significantly, but we were able to make bundles even smaller by deleting dead code. It is not always obvious what counts as dead code, however, since “dead code” in one project may be “used code” in another. This is where tree-shaking comes into play. It relies on the consistent use of ECMAScript modules (ESM) import/export statements in the codebase. Based on the import/export usages in a project, we analyze which export statements are not imported by any file in that project. Finally, the bundler removes the unused export statements, making the overall bundle sizes smaller.
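The analysis step can be sketched as a two-pass scan: first record which names each module has imported from it, then report every export that nobody imports. The `ModuleInfo` shape, the `findUnusedExports` function, and the example modules are hypothetical, and the sketch deliberately ignores re-exports, side effects, and entry point exports, all of which a real implementation has to handle.

```typescript
// Hypothetical per-module summary of ESM import/export statements.
interface ModuleInfo {
  exports: string[];
  imports: Array<{ from: string; names: string[] }>;
}

// Returns, per module, the exported names that no file in the project imports.
function findUnusedExports(project: Record<string, ModuleInfo>): Record<string, string[]> {
  // Pass 1: collect the names imported from each module.
  const usedByModule = new Map<string, Set<string>>();
  for (const info of Object.values(project)) {
    for (const statement of info.imports) {
      let names = usedByModule.get(statement.from);
      if (names === undefined) {
        names = new Set<string>();
        usedByModule.set(statement.from, names);
      }
      for (const name of statement.names) names.add(name);
    }
  }

  // Pass 2: an export is a removal candidate if it never appears in pass 1.
  const unused: Record<string, string[]> = {};
  for (const [path, info] of Object.entries(project)) {
    const usedNames = usedByModule.get(path) ?? new Set<string>();
    const dead = info.exports.filter((name) => !usedNames.has(name));
    if (dead.length > 0) unused[path] = dead;
  }
  return unused;
}

const exampleProject: Record<string, ModuleInfo> = {
  "utils.js": { exports: ["formatDate", "parseDate", "unusedHelper"], imports: [] },
  "page.js": {
    exports: [],
    imports: [{ from: "utils.js", names: ["formatDate", "parseDate"] }],
  },
};

console.log(findUnusedExports(exampleProject));
```

Running the same scan against a different project that imports `unusedHelper` would report nothing for utils.js, which is why the result depends on the project being analyzed.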

One challenge we faced while implementing the tree-shaking algorithm for Metro production builds was the risk of mistakenly removing code that is executed at runtime. For example, we ran into multiple bugs related to re-export statements. Since Webpack handles ESM import/export statements in a different way, there was no comparable prior art for reference. After multiple iterations of tree-shaking algorithm implementation, the following table captures how much dead code we were finally able to drop given the project size.

1*WlsTTqWIGeJzk_ccUzFuBw.png

Conclusion

The Metro migration brought forth some very significant improvements. The biggest Airbnb frontend project compiling ~48k modules (including server and browser compilations) saw a drop in the average build time by ~55% from 30.5 minutes to 13.8 minutes. Additionally, we saw improvements on the Airbnb Page Performance Scores with the pages built by Metro, ranging around +1%. The end user performance improvement was a nice surprise, as we initially aimed for achieving neutral experiment results.

The simplicity of Metro’s architecture has benefited us in many ways. Engineers from other teams have ramped up quickly to contribute to Airbnb’s Metro implementation, which means there is a lower barrier to entry for contributing to the bundling system. The multi-layered cache system is straightforward to work with, making experimentation with caching possible. Bespoke bundler feature integrations are more obvious and easier to implement.

We acknowledge that the landscape has changed since we evaluated Parcel, Webpack 4, and Metro back in 2018. There are other tools, such as rollup.js and esbuild, that we haven’t explored much, and we know that Metro isn’t a general-purpose JavaScript bundler when compared to Webpack. However, after a few years of working on Metro feature parity, the results we have seen have proven to us that pursuing Metro was a good decision. Metro solved our most pressing scaling issues by reducing development and production build times. We are more productive than ever with instantaneous development feedback loops and faster production builds. If you would like to help us continue to improve our JavaScript tooling and build optimization, or tackle other web infrastructure challenges, check out these open roles at Airbnb:

Senior Frontend Infrastructure Engineer, Web Platform

Engineering Manager, Infrastructure

Senior Software Engineer, Cloud Infrastructure

Senior/Staff Software Engineer, Observability

Acknowledgments

Thank you to everyone who has contributed to this multi-year project. We couldn’t have done it without you! Special shoutout to my lovely team, Michael James and Noah Sugarman, for driving the Metro production migration to the finish line. Thank you to Brie Bunge, Dan Beam, Ian Myers, Ian Remmel, Joe Lencioni, Madison Capps, Michael James, and Noah Sugarman for reviewing and giving great feedback on this blog post.

All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.

