A History of JavaScript Modules and Bundling, For the Post-ES6 Developer

source link: https://8thlight.com/insights/a-history-of-javascript-modules-and-bundling-for-the-post-es6-developer
At the time of writing, eight years have passed since the release of ECMAScript 2015 (ES6), the second major update to JavaScript. Many developers, myself included, have only written code in this version.

That means many of us only know the import/export module syntax and perhaps have never questioned it, especially given the advent of tools and frameworks that automate the setup of the underlying bundling system, like create-react-app and NextJS. After all, it's common for languages to have some kind of module system: some way of including external code for the current file to reference and execute. Why would JavaScript be any different?

Well, there is a pretty huge difference, and it's based on where the code is executed. On the frontend, websites are coded in JavaScript, which we zip up and send over the wire to client browsers; on the backend, code (in any language) is executed directly on a server. The external code on which those backend files rely is right there on the server with them and does not need to be sent anywhere. This distinction is the key to understanding the history of JavaScript's module system.

JavaScript Has Multiple Competing Module Systems

The picture gets even more confusing for JavaScript because, unlike many other languages, it did not incorporate a native module system until the 2015 update. Until that point, developers had used different programs and techniques for referencing external code in their files.

Let's explore the development of the different module systems in JavaScript. With a solid foundation, you can go forth and grok all the new-fangled bundling delights of Vite, NextJS, and all the other cool kids on the block.

In the Beginning, There Were Multi-Page Websites With Global Scripts

In the beginning, there were multi-page websites. There was a server with a router, and client browsers would make a new request at each route for the static HTML+CSS+JS code. Developers could split their JS code into several files and include each one with a separate script tag.

main.js file:


// main.js

// calculate() is not imported: an earlier script added it to the global scope.
var values = [ 1, 3, 9 ];
var calculation = calculate(values);
document.getElementById("calculation").innerHTML = calculation;


    

In the snippet above, the main file populates the calculation span, and the main script depends on the calculate script, which (let’s imagine) depends on the add script.

We do not need to use any kind of importing syntax because the script elements have added their variables to the global namespace, so the calculate function is available for the main script to use directly.
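To make that concrete, here is a minimal sketch of what the other two scripts might contain. The file contents are my own assumption (the article only names the add and calculate scripts), and the three "files" are shown back to back in the order the script tags would have to load them:

```javascript
// add.js (hypothetical contents): declares a function in the global scope.
function add(a, b) {
  return a + b;
}

// calculate.js (hypothetical contents): relies on add() already being global.
function calculate(values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total = add(total, values[i]);
  }
  return total;
}

// main.js: relies on calculate() already being global.
var values = [1, 3, 9];
var calculation = calculate(values);
console.log(calculation); // 13
```

Nothing links these files together except load order and shared global names.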

For this to work, the scripts have to be listed in the correct order, so that each one is loaded before any script that calls it. Everything they declare is added to the global namespace, so it can be modified by any other piece of code. The scripts are also synchronous and blocking: after the client requests the webpage, the browser parses the HTML, and when it reaches the first script tag it sends another request to download that script from the server. Parsing of the HTML page pauses until the script has been downloaded and executed, and only then continues. In the add and calculate files, we only define variables and functions. In the main file, there is an instruction to do something to the webpage (at document.getElementById...). This runs immediately.

Note that the src of a script can be an external URL. For example, the jQuery library was already available from these early days and used extensively.

This is not a great system. Developers have to manually add scripts in the correct order, and every script pollutes the global namespace, which means it is easy to accidentally modify or shadow variables from previous scripts, with unforeseen consequences in later ones. There were answers to this in the pre-bundling era, for example the "Module Pattern," where only one identifier is exposed by the external script, such as the $ symbol for the jQuery library. However, these were workarounds and not a complete solution.
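As a rough illustration of the Module Pattern, the sketch below wraps the code in an immediately invoked function so that only one name leaks into the global scope. The Calculator name and its contents are hypothetical, chosen to match the earlier example:

```javascript
// Module pattern: the function runs once and returns the public API.
var Calculator = (function () {
  // Private helper: only visible inside this closure.
  function add(a, b) {
    return a + b;
  }

  // Public API: the single identifier other scripts can see.
  return {
    calculate: function (values) {
      return values.reduce(add, 0);
    }
  };
})();

var calculation = Calculator.calculate([1, 3, 9]);
console.log(calculation); // 13
console.log(typeof add);  // "undefined": the helper did not leak out
```

Script order still matters with this approach, which is part of why it counts as a workaround.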

And Then Came Bundling

Developers separate code into files for our own human benefit, not for the benefit of the computer. We could write programs as single giant files, like they did in the beginning. That way, the browser only needs to download one file from our server (rather than a series of small files), which is usually a lot more efficient overall.

Initially, developers performed this manually by cat-ing all the files together before placing the result on the server. Soon, tools came about to do this step for you, and employed tricks to make the code even smaller and faster to download, a step called minifying or uglifying. JSMin, from Douglas Crockford, was one of the first tools to do this.
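Here is a toy sketch of that concatenation step. The file contents are hypothetical stand-ins held in an object instead of being read from disk, and no minifying is attempted:

```javascript
// Hypothetical file contents, keyed by file name (stand-ins for files on disk).
var files = {
  'add.js': 'function add(a, b) { return a + b; }',
  'calculate.js':
    'function calculate(values) {' +
    '  var total = 0;' +
    '  for (var i = 0; i < values.length; i++) { total = add(total, values[i]); }' +
    '  return total;' +
    '}'
};

// Join the files in dependency order, much like `cat add.js calculate.js > bundle.js`.
function concatenate(sources, order) {
  return order.map(function (name) {
    return '// ' + name + '\n' + sources[name];
  }).join('\n');
}

var bundle = concatenate(files, ['add.js', 'calculate.js']);

// Evaluate the bundle as one script to show the pieces still work together.
var runBundle = new Function(bundle + '\nreturn calculate([1, 3, 9]);');
console.log(runBundle()); // 13
```

The developer still has to supply the correct order by hand, which is exactly what later bundlers automated.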

Node's Modules Introduce Backend Runtime

Let's pause here for a moment and take a look at what was happening in backend JavaScript code. Node was created in 2009 by Ryan Dahl, and it introduced a runtime to run JavaScript code on a server. Just like any other language, Node needed a module system. This enables different Node packages (programs) to be shared and used by others, and one was built based on the ‘CommonJS’ standard. You import external code into your file using the require function:

var otherModule = require('./otherModule');

And you export the code in your own file using:

module.exports = MyModule;

People began making code packages for Node immediately. These packages were called node modules, and npm was created to manage these — to download them, update them, and so on.

To be clear, the JavaScript language itself did not natively include this require/module.exports syntax. People wrote it in their script files, which the Node runtime was able to understand and execute. Browsers, on the other hand, were not built to understand this syntax, so you could not use the backend Node module libraries in the scripts for your website.

Modules For the Browser - AMD

On the browser side, frontend developers were feeling jealous of Node's gift of a real module system, so programs were created that let the browser parse and execute a type of module syntax in its own scripts. The most famous of these is RequireJS. The specific module system that RequireJS was created for is the Asynchronous Module Definition (AMD) system. This was a standard which came out of the CommonJS effort, and looks like this:


// main.js

// Each dependency name is its own string in the array.
define(['calculate', 'alert'], function(calculate, alert){
	var values = [ 1, 3, 9 ];
	var calculation = calculate(values);
	alert('Hello');
	document.getElementById("calculation").innerHTML = calculation;
});


    

calculate.js and alert.js are dependencies of main.js. Each of those files defines its own module, and the value each one returns is passed as an argument to the callback function in main. The browser, on receipt of the HTML file, is given the require.js script to execute, which accepts a file as the entry point in the data-main attribute:

<script data-main="main" src="require.js"></script>

This enables the browser to parse and execute files written using the AMD module system.

AMD is interesting because it is an asynchronous module system, and in the frontend world we very much like asynchronicity. In Node and CommonJS, the program pauses while it synchronously loads the dependency file, and its dependencies, and their dependencies, one by one, as it reaches them in the script (bearing in mind that all dependency code is already on the server and only needs to be read from disk, so this is very fast). With RequireJS and AMD, the dependencies load asynchronously. That means the browser would make a request for the calculate.js file first and then, while that was still loading, make a request for alert.js. Let's say alert finishes downloading first: its module code can be evaluated straight away, but main's callback (including the calculate(values) call) only executes once all of the listed dependencies have loaded. (FYI, in modern HTML script elements, we now have two choices when it comes to loading scripts without blocking: the async and defer attributes.)
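The key behavior, that a module's callback waits until every dependency is available regardless of arrival order, can be sketched with a toy registry. This is not RequireJS: toyDefine takes an explicit module name, there is no network loading, and the order of the calls below stands in for the order in which files happen to finish downloading:

```javascript
var registry = {}; // module name -> resolved value
var waiting = [];  // modules whose dependencies are not all resolved yet

function toyDefine(name, deps, factory) {
  waiting.push({ name: name, deps: deps, factory: factory });
  flush();
}

// Run every waiting factory whose dependencies are now all in the registry.
function flush() {
  var progressed = true;
  while (progressed) {
    progressed = false;
    for (var i = 0; i < waiting.length; i++) {
      var mod = waiting[i];
      var ready = mod.deps.every(function (dep) {
        return dep in registry;
      });
      if (ready) {
        var args = mod.deps.map(function (dep) {
          return registry[dep];
        });
        registry[mod.name] = mod.factory.apply(null, args);
        waiting.splice(i, 1);
        progressed = true;
        break; // restart the scan, since newly resolved modules may unblock others
      }
    }
  }
}

var log = [];

// main "arrives" first but cannot run yet.
toyDefine('main', ['calculate', 'alert'], function (calculate, alert) {
  alert('Hello');
  log.push('calculation: ' + calculate([1, 3, 9]));
});

// alert arrives next; main is still waiting for calculate.
toyDefine('alert', [], function () {
  return function (message) {
    log.push('alert: ' + message);
  };
});

// calculate arrives last, which finally unblocks main.
toyDefine('calculate', [], function () {
  return function (values) {
    return values.reduce(function (total, value) {
      return total + value;
    }, 0);
  };
});

console.log(log); // [ 'alert: Hello', 'calculation: 13' ]
```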

Browserify - Bundling Node Modules For Browsers

Writing JS scripts that use AMD module syntax was a big improvement on what had come before: now you didn't have to manually organize dependency resolution, and scripts could load asynchronously. However, the browser still went off and made an individual request for every dependency, and every dependency's dependency, and so on: many, many little server requests.

But even worse is the issue that RequireJS only works for AMD syntax, not for CommonJS/Node syntax. So many Node modules were simply unavailable to frontend developers — or modules had to be written with two different versions and a big if statement to detect the environment (Node or browser) they were being used in and resolve to the right version accordingly.
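One common shape for that kind of environment check later became known as the Universal Module Definition (UMD) pattern. The sketch below follows that general shape; the Calculator global and the module contents are hypothetical:

```javascript
(function (root, factory) {
  if (typeof define === 'function' && define.amd) {
    // An AMD loader such as RequireJS is present.
    define([], factory);
  } else if (typeof module === 'object' && module.exports) {
    // A CommonJS environment such as Node.
    module.exports = factory();
  } else {
    // No module system: fall back to a single global (hypothetical name).
    root.Calculator = factory();
  }
})(typeof globalThis !== 'undefined' ? globalThis : this, function () {
  function add(a, b) {
    return a + b;
  }

  return {
    calculate: function (values) {
      return values.reduce(add, 0);
    }
  };
});
```

The same file can then be loaded by RequireJS, required from Node, or dropped into a plain script tag, at the cost of this boilerplate at the top of every library.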

And in any case, a library that is written for a server runtime will likely not work in a browser because it expects a different runtime environment. For example, global objects like process (and with it process.env) are not accessible inside a browser.

But don't despair. In the early 2010s, Browserify introduced bundling software that solved both of these problems. It enabled website developers to write scripts using CommonJS require/module.exports syntax, using whatever Node modules they needed. When you run a build command before deploying, Browserify takes all those require statements, crawls through all the dependency paths (including the downloaded Node modules), and builds a single bundled file: one big file in which every module is wrapped in a function and wired together by a small loader, so the browser can run it as plain old JS without understanding CommonJS module syntax natively. It also takes many of Node's built-ins (including the process object mentioned above) and supplies polyfills and shims for them (i.e. replacement JS code that works in browsers), with all the cross-compatibility nightmares that that entails.

So now, instead of the browser going off and fetching dependencies as the code calls for them, it receives one bundled JavaScript file instead and loads that, which you can reference in the HTML file script element like so:

<script src="bundle.js"></script>

Now we have reached the modern JavaScript bundling world. By running a bundler program first to generate a single bundled file, we can apply optimizations such as minifying (uglifying), code splitting, and tree shaking.

Webpack and Single-Page Web Apps

Browserify laid the groundwork for the most popular bundling program of all, which I believe you may have heard of: webpack. It was first released in 2012, and its adoption across the React ecosystem helped it swiftly supersede Browserify.

Webpack goes even further than Browserify: it creates a dependency graph for *all* of the assets in a website (JS, CSS, images, SVGs, and even HTML). In other words, it maps all the references from your files to other files (whether they are JavaScript files, CSS, HTML, and so on), from which it can create one or more bundled files. It enabled the creation of browser-based modules (collections of HTML, JS, CSS, and images), which can also be published to npm and used in projects that have Webpack as their bundler. Like Browserify, Webpack can parse JS modules written with CommonJS syntax, and it also understands AMD. It creates an output bundle which the index.html file uses in its single script:

<script src="bundle.js"></script>

Webpack swept through the JS and website world in part because React helped usher in the age of single-page web apps. Instead of clients making a fresh request at each different route of a website, the entire application is downloaded on the client's first request and executed client-side, using mostly JS code.

The notorious problem of single-page applications is that they have a huge initial download, meaning a long time until the website is loaded and responsive, which makes optimizing the bundle even more vital.

ES Modules and Modern Frameworks

And finally we can talk about the native module system introduced by ECMAScript 2015. Unlike all of the other module systems above, ES modules are part of the JavaScript language itself. These are the simple import/export keywords that you are probably familiar with. All modern browsers have supported the ES module system for a long while already (meaning that they can parse and execute scripts with the import/export syntax natively), and more recently Node does too.

In order to use them in the browser, you must set the type="module" attribute on your script element. The main.js entry-point file can then use the import/export syntax, and the browser will fetch all of its dependencies without blocking the parsing of the page. Only type="module" scripts can use import/export declarations; the browser reports a syntax error if they appear in a traditional script.

This may come as a surprise. In a typical modern React project, you probably won't ever need to look at the index.html file — and if you do, it may contain no script element at all, or the script element does not have the type=module attribute. Given that that’s required for the import/export syntax to work, what gives? Let’s explore the popular create-react-app setup to understand this.

create-react-app Example

create-react-app was for years the officially endorsed way to set up a React project. After running the setup command, the public/index.html file only contains <div id="root"></div> in its body, and no script element at all, yet when you build it using the build command and run it, the website works just fine. What's happening here?

Well, the npm run build script runs react-scripts build. That does a bunch of stuff — see the scripts/build.js file — but essentially it uses Webpack to build a little web app for you in a folder labeled build with an index.html file and bundled JS and CSS files. That output index.html file does contain a script with a src of the main JS bundle, and that is what is sent from the server when a client requests the website.

But it's not a type="module" script, even though the React files in the project all use the import/export syntax. That's because Webpack is able to parse and bundle all different kinds of JS module systems (native modules, CommonJS, AMD, Assets, WebAssembly). So when you run the build command, Webpack crawls your JS files (following the import/export statements), including all dependencies, and creates a few files of flat, minified code called “chunks.” These chunks do not themselves use the import/export syntax, so they can be loaded with a traditional script element. It is beyond the scope of this article to go into detail about how Webpack creates chunks, but very generally we can say that a main chunk is created for all your code and its dependencies, which is referenced in the script element.

That's the basics of how the create-react-app setup, using Webpack, handles bundling and modules. Other frameworks, like NextJS, SvelteKit, and so on, have their own mechanics for bundling, which should hopefully now be easier to understand. I’d also recommend checking out this introduction to Vite, which was created on the basis of the native module system.

Go Forth and Bundle!

This little research project into understanding bundling and modules with JavaScript and websites has been sitting on my coding to-do list for a *very* long time (I’m sure you can relate with your own list). The time and effort have already paid off.

In the project I am currently working on, we have a local Docker setup and use Webpack to bundle up our web app. To see changes locally, we had to rebuild the web Docker image, which took 10 minutes each time. Not ideal! In create-react-app, Webpack sets up a ‘devServer’ for you — in other words, it runs a little local server on your computer where you can view the web app through your browser, and detects and immediately reflects any code changes to the local files. On this project, we serve the website from a Sinatra/Puma server in a Docker container, so cannot use Webpack’s dev server. But, with my new knowledge of how bundling works, I realized I could simply run the webpack-watch process and create a Docker bind-mount on the directory for the Webpack output bundle (i.e. the Puma server serves that directory to the client), so that any local code changes are immediately re-bundled by Webpack and will appear in the local website after a refresh. From a 10 minute delay to a 3 second delay — that’s a pretty monumental improvement!

Hopefully you now have enough of a foundation to investigate how you're performing bundling with the module systems of your own projects — and maybe even to improve them.

Suggested Readings

If you’re interested in diving deeper into these topics, there were two articles in particular that helped me understand the lessons I’ve distilled in this article. SungTheCoder published a useful exploration of each implementation of module systems for websites. Nolan Lawson’s article also gives a brief history of JavaScript bundlers.

