
Microtask and Macrotask: A Hands-on Approach


Coming from a C/C++ background, and after designing my own OS (just a command-line OS :-), I've learned a great deal about tasks, queues, and threads. I've learned how they create the illusion of multitasking for the user.

In this post, I'll use this knowledge to show you how JavaScript creates a similar illusion with a single thread.

When two programs run, the OS forks each into its own execution space, a virtual address space defined in the BDT. The OS can then switch between the two processes, giving each a specified slice of time: it pauses one, saves its current addresses, and resumes the previously paused program.

While learning languages like Java, we often dive into the use of threads. Threads are used to execute a piece of code outside the main executing program. Many programming languages, like the .NET languages, support this kind of parallel programming; JavaScript, however, is single-threaded.

If JavaScript is single-threaded, how do we create and run threads like we do in Java?

Simple: we use events, or we schedule a piece of code to execute at a given time. This style of asynchronicity is driven by what is generally called the event loop system in JavaScript. In this post, we'll learn how the event loop works and demonstrate its task queue execution cycle by writing our own toy JS engine.
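Here is a minimal sketch of those two styles (the 'save-btn' element id is only an illustration; the setTimeout part works the same in Node or the browser):

// 1) React to an event: the callback runs only when the event fires.
document.getElementById('save-btn').addEventListener('click', () => {
  console.log('button clicked')
})

// 2) Schedule code: the callback is queued to run on a later turn
//    of the event loop instead of running immediately.
setTimeout(() => {
  console.log('ran roughly 500 ms later')
}, 500)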

Let’s dive in.

Under the hood: Event Loop, Call Stack, and Asynchronous Code in JavaScript

Being single-threaded, JS uses the event loop to create the impression of running multiple tasks asynchronously: our JS code runs in a single thread, while the event loop lets that one thread interleave many pieces of work.
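To convince yourself there really is only one thread, try a small experiment: a 0 ms timer cannot fire while synchronous code keeps that thread busy.

// The timer callback is queued immediately, but it can only run
// once the single thread is free again.
setTimeout(() => console.log('timer fired'), 0)

const start = Date.now()
while (Date.now() - start < 2000) {
  // busy-wait for ~2 seconds, keeping the only thread occupied
}
console.log('loop done')  // printed first; 'timer fired' appears ~2 s late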

As mentioned, we attach listeners to events, so whenever an event is fired the callback attached to the event is executed. Before we move further, let’s look under the hood to understand how the JavaScript engine works.

The JS engine is composed of a Stack, a Heap, and a Task Queue.

The Stack is an array-like structure that keeps track of the currently executing functions.

function a() {}   // empty stubs so the snippet runs
function b() {}

function m() {
  a()
  b()
}
m()

We have an m function which calls two other functions, a and b, in its body. On execution, the address of the m function in memory is pushed onto the call stack. To better understand the concept of a memory address, I advise you to take some time and read about how an OS is programmed.

Before the JS engine executes a function, it stores the address of the function on the call stack. Wait a second: why would the JS engine store function addresses and parameters on the call stack?

At the lowest level there are the "registers": EAX, EBX, ECX, ESP, EIP. These are used by the CPU to temporarily store values while running our program (already loaded in memory). EAX and EBX are used for calculations, ECX is used for counting jobs (loops like a for-loop), ESP (Stack Pointer) holds the current address of the top of the stack, and EIP (Instruction Pointer) holds the address of the next instruction to be executed.

RAM                 EIP = 10
0 |     |           ESP = 21
1 |a(){}|
2 |     |             Call Stack
3 |b(){}|             14|   |
4 |     |             15|   |
5 |     |             16|   |
6 |m(){ |             17|   |
7 | a() |             18|   |
8 | b() |             19|   |
9 |}    |             20|   |
10|m()  |             21|   |

This is a rough sketch of how our program is represented in memory during execution.

We see our program loaded in memory, then our call stack, ESP, and EIP. The entry point of our program is m(), which is why EIP is 10, the address of that statement in memory. During execution, the CPU looks at EIP to know where to start.

Here, it starts at address 10 and executes the m() statement.

In assembly, this is translated to call m. Whenever a call to a function is made, execution jumps to the function in memory and continues from there. On completion of the function, the previous function, from where we jumped, must be resumed, so the return address must be saved. This is where the call stack comes to the rescue: on every function call, the current value of EIP is pushed onto the call stack. In our example, when a() is called our call stack looks like this:

RAM                 EIP = 1
  0 |     |           ESP = 20
➥1 |a(){}|
  2 |     |             Call Stack
  3 |b(){}|             14|   |
  4 |     |             15|   |
  5 |     |             16|   |
  6 |m(){ |             17|   |
  7 | a() |             18|   |
  8 | b() |             19|   |
  9 |}    |             20|   |
  10|m()  |             21| 7 |

When a returns, 7 is popped from the call stack into EIP. This tells the CPU to continue execution from address 7. :-)

Let's see why parameters are also pushed onto the call stack. While a function with parameters executes, it uses the EBP register to read values from the stack: those values are its parameters. So before a caller calls a function, it first pushes the parameters for the callee to access, followed by the EIP and ESP values.
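To make this concrete, here is a rough simulation of the call/return mechanism in JS; the numeric "addresses" and the callStack array are made up for illustration, since a real CPU does this with ESP/EIP and actual memory.

// A toy model of the mechanism described above.
const callStack = []

function callFunction(returnAddress, params, fn) {
  params.forEach(p => callStack.push(p))  // caller pushes parameters first
  callStack.push(returnAddress)           // then the saved EIP (return address)
  fn()                                    // "jump" into the callee
  const resumeAt = callStack.pop()        // pop the return address on return
  params.forEach(() => callStack.pop())   // clean up the parameters
  console.log('resume execution at address', resumeAt)
}

function a() {
  console.log('inside a(), call stack:', callStack)
}

callFunction(7, ['param1'], a)
// inside a(), call stack: [ 'param1', 7 ]
// resume execution at address 7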

The Heap: objects are allocated on the heap. Unlike the stack, the heap is not laid out in an orderly fashion. Objects are created using the new keyword.

class Animal { constructor(name, temperament) { this.name = name; this.temperament = temperament } }
const lion = new Animal('lion', 'very_aggressive')

This creates an Animal object on the heap and returns its address to the lion variable. Because the heap is not orderly in nature, the runtime must implement a memory manager to prevent holes from forming in the heap.
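How that manager works is outside JS itself, but a toy first-fit free list gives the idea; this sketch is purely illustrative and is not how V8 actually manages its heap.

// The "heap" is tracked as a list of free blocks; allocations reuse
// free space so the heap doesn't fill up with unusable holes.
const freeList = [{ start: 0, size: 64 }]

function alloc(size) {
  const i = freeList.findIndex(block => block.size >= size)
  if (i === -1) throw new Error('out of memory')
  const block = freeList[i]
  const address = block.start
  block.start += size                    // shrink the free block
  block.size -= size
  if (block.size === 0) freeList.splice(i, 1)
  return address
}

function free(start, size) {
  freeList.push({ start, size })         // a real manager would also merge neighbours
}

const lionAddress = alloc(16)
console.log('lion allocated at', lionAddress, '| free list:', freeList)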

The Task Queue: tasks to be handled by the engine at a later time are enqueued here.

The Event Loop is a constantly running process that checks whether the call stack is empty and, if so, proceeds to execute all the callbacks enqueued in the task queue.

So we have seen that to achieve asynchrony in JS, events are used. In the next section, we will dive deep into the Task Queue to see what happens.


Microtask and Macrotask

We saw in the last section how the JS engine works. Coming to the task queue, we learned that it's where callbacks are enqueued and later executed once the main thread is done with its current work.

But deep down in the task queue, something else is going on: tasks are broken down further into microtasks and macrotasks.

On one cycle of the event loop:

while (eventLoop.waitForTask()) {
  eventLoop.processNextTask()
}

Exactly one macrotask is processed from the queue (a task queue is a macrotask queue). After this has finished, all the microtasks enqueued in the microtask queue are processed within the same cycle. These microtasks can enqueue other microtasks, which will be run until they are all exhausted.

while (eventLoop.waitForTask()) {
  const taskQueue = eventLoop.selectTaskQueue()
  if (taskQueue.hasNextTask()) {
    taskQueue.processNextTask()
  }
  const microtaskQueue = eventLoop.microTaskQueue
  while (microtaskQueue.hasNextMicrotask()) {
    microtaskQueue.processNextMicrotask()
  }
}

Draining the microtask queue might take a long time before the next macrotask is run, which can lead to an unresponsive UI or an idling application.
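You can observe this starvation directly: a microtask that keeps queueing more microtasks never lets the event loop reach the next macrotask, while the macrotask version yields after every step.

// Each microtask queues another microtask, so the microtask queue never
// drains and the timer macrotask below never gets a turn.
function starveWithMicrotasks() {
  Promise.resolve().then(starveWithMicrotasks)
}

// The macrotask version yields back to the event loop after every step,
// so timers, I/O and (in the browser) rendering still get their turn.
function loopWithMacrotasks() {
  setTimeout(loopWithMacrotasks, 0)
}

setTimeout(() => console.log('this only prints if the microtask queue drains'), 0)

starveWithMicrotasks()   // the timer above never fires; swap in
// loopWithMacrotasks()  // this version instead and it fires right away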

To demonstrate that microtasks are run before any macrotask, let’s look at this example:

// example.js
console.log('script start');
setTimeout(function() {
  console.log('setTimeout');
}, 0);
Promise.resolve().then(function() {
  console.log('promise1');
}).then(function() {
  console.log('promise2');
});
console.log('script end');

If we run this, we get:

script start
script end
promise1
promise2
setTimeout

Note: macrotasks are enqueued by setTimeout, setInterval, setImmediate, etc.; microtasks by process.nextTick, Promises, MutationObserver, etc.
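If you want to check a few of those sources in Node, a quick experiment (Node-specific, since setImmediate and process.nextTick are Node APIs) shows both microtask sources beating the macrotask:

setImmediate(() => console.log('setImmediate (macrotask)'))
Promise.resolve().then(() => console.log('promise (microtask)'))
process.nextTick(() => console.log('nextTick (microtask, runs first in Node)'))
// nextTick (microtask, runs first in Node)
// promise (microtask)
// setImmediate (macrotask)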

OK, looking at the output we see that script start is logged, followed by script end, promise1, promise2 and setTimeout. Though setTimeout has a delay of 0 ms, it is logged last. Why?

As mentioned, in one event loop cycle a macrotask is processed and then the entire microtask queue is drained.

You may argue that setTimeout should be logged first, because a macrotask is run before the microtask queue is cleared, and, looking at the script, there is no macrotask enqueued before the setTimeout call.

Well, you are right. But no code runs in JS unless an event has occurred, and that event is queued as a macrotask.

When any JS file is executed, the JS engine wraps its contents in a function and associates that function with an event, something like start or launch. The JS engine then emits the start event, and the event is added to the task queue (as a macrotask).

On initialization, the JS engine pulls the first task off the macrotask queue and executes its callback handler. Thus, our code is run.

1) takes the contents of the input file, 2) wraps it in a function, 3) associates that function as an event handler that is associated with the "start" or "launch" event of the program; 4) performs other initialization, 5) emits the program start event; 6) the event gets added to the event queue; 7) the Javascript engine pulls that event off the queue and executes the registered handler, and then (finally) 8) our program runs! — "Asynchronous Programming in Javascript CSCI 5828: Foundations of Software Engineering Lectures 18–10/20/2016" by Kenneth M. Anderson

So we see that running the script is the first macrotask queued, and its callback runs our code. Following through, script start is printed by the first console.log call. Next, the setTimeout function is called, which queues a macrotask with its handler. Then the Promise call queues a microtask, and the final console.log prints script end. The initial callback then exits.

Since that macrotask has finished, the microtasks are processed. The Promise callback runs and logs promise1; when it returns, its then() queues another microtask. That one is processed too (remember, microtasks can queue extra microtasks within one cycle, yet all of them are processed before control yields to the next macrotask) and prints promise2. No other microtasks are queued, so the microtask queue is empty. The initial macrotask is done, leaving only the macrotask queued by setTimeout.

At this point, UI rendering is run (if any). The next macrotask to be processed is the setTimeout macrotask: it logs setTimeout and is cleared from the queue. As there are no more tasks and the stack is also empty, the JS engine yields.

Following in Jake Archibald's footsteps, let's simulate the event loop process. I modeled the macrotask/microtask cycle in JS code.

// js_engine.js
1.➥  let macrotask = []    // the macrotask (task) queue
2.➥  let microtask = []    // the microtask queue
3.➥  let js_stack = []     // the statements of "our JS file"

// microtask-emitting function
4.➥  function setMicro(fn) {
        microtask.push(fn)
      }

// macrotask-emitting function
5.➥  function setMacro(fn) {
        macrotask.push(fn)
      }

// emulates the engine's global "start" event (a macrotask)
6.➥  function runScript(fn) {
        macrotask.push(fn)
      }

// hook setTimeout: its callback is just another macrotask here
7.➥  global.setTimeout = function setTimeout(fn, milli) {
        macrotask.push(fn)
      }

// runs "our script": evaluates every statement pushed onto js_stack
8.➥  function runScriptHandler() {
  8I.➥  for (var index = 0; index < js_stack.length; index++) {
  8II.➥    eval(js_stack[index])
        }
      }

// start the script execution
9.➥  runScript(runScriptHandler)

// the event loop: process macrotasks one by one
10.➥ for (let ii = 0; ii < macrotask.length; ii++) {
11.➥   eval(macrotask[ii])()
        if (microtask.length != 0) {
          // drain the whole microtask queue before the next macrotask
12.➥     for (let __i = 0; __i < microtask.length; __i++) {
            eval(microtask[__i])()
          }
          // empty the microtask queue
          microtask = []
        }
      }

First, we set up the macrotask (1.) and microtask (2.) queues. Whenever a macrotask-emitting function like setTimeout is called, its callback is pushed into the macrotask queue (1.); likewise (2.) when microtask-emitting functions are called.

The js_stack (3.) holds the functions/statements we are going to execute; in essence, it holds the code of our JS file. To execute them, we loop over the stack and run each statement using the eval function.

Next, we set up the macrotask/microtask-emitting functions: setMicro (4.), setMacro (5.), runScript (6.) and setTimeout (7.). Each of these takes a callback fn as a parameter and pushes fn onto the corresponding macrotask or microtask queue.

We listed examples of microtask and macrotask sources earlier; those functions effectively set up a microtask or macrotask when called. In our case, we simply push the callback fn onto the corresponding queue. setMicro is a microtask function, so its callback is pushed onto the microtask queue. We also hooked into setTimeout and redefined it, so when it is called in our code, our hook runs instead.

Since setTimeout is a macrotask function, we push its callback onto the macrotask queue. setMacro is a macrotask function, so its callback is registered in the macrotask queue as well. Then there is the runScript function, which emulates the global "start" event of the JS engine during initialization. Since that global event is a macrotask, we push the fn callback onto the macrotask queue. The runScript fn parameter (8.) encapsulates the code in the js_stack (i.e. the code in our JS file), so when run, the fn callback bootstraps the code in the js_stack.

First, we execute the runScript function which, as we have learned, registers the handler that runs the entire code in the js_stack. As stated earlier, once the call stack is cleared and empty, the task queue (macrotasks) is run (10.). For each cycle of macrotask execution (11.), all the microtask callbacks are processed (12.).

We for-loop through the macrotask array and execute the function at the current index. Still inside that iteration, we for-loop through the microtask array and execute everything in it. Some microtasks can enqueue more microtasks; the inner loop cycles through them until they are all exhausted. Then it empties the microtask array, and the next macrotask is processed.

To see this in action, let's say we want to run this JS code:

console.log('start')
console.log(`Hi, I'm running in a custom JS engine`)
console.log('end')

We pick each statement and push it as a string onto the js_stack array:

...
// your script here
js_stack.push(`console.log('start')`)
js_stack.push("console.log(`Hi, I'm running in a custom JS engine`)")
js_stack.push(`console.log('end')`)
...

You see, the js_stack is like the code in our JS file: the JS engine reads it off and executes each statement. That's what we did in the runScriptHandler (8.) function. We for-loop (8I.) through the js_stack and execute each statement (8II.) using the eval function.

If we run the program with node js_engine.js, we see this:

start
Hi, I'm running in a custom JS engine
end

OK, we are getting somewhere. Now, let's test the example (example.js) we used earlier to demonstrate microtasks and macrotasks, but with a little modification:

console.log('script start');
setTimeout(function() {
  console.log('setTimeout');
}, 0);
setMicro(()=> {
  console.log('micro1')
  setMicro(()=> {
    console.log('micro2')
  })
})
console.log('script end');

We removed the Promises and replaced them with our setMicro function. The behaviour is the same: both enqueue a microtask. When the micro1 callback is executed from the microtask queue, it logs micro1 and enqueues another microtask, micro2, just like the chained then() call in example.js did.

So we will expect this:

script start
script end
micro1
micro2
setTimeout

To run it in our custom JS engine, we translate:

// js_engine.js
...
js_stack.push(`console.log('script start');`)
js_stack.push(`setTimeout(function() {
  console.log('setTimeout');
}, 0);`)
js_stack.push(`setMicro(()=> {
  console.log('micro1')
  setMicro(()=> {
    console.log('micro2')
  })
})`)
js_stack.push(`console.log('script end');`)
...

Then, running node js_engine.js , we get:

$ node js_engine
script start
script end
micro1
micro2
setTimeout

Same order as on a real JS engine. So we see that our custom JS engine correctly emulates the real one.
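As a sanity check, you can reproduce the modified example in plain Node by swapping our setMicro for queueMicrotask (a real microtask source); the file name verify.js is just an example:

// verify.js: the same example, using Node's real microtask machinery
console.log('script start');
setTimeout(function() {
  console.log('setTimeout');
}, 0);
queueMicrotask(() => {
  console.log('micro1');
  queueMicrotask(() => {
    console.log('micro2');
  });
});
console.log('script end');
// node verify.js prints: script start, script end, micro1, micro2, setTimeout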

runScript registered our code as a macrotask, and the event loop's first macrotask callback runs that code: script start is logged, setTimeout queues a macrotask, and setMicro queues the micro1 microtask. script end is logged at the end of the script. After each macrotask execution, all microtasks in the microtask queue are processed: the micro1 callback runs, logging micro1 and setting another microtask, micro2. When micro1 exits, the micro2 microtask runs and logs micro2. After that, no other microtasks are queued, so the next macrotask runs: the setTimeout callback logs setTimeout. As there are no more macrotasks enqueued, the for-loop exits and our custom JS engine yields.

The important points are:

  • Tasks are taken from the Task Queue.
  • A task taken from the Task Queue is a macrotask, not a microtask.
  • Microtasks are processed when the current task ends; the microtask queue is cleared before the next macrotask cycle.
  • Microtasks can enqueue other microtasks, and all of them are executed before the next task in line.
  • UI rendering is run after all microtasks have executed.

Conclusion

In this article we simulated the Task Queue of the JS engine and saw how tasks in the queue are processed. We also learned that there is more to the Task Queue: microtasks and macrotasks. All queued microtasks are processed in one fell swoop within a single macrotask execution cycle.

You are free to play around with the custom JS engine we built, to learn and understand how things are really done beneath a real JS engine.

Feel free to comment with any questions! You can also contact me through the following links: Email, Twitter, Facebook, Medium, LinkedIn.

Thanks !!!
