My Chart Will Go On

How we built our in-house Visualization library at Botify

Being a SaaS business providing intelligent data to customers, Botify has always had a need for beautiful, expressive, and pertinent visualizations. The core of our business relies on providing a meaningful way for our customers to understand, analyze, and dissect the metrics we compute for them.

If they can dream it, we can build it

At Botify we believe Engineering to be at the service of the Product team: we don’t like to say “no”. If they can dream it, we can build it.

On some days, all we need is a simple pie chart representing the breakdown of HTTP status codes across crawled pages. Other days call for a more complex year-over-year, weighted, double-axis, multi-line chart measuring variations in impressions vs. clicks for a given keyword.

We wrote a JavaScript library to handle all of the heavy lifting for us, share most behaviors between charts, and provide a stable, clean, easy-to-understand API for writing charts, from the very simple to the extremely complex. How do you build a charting library in 2020?

We wanted to take some time in this article to give readers a tour of our Visualization library, but more importantly to share the history and choices that brought us to this implementation. We started writing JavaScript charts for the visualization of the Botify datamodel in 2012, and we are still adding new charts to our platform today in 2020. What’s changed in 8 years, what hasn’t, and how did we get here?

Where Do Broken Charts Go?

State of the Chart in 2012

When writing code in general, but especially when writing libraries, or “code for code”, one’s first implementation is rarely the right one. They say DRY (“Don’t Repeat Yourself”) but at Botify we strongly believe in WET, for “Write Everything Twice”.

We’ve often found that it’s only in the second (or third, fourth, or nth) implementation that we adopt better solutions, because we’ve learned from our mistakes. Here are a few of our mistakes, made when we built our first charts and visualizations at Botify.

Different stacks, same ideas

First, a bit of context. The Botify frontend displayed to our customers was initially built in 2012, using Chaplin, a CoffeeScript wrapper library for architecting Backbone.js applications.

Around 2014, we found React and fell in love with its ability to describe interfaces with low complexity and an expressive language. We started writing some of our Chaplin Views using React components, and started building charts in React instead of Chaplin.

No matter the framework, using Chaplin Views or React Components, the approach was always the same. We maintained clear separation between the code that fetches the data and the code that displays it.

Chaplin being an MVC framework, our charts had explicit models and views:

The model, expressing the API query, and its transformation
The view, expressing how this data should be displayed

When we started writing charts in React, the framework changed but the architecture remained. Each chart was now expressed with the following props:

buildQuery, which constructs an API query
adaptResponse, which adapts the API response to a common format
viewType, which indicates what kind of chart to render
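
As a rough illustration, a chart written in this style might have looked like the sketch below. Only the three prop names come from the list above; the query shape, the response shape, the `ChartRow` common format, and the field names are assumptions made for the example.

```typescript
// Illustrative sketch only: the three prop names come from the article,
// every type and field name below is hypothetical.
type ApiQuery = { dimensions: string[]; metrics: string[] };
type ApiResponse = { results: Array<{ key: string; count: number }> };
type ChartRow = { label: string; value: number };

interface ChartProps {
  buildQuery: () => ApiQuery;
  adaptResponse: (response: ApiResponse) => ChartRow[];
  viewType: "pie" | "column" | "line";
}

// One chart, expressed entirely as code.
const httpCodeBreakdown: ChartProps = {
  buildQuery: () => ({ dimensions: ["http_code"], metrics: ["count_urls"] }),
  adaptResponse: (response) =>
    response.results.map((row) => ({ label: row.key, value: row.count })),
  viewType: "pie",
};
```

Every new chart would bring its own variant of these three functions, which is where the trouble described in the next sections starts.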

The Good: Charts as code

In both systems, each chart is expressed as code, which has quite a few advantages:

Flexibility: When a custom behavior is required, it is only a matter of adding a few conditions or lines of code, and we’re good to go.
Low barrier to entry: Because charts are expressed as code, newcomers can immediately dive in, without having to learn a new syntax.
Debuggability: It is very easy to pause execution where necessary in order to inspect what is happening because, once again, we’re dealing with code.

The Bad: Code duplication

Although it allowed us to quickly address new needs by writing more code, it also led to a lot of duplication, too much code, and ultimately an unmaintainable system. This process wasn’t working anymore, and was frustrating our engineers.

When everything is code, there are as many ways to express a chart as there are charts.

Even if they are working on different data, and displaying it in different ways, all charts end up executing approximately the same operations:

Building the raw API query
Extracting data from the API response
Formatting it to the shape expected by Google Charts

Each chart performed all of these operations with slight variations, which made it impossible to have a single pipeline that did everything the same way everywhere.
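
The three steps can be sketched as follows. The query and response shapes are hypothetical; the only external fact relied on is that Google Charts accepts a two-dimensional array whose first row holds the column labels (the input of `arrayToDataTable`).

```typescript
// Sketch of the three steps each chart repeated. Query and response
// shapes are assumptions made for this example.
type Query = { dimensions: string[]; metrics: string[] };
type ResponseRow = { dimension: string; metric: number };
type GoogleChartsTable = Array<Array<string | number>>;

// 1. Build the raw API query.
function buildQuery(dimension: string, metric: string): Query {
  return { dimensions: [dimension], metrics: [metric] };
}

// 2. Extract data from the (hypothetical) API response.
function extractRows(response: { results: ResponseRow[] }): ResponseRow[] {
  return response.results;
}

// 3. Format it: header row first, then one row per dimension value.
function toGoogleChartsTable(query: Query, rows: ResponseRow[]): GoogleChartsTable {
  const header: Array<string | number> = [...query.dimensions, ...query.metrics];
  const body = rows.map((row): Array<string | number> => [row.dimension, row.metric]);
  return [header, ...body];
}
```

Copy these three functions into a few dozen charts, tweak each copy slightly, and the duplication problem becomes obvious.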

The Ugly: High cognitive load

In order to mitigate this duplication, we extracted the smallest units of logic into reusable functions, but that still wasn’t satisfactory, because:

You have to know about all these functions, or risk reinventing the wheel
You have to use them in the proper order/combination, or something might break (we recently introduced TypeScript in our codebase; that would have helped a lot)

Because we take code review seriously, we don’t want reviewers to spend hours wondering what makes a chart different from another, considering they look 90% alike, and work 90% the same way.

We’d rather our engineers spend their valuable review time on the critical portions of the code, improving performance or discussing architecture issues, than navigating spaghetti.

In 2016 our team started growing, and our product and datamodel required more and more complex visualizations with a lot of shared behaviors. Our visualization stack was starting to show its age, and for all of the reasons above it was getting harder and harder for us to work with it.

As a team, we felt that these pain points were too important, and with visualizations being such a big part of our applications we decided to invest time and brainpower in building a more durable system for the years to come.

Unbreak My Chart

Building a better visualization library, as a team

At Botify we think it’s important to spend time designing, in order to spend less time putting out fires once the implementation is released.

Before putting hands to the keyboard, we started by listing and learning from the mistakes of our previous implementation. We set out to build something that responded to each pain point and came up with four guiding principles. Together, they would shape our decisions on how our new Visualization system would be built:

The ownership of the system must be shared by all our engineers
It should enforce a single true way for representing charts
It shouldn’t be tied to a specific API
It should provide developers the tools to make the user experience as interactive as possible

Static definitions, not code

We all agreed that code-based charts were too permissive, and allowed the codebase to slowly drift into chaos. Instead, we opted for chart definitions, which simply consist of a static JSON file, defining:

What to request from the API
How to adapt the received data
What to draw to the screen

We strongly believe static definitions are better for many reasons:

They only allow a finite number of behaviors (i.e. the ones we explicitly added support for), which makes maintenance and review a breeze.
They make it impossible to hack a custom behavior in without thinking about it in a generic way. If a chart requires an unsupported behavior, we have no choice but to properly refactor the whole pipeline. And that’s a good thing!
They allow for having a single pipeline, which makes it easier to add a new feature to all charts at once, while keeping the changes as localized as possible. We don’t need to manually update all our charts anymore.
They can be serialized, which may prove useful if we want to add support for fully user-customized pages later on. These charts, with all of their behaviors, are serializable and could be written to a database.
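
To make the "finite number of behaviors" and "serializable" points concrete, here is a minimal sketch. The two view names are the ones used later in this article; the type names and the `isSupportedView` helper are hypothetical and not part of the library's documented API.

```typescript
// Sketch under assumptions: only the two view names appear in this
// article; the types and the validator are invented for illustration.
const SUPPORTED_VIEWS = ["PieChartView", "ColumnChartView"] as const;
type ViewType = (typeof SUPPORTED_VIEWS)[number];

interface ChartDefinition {
  query: { dimensions: string[]; metrics: string[] };
  view: { type: ViewType };
  metadata: { name: string };
}

// A definition is plain data, so anything outside the supported set can
// be rejected up front instead of being hacked in.
function isSupportedView(type: string): type is ViewType {
  return (SUPPORTED_VIEWS as readonly string[]).indexOf(type) !== -1;
}

const definition: ChartDefinition = {
  query: { dimensions: ["device"], metrics: ["count_clicks"] },
  view: { type: "PieChartView" },
  metadata: { name: "Clicks By Device" },
};

// Plain data also survives a round trip through JSON, e.g. to a database.
const restored: ChartDefinition = JSON.parse(JSON.stringify(definition));
```

Because nothing in the definition is a function, the round trip loses no behavior, which is exactly what code-based charts could not offer.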

Datasource-agnostic

At Botify, we deal with a lot of different data that comes in various shapes and forms, so coupling our system to a specific API wasn’t an option. Instead, we opted for a design where we explicitly define what kind of resource we are dealing with, which automatically swaps out the fetcher/adapter implementation.

That way, adding support for a new type of resource is a trivial task; it is only a matter of adding an entry in a constant file, a new fetcher and a new adapter.

This agnosticity has another advantage: because we don’t rely on a specific format, it forces us to think in terms of abstractions, resulting in a loosely coupled architecture.
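
One way to picture this design is a registry keyed by resource type, where each entry pairs a fetcher with the adapter that understands its raw format. The resource names, payloads, and the `Row` shape below are invented for the example, and the fetchers are synchronous stubs standing in for real API calls.

```typescript
// Hypothetical sketch: only the fetcher/adapter split comes from the
// article. Resource names, payloads and the Row shape are assumptions.
type Row = { label: string; value: number };
type Loader = (query: string) => Row[];

// Pair a fetcher with the adapter that understands its raw format.
function createLoader<Raw>(
  fetcher: (query: string) => Raw,
  adapter: (raw: Raw) => Row[]
): Loader {
  return (query) => adapter(fetcher(query));
}

// The "constant file": one entry per supported resource type.
const LOADERS = {
  crawl: createLoader(
    (_query: string) => ({ buckets: [{ key: "200", count: 12 }] }), // stubbed payload
    (raw) => raw.buckets.map((bucket) => ({ label: bucket.key, value: bucket.count }))
  ),
  keywords: createLoader(
    (_query: string): Array<[string, number]> => [["mobile", 7]], // stubbed payload
    (raw) => raw.map(([label, value]) => ({ label, value }))
  ),
};

type ResourceType = keyof typeof LOADERS;

// The rest of the pipeline only ever sees Row[], whatever the source.
function loadRows(resource: ResourceType, query: string): Row[] {
  return LOADERS[resource](query);
}
```

Supporting a new resource then means adding one entry to `LOADERS`; nothing downstream of `loadRows` has to change.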

No compromise on interactivity

Our chart definitions may be static, but our user experience is quite the opposite, which is why we needed to allow for specific modifications to key behaviors of the chart. Some charts on the Botify platform allow for interactions and we wanted to keep the same level of possibilities for our Product team while maintaining static definitions.

We often need to modify, at runtime, either programmatically or through UI interactions:

The fields that are requested from the API
The filters that are applied
The type of chart that is displayed
…and every other part of the definition if our Product team dreams it

We implemented them in the form of our aptly-named “modifiers”. They can take the form of dropdowns, checkboxes, sort buttons, etc. and can modify any relevant behavior in the visualization, from the fetching of data to the calculations or displays made with the response.

We built Visualization (great name, we know) over the past 2 years and it has grown into our key frontend framework for writing charts, tables, graphs, trees, and maps. We’ll get the chance to dive into how it’s built, but not today. Those looking for the gritty details will have to wait for our next article. Today we’d like to share how it works in production.

Kickstart My Chart

Let’s put ourselves in the shoes of a frontend engineer at Botify:

The Product team designed an interactive chart for our customers to explore their number of clicks and impressions, broken down by device, then by country.

In order to implement this chart, we will:

Express an API Query using BQL, our internal API DSL, in order to retrieve the relevant data
Display it in the appropriate format
Allow the user to dynamically select the metric they want to see (clicks, or impressions)
…and last but not least, it’d be nice if we could make it look pretty

Querying the clicks by device

As previously mentioned, our charts are expressed using static definitions. These definitions are written using JSON, a notation almost everyone knows, and that isn’t too verbose.

In order to define a chart, we need:

A query, which defines what data we want to retrieve. Here we are requesting the number of clicks (count_clicks metric), broken down by device (device dimension).
A view indicating how that data will be displayed. Here we’d like a simple Pie chart.

{
  "query": {
    "dimensions": ["device"],
    "metrics": ["count_clicks"]
  },
  "view": {
    "type": "PieChartView"
  },
  "metadata": {
    "name": "Clicks By Device"
  }
}

That’s it. In fewer than 15 lines of JSON, we are already able to show a chart, using real data, to our customers.

Obviously, most of our charts are more complicated than a simple pie chart. Also, this chart isn’t very interactive - it doesn’t allow our customers to dive deeper into their data. This is why we won’t stop here, and instead try to make it more interesting.

Pie chart, displaying a breakdown of clicks per device

Breaking down by country

The initial specifications stated that we had to present the data broken down by device, but also by country.

In order to do so, we simply have to add a second dimension to our query, and change the type of the view to ColumnChartView (pie charts don't support displaying two dimensions).

{
  "query": {
    "dimensions": ["country", "device"],
    "metrics": ["count_clicks"]
  },
  "view": {
    "type": "ColumnChartView"
  },
  "metadata": {
    "name": "Clicks By Device By Country"
  }
}

Column chart, displaying a breakdown of clicks, by country, then by device
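
The constraint mentioned above (a pie chart cannot display two dimensions) is the kind of rule a static definition makes easy to check before anything is rendered. The sketch below is hypothetical: the limits in the table are assumptions drawn from the two examples in this article, not documented limits of the library.

```typescript
// Hypothetical capability table: one dimension for pies, two for
// columns, as suggested by the examples in this article.
const MAX_DIMENSIONS: Record<string, number> = {
  PieChartView: 1,
  ColumnChartView: 2,
};

// Returns an error message, or null when the definition is acceptable.
function checkDimensions(viewType: string, dimensions: string[]): string | null {
  const max = MAX_DIMENSIONS[viewType];
  if (max === undefined) {
    return "Unknown view type: " + viewType;
  }
  if (dimensions.length > max) {
    return viewType + " supports at most " + max + " dimension(s), got " + dimensions.length;
  }
  return null;
}
```

With such a check, pairing `PieChartView` with `["country", "device"]` would be reported as an error, while the same dimensions with `ColumnChartView` would pass.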

Making it dynamic

Even though our chart is starting to look quite good, our users might want to interact with it, in order to explore their count of impressions, instead of clicks.

One of the key features of Botify’s product suite is allowing users to customize their experience, and explore the data scenarios that are the most relevant to them and their current SEO issues.

In order to let them interact with visualizations, we introduced “modifiers”, which are basically widgets that can set values in the internal state of their charts.

There are many different types of modifiers, each identified by its type key. The most common ones are the dropdowns and the checkboxes.

{
  "query": {
    "dimensions": ["country", "device"],
    "metrics": ["count_clicks"]
  },
  "view": {
    "type": "ColumnChartView"
  },
  "modifiers": [
    {
      "key": "metric",
      "type": "dropdown",
      "values": {
        "count_clicks": "Clicks",
        "count_impressions": "Impressions"
      }
    }
  ],
  "metadata": {
    "name": "Clicks By Device By Country"
  }
}

Adding this to our definition adds a dropdown with two values. The payload of the modifier simply maps a value to its label.

That way, the dropdown knows what to display, and what value to store in the internal state.

Same column chart, with a dropdown allowing users to switch between clicks and impressions
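
A minimal sketch of what "storing a value in the internal state" could look like is shown below. The modifier shape mirrors the JSON above; the helper names and the rule that a dropdown starts on its first declared value are assumptions.

```typescript
// Sketch only: the modifier shape mirrors the JSON definition; the state
// helpers and the default-value rule are hypothetical.
interface DropdownModifier {
  key: string;
  type: "dropdown";
  values: Record<string, string>; // stored value -> displayed label
}

type ModifierState = Record<string, string>;

// Initial state: each dropdown starts on its first declared value.
function initialState(modifiers: DropdownModifier[]): ModifierState {
  const state: ModifierState = {};
  for (const modifier of modifiers) {
    const [first] = Object.keys(modifier.values);
    if (first !== undefined) {
      state[modifier.key] = first;
    }
  }
  return state;
}

// Selecting an option stores its value (not its label) under the
// modifier key, and returns a new state object.
function selectValue(state: ModifierState, modifier: DropdownModifier, value: string): ModifierState {
  if (!(value in modifier.values)) {
    throw new Error("Unknown value: " + value);
  }
  return { ...state, [modifier.key]: value };
}
```

The state maps the modifier key (`metric`) to the selected value (`count_clicks` or `count_impressions`), while the labels are only used for display.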

Although we added a dropdown, it isn’t actually connected to the API query, nor can it change the title of the chart.

To connect the two, we can use a special templating syntax, very similar to JavaScript template literals.

{
  "query": {
    "dimensions": ["country", "device"],
    "metrics": ["${metric}"]
  },
  "view": {
    "type": "ColumnChartView"
  },
  "modifiers": [
    {
      "key": "metric",
      "type": "dropdown",
      "values": {
        "count_clicks": "Clicks",
        "count_impressions": "Impressions"
      }
    }
  ],
  "metadata": {
    "name": "${metric} By Device By Country"
  }
}
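
Conceptually, resolving these placeholders amounts to walking the definition and substituting values from the modifier state. The sketch below is a naive stand-in, not the library's actual templating engine (which this article does not describe): it only handles plain `${key}` lookups and substitutes the stored value, so a title would read `count_impressions By Device By Country` rather than use the dropdown label.

```typescript
// Naive placeholder substitution, for illustration only.
type TemplateState = Record<string, string>;

function resolveString(template: string, state: TemplateState): string {
  return template.replace(/\$\{(\w+)\}/g, (match: string, key: string) => {
    const value = state[key];
    return value === undefined ? match : value; // leave unknown keys untouched
  });
}

// Walk a JSON-like value and resolve every string it contains.
function resolveTemplates(value: unknown, state: TemplateState): unknown {
  if (typeof value === "string") {
    return resolveString(value, state);
  }
  if (Array.isArray(value)) {
    return value.map((item) => resolveTemplates(item, state));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = resolveTemplates(source[key], state);
    }
    return out;
  }
  return value; // numbers, booleans and null pass through unchanged
}
```

Because the definition itself stays static, the same JSON can be resolved again whenever the modifier state changes.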

Finishing touches

We are now at a point where the specifications are met, but this isn’t enough.

At Botify, in addition to presenting our customers with valuable and actionable data, we also strive to make the experience as enjoyable as possible. Let’s go the extra mile, add some colors and align columns to make everything look nice.

All our views are extensively configurable, allowing us to specify colors, labels, formatters, etc.

For now, we will only change the color palette of the chart, and stack our columns to make the chart more readable.

{
  "query": {
    "dimensions": [
      "country",
      {
        "field": "device",
        "colorPalette": "devices"
      }
    ],
    "metrics": ["${metric}"]
  },
  "view": {
    "type": "ColumnChartView",
    "stacked": true
  },
  "modifiers": [
    {
      "key": "metric",
      "type": "dropdown",
      "values": {
        "count_clicks": "Clicks",
        "count_impressions": "Impressions"
      }
    }
  ],
  "metadata": {
    "name": "${metric} By Device By Country"
  }
}

Same column chart, but with vibrant colors
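
In this last definition, a dimension is either a bare field name or an object carrying display options. One plausible way to handle both forms is to normalize them before use, as sketched below. The type names are hypothetical, and so is the assumption that only field names are sent to the API while palettes stay on the rendering side.

```typescript
// Hypothetical normalizer for the two dimension forms shown in the JSON.
type DimensionConfig = string | { field: string; colorPalette?: string };
type NormalizedDimension = { field: string; colorPalette: string | null };

function normalizeDimension(dimension: DimensionConfig): NormalizedDimension {
  if (typeof dimension === "string") {
    return { field: dimension, colorPalette: null };
  }
  return { field: dimension.field, colorPalette: dimension.colorPalette ?? null };
}

// Assumption: the API only needs the field names.
function fieldsForQuery(dimensions: DimensionConfig[]): string[] {
  return dimensions.map((dimension) => normalizeDimension(dimension).field);
}
```

Normalizing once keeps the rest of the pipeline from having to care which of the two forms the author of the definition used.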

And there you have it! We’ve built a Visualization that looks and performs great and is easily maintainable, using barely 30 lines of JSON.

Gallery: a few examples of what Viz can do!

Here are a few examples of charts powered by Visualization. They are all expressed as static JSON definitions, just like we demonstrated.

Line chart, with togglable metrics

Line chart, with previous period comparison

Simple data table

The same table, with an embedded chart allowing in-depth exploration of your data

Colored map of the world, with a summary of the top countries as a table

Metrics bar, used to get a quick glance of your main KPIs

What we haven’t built yet

Even though we are super proud of our Visualization library (which powers all our new apps), it still isn’t perfect, nor is it complete.

We have already identified a few aspects that could be improved:

Better concurrent rendering: Because all our queries are cached, visiting a fully cached page triggers the render of all charts at once, which can slow the page down quite a bit.
Better debugging: Because the whole Visualization pipeline is generic, it’s more difficult to pinpoint where issues originate. It would be quite nice to have a better debugger that lets us inspect the state of the definition before and after each step (compiler, fetcher, adapter, etc.).
Better cache handling: Our current cache implementation is pretty simple. We use a hash of the query as a cache key, which works well because 95% of our queries are run against a read-only dataset, so the same query always yields the same response. We’ve recently introduced new features that make it possible for the same query to yield a different response depending on the state of some API resource. This makes caching trickier, and we haven’t really found a proper solution to that yet.
Better authoring tools: Developer experience could be even better if we had a dedicated tool for quickly authoring a new chart, without having to start a dev server or open an IDE. It would also make it possible for non-developers to quickly create proof-of-concept visualizations.
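
For readers unfamiliar with the "hash of the query as a cache key" idea, here is a minimal sketch. The key-ordering step, the djb2-style hash, and the `cachedFetch` helper are all assumptions chosen for brevity, not the actual implementation; a production system would likely use a stronger hash to avoid collisions.

```typescript
// Serialize a JSON-like value with sorted object keys, so that two
// queries differing only in key order produce the same string.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map((item) => stableStringify(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const parts = Object.keys(record)
      .sort()
      .map((key) => JSON.stringify(key) + ":" + stableStringify(record[key]));
    return "{" + parts.join(",") + "}";
  }
  return JSON.stringify(value);
}

// Simple 32-bit djb2-style string hash (illustrative, not collision-safe).
function hashString(text: string): string {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return hash.toString(16);
}

function cacheKey(query: unknown): string {
  return hashString(stableStringify(query));
}

const queryCache: Record<string, unknown> = {};

// Run the fetcher only when no response is cached for this query.
function cachedFetch<T>(query: unknown, fetcher: () => T): T {
  const key = cacheKey(query);
  if (!(key in queryCache)) {
    queryCache[key] = fetcher();
  }
  return queryCache[key] as T;
}
```

This works as long as a query fully determines its response. It also shows why the newer, state-dependent queries are harder to cache: the key contains nothing about the state of the API resource the response depends on.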

Conclusion

We’ve only shown you the surface of what Visualization can already do today. In the past 2 years, we’ve added features, improved performance, and pushed the boundaries of what our framework can do.

We’ve built, to name a few: efficient caching for the request layer, performant rendering, inter-dependent modifiers, generic CSV exports for the charts’ data, and request batching for multiple charts on a single page.

We’ve gone very far in our quest to build a single true way to express a chart, and are extremely proud of having built a toolset that responds to our needs perfectly, solves major issues for us daily, and most importantly has a shared ownership amongst all of the JavaScript engineers at Botify.

Since its release, every single visualization built within Botify’s product suite uses this library, and our JavaScript engineers are loving the flexibility and speed with which they can implement and maintain charts, tables, graphs, and maps.

We’ll be diving deeper into some of the bricks that make up Visualization in later articles.

What would you like to know more about? Our cache system? Render pipeline? Templating language? How far we went with modifiers? Sound off in the comments below!

My Chart Will Go On - Botify Labs - Medium
