
Trainline’s journey to speed up the customer experience: getting set up

source link: https://engineering.thetrainline.com/trainlines-journey-to-speed-up-the-customer-experience-getting-set-up-fa6392a1bac8

The Trainline booking flow is a Node.js/React web app which allows our customers to search for and book train and coach tickets. Over time, we’ve tested and launched many new features, such as more advanced real-time information, logged-in customer profiles and tools to help our customers find the best price, but this has increased the size of the application code we ship to our customers.

During the last 18–24 months, we’ve made concerted efforts to improve our web performance in particular areas, largely our landing pages and booking flow web apps. This blog post starts to take you through that work, beginning with how we set ourselves up to do it; the improvements we made will follow in a later post.

Measure

Before we even got started on improvements, we needed a way to prove that the work we were about to do would positively impact the user experience, which meant measuring where we were at that point.

If you were starting from scratch today, Google Web Vitals would be a good first set of metrics to track: each one provides a signal and, taken together, they represent a good experience on the web.

In our case, web vitals weren’t yet fully baked when we started on this journey, so we used a number of existing metrics to build up a view of our web performance at Trainline. These were:

  • time to first byte (TTFB)
  • first contentful paint (FCP)
  • largest contentful paint (LCP), a web vital, and
  • first CPU idle (FCI)
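To give a concrete picture, here is a minimal sketch of how the first three of those metrics can be captured in the browser with standard Performance APIs: TTFB from the Navigation Timing API, and FCP and LCP via PerformanceObserver. FCI was a lab metric reported by tooling such as Lighthouse, so it isn’t shown. This is illustrative only, not our production instrumentation, and the `report` callback is a stand-in for whatever analytics pipeline you use.

```javascript
// Illustrative sketch only: capture TTFB, FCP and LCP with standard
// browser Performance APIs and hand them to a `report` callback.
function collectCoreMetrics(report) {
  // TTFB: time to first byte of the main document, from the Navigation Timing API
  const [nav] = performance.getEntriesByType('navigation');
  if (nav) {
    report('TTFB', nav.responseStart);
  }

  // FCP: first contentful paint, from the paint timing entries
  new PerformanceObserver((list) => {
    const fcp = list.getEntriesByName('first-contentful-paint')[0];
    if (fcp) {
      report('FCP', fcp.startTime);
    }
  }).observe({ type: 'paint', buffered: true });

  // LCP: the candidate keeps updating as larger elements paint, so keep the
  // latest value and report it once the page is backgrounded.
  let lcp = 0;
  let lcpReported = false;
  new PerformanceObserver((list) => {
    const entries = list.getEntries();
    lcp = entries[entries.length - 1].startTime;
  }).observe({ type: 'largest-contentful-paint', buffered: true });

  document.addEventListener('visibilitychange', () => {
    if (!lcpReported && document.visibilityState === 'hidden' && lcp > 0) {
      lcpReported = true;
      report('LCP', lcp);
    }
  });
}

// `report` is a placeholder; in practice you would send the values to your RUM/APM tooling.
collectCoreMetrics((name, value) => console.log(name, Math.round(value)));
```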

Our measurement MVP

  • Record as much data as possible — metrics and waterfalls, continuously over time;
  • With as long a data retention period as possible;
  • Correlate this data with website releases.

You can, of course, build your own tooling to push this data into your existing application performance monitoring (APM) vendor, given that many of these metrics are available through browser Performance APIs such as the Navigation Timing API and PerformanceObserver. However, metrics are only part of the story: of equal importance was how those metrics landed in the waterfall, so we could investigate opportunities and diagnose issues. So, we integrated with SpeedCurve, which provided most of this out of the box, and by adding speedcurve-cli to our web app build pipeline, specifically `speedcurve deploy`, we could correlate the data with releases too.
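If you do go the roll-your-own route, the glue code can be small. Below is a hypothetical `report` implementation that beacons each metric to an APM ingest endpoint together with a release identifier, so the data can be correlated with releases; the endpoint path and the injected `__RELEASE_ID__` global are assumptions for illustration, not part of our stack or of SpeedCurve.

```javascript
// Hypothetical example: beacon each metric to an APM ingest endpoint,
// tagged with the currently deployed release so it can be correlated
// with releases. The endpoint and the release global are illustrative.
function reportToApm(name, value) {
  const payload = JSON.stringify({
    metric: name,
    value: Math.round(value),
    release: window.__RELEASE_ID__, // e.g. injected into the page at build time
    page: window.location.pathname,
  });
  // sendBeacon is used because it survives page unload better than fetch/XHR
  navigator.sendBeacon('/apm/rum-metrics', payload);
}
```

Used as the `report` callback from the earlier sketch, this gives you a bare-bones RUM pipeline; a vendor such as SpeedCurve adds the waterfalls, retention and release correlation on top.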

Synthetics or Real User Monitoring (RUM)? You’re going to need both.

Synthetic monitoring reports web app performance data from traffic generated by a test agent in a consistent ‘lab’ environment, segmented by a number of variables including region, browser and emulated device type. Real User Monitoring captures performance data from a customer’s web browser and their interactions with your web app — it is often referred to as ‘field’ data.

Synthetics will give you additional details you don’t get from RUM, such as a waterfall and profiler information that can be used to investigate further, but the tests run with simulated throttling of network and CPU, so the resulting data does not reflect real user experiences. Results from Synthetic runs are comparable with each other if you keep the conditions the same, such as the machine you run them on. Synthetic tests run in a real web browser, so the waterfall you get is the same as that experienced by real users of that browser using your web app. By being aware of and reducing variability, comparing lab data is often a good way to validate a proof-of-concept performance win.

Before we introduced SpeedCurve, the only Synthetics we were doing were on-demand runs from WebPageTest.org or Lighthouse on our own machines, with the results discarded, so we weren’t able to spot trends or do comparisons. Our only long-term data was RUM data, giving us time series performance data from real users, but not at the level of detail we needed to find possible improvements, to verify an improvement did what we expected, or to understand why a metric regressed after a release.

Synthetics is what engineers should be working with when doing performance work as it provides all the details necessary to assess website performance. Furthermore, improvements in Synthetics will generally be reflected in RUM. But RUM data is what should be reported to the business given it is a reflection of real user experiences of the web app.

Do we have enough metrics?

Once we had SpeedCurve set up, the metrics exposed a large gap in time between those that land early, such as TTFB, FCP and LCP, and those that land later, such as FCI. We wanted more granularity on what was happening in this period.

Historical Trainline performance from SpeedCurve

So, we introduced some new timing marks using the User Timing API. Some of these were defined by engineering as important timing points in the web app lifecycle, such as the time when the app downloaded all bundles or when the app booted. Others were defined in collaboration with product stakeholders as important timing points during page load, marking when the most important content on the page was first painted in the browser, such as the search form, or the search results.
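As a rough sketch of what that looks like in code (with illustrative mark names, not our real ones), the User Timing API lets you drop named marks at those points and measure between them:

```javascript
// Illustrative mark names, not our real ones.

// Engineering-defined lifecycle points:
performance.mark('bundles-downloaded'); // all app bundles have been fetched
performance.mark('app-booted');         // the client-side app has booted

// Product-defined content points, e.g. when the search results first render:
performance.mark('search-results-rendered');

// A measure gives a named duration between two marks; a mark's startTime is
// already relative to navigation start, so the mark alone gives "time to X".
performance.measure('app-boot', 'bundles-downloaded', 'app-booted');
const [resultsMark] = performance.getEntriesByName('search-results-rendered');
console.log('time to search results (ms):', Math.round(resultsMark.startTime));
```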

Historical Trainline performance from SpeedCurve, annotated with new user timing marks

User timing marks have broad browser support and are visible in all common performance tooling, such as Lighthouse, the Chrome DevTools profiler and SpeedCurve. This allowed us to run controlled experiments for a performance opportunity on our own machines, record whether there was an improvement, and if so, roll it out and see it reflected in Synthetics in SpeedCurve and, lastly, for real users.

So did we have enough to start improving?

Almost. Now that we had a regular snapshot of our web apps’ performance, what were we benchmarking against? We looked at similar experiences on the web, and at other applications in our own web stack, that we could learn from in order to provide the most performant experience possible.

Key to this was long-term consistency, which meant ensuring that benchmarks were run on the same simulated hardware over and over again, something SpeedCurve really helped us with. Our mobile web experience performed in a way we wanted all our customers to enjoy, and as it was an internal application, we could add the same user timing marks to do a direct comparison with our main booking flow. We adopted these metrics as performance targets.

Then we could start. We began to compare the two web apps on aspects like the length and depth of the waterfall, the overall page weight (of JavaScript, HTML and other assets) and the number of requests, building a backlog of performance improvements so that ultimately all of our customers could benefit, not just mobile customers.
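As one illustration of this kind of comparison, the Resource Timing API gives you the request count and transferred bytes for a loaded page straight from the browser console, which is a quick way to sanity-check the numbers a tool like SpeedCurve reports. This is a rough sketch, not how we built our backlog; note that `transferSize` is only populated for same-origin resources or those served with a Timing-Allow-Origin header, so treat the totals as indicative.

```javascript
// Rough, console-friendly comparison of page weight and request count
// using the Resource Timing API. transferSize can be 0 for cross-origin
// resources without Timing-Allow-Origin, so totals are indicative only.
const resources = performance.getEntriesByType('resource');

const totalRequests = resources.length + 1; // +1 for the HTML document itself
const totalBytes = resources.reduce((sum, r) => sum + (r.transferSize || 0), 0);
const jsBytes = resources
  .filter((r) => r.initiatorType === 'script')
  .reduce((sum, r) => sum + (r.transferSize || 0), 0);

console.table({
  requests: totalRequests,
  'total KB': Math.round(totalBytes / 1024),
  'JS KB': Math.round(jsBytes / 1024),
});
```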

In summary, we had the luxury of another web app that we could use as a benchmark and to set performance goals against. In the absence of this, you might consider benchmarking against your competition and using that as a goal, or setting a target improvement, such as a percentage reduction in a performance metric.

Check back in the future for the second part of this series where we’ll introduce some of the improvements we made and the impact this had on our performance metrics.

Check out Luca’s blog for 5 mistakes to avoid in optimising your web app performance.

