Raising availability in payments: a coaching perspective on performance

source link: https://medium.com/paypal-tech/raising-availability-in-payments-a-coaching-perspective-on-performance-923a8189469

Photo by Bruno Nascimento on Unsplash

In May 2021, PayPal’s payments platform reached a stable availability of 99.997% overall, and 99.999% for our top 12 merchants, beating our goal six months ahead of schedule. This achievement required 18 months of focus and a broad set of skills. It also speaks to the maturity of PayPal’s payments platform. Here is why it matters and what led the team to succeed, seen from the perspective of a sports coach.

Availability is paramount for a payments company

1. The law of big numbers

Customers come to PayPal to do business: buy something, pay someone, send money to a loved one. Our number one mission is to ensure our services are always available. Our objective in 2020 was to reach 99.99% (from 99.91%), and this is where we landed at the end of a very eventful year in the payments space. With PayPal processing billions of transactions per year, we want to ensure no customer is let down or has a bad experience. Our goal for 2021 was to have at least 99.995% availability.
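
To put these percentages in perspective, here is a rough back-of-the-envelope sketch. The yearly volume of one billion transactions is an illustrative round number, not a PayPal figure, and availability is treated simply as the share of transactions that succeed.

```python
# Illustrative only: TRANSACTIONS_PER_YEAR is an assumed round number,
# not a figure published by PayPal.
TRANSACTIONS_PER_YEAR = 1_000_000_000


def failed_transactions(availability_pct: float, volume: int = TRANSACTIONS_PER_YEAR) -> int:
    """Return how many transactions fail at a given availability percentage."""
    failure_rate = (100.0 - availability_pct) / 100.0
    return round(volume * failure_rate)


if __name__ == "__main__":
    # The three availability levels mentioned in the text.
    for pct in (99.91, 99.99, 99.995):
        count = failed_transactions(pct)
        print(f"{pct}% -> about {count:,} failed transactions per year")
```

Under this assumed volume, moving from 99.91% to 99.99% cuts failures from about 900,000 to about 100,000 per year, and 99.995% halves that again.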

2. Responsibility toward all our brands

As of May 2021, our payments platform not only processes our branded traffic but is also the largest payments processor for Venmo, with 100 percent of the bank traffic, and handles a large portion of Xoom traffic and a growing share of Braintree traffic. We now process close to three-fourths of payments for all PayPal subsidiaries. This number is meant to increase as we strive to show ourselves as ONE company and unify our stacks. We have a responsibility toward our family of brands.

The coaching perspective behind a team’s performance

As some in my entourage know, I am a passionate freediver and a freediving coach. Seeing how the reliability team operated across the Payments Product, Engineering, and Site Reliability organizations was humbling. Through the lens of coaching, here are three reasons why the team was able to outperform and beat both the expectations and the timeline:

  1. A well-balanced roadmap crafted like a well-balanced training season.
  2. The group played well as a team, with everyone excelling at their jobs.
  3. The 24-hour rule: get at it every day.

1. A well-balanced roadmap is like a well-balanced freediving training season

Training ahead of the competition is a long game that easily stretches to a year or more. It is important to balance the training for the performance to peak at the right time.

Balancing our platform reliability roadmap is a lot like balancing a freediver’s training season. In the image, Stephane Tourreau from Annecy dives to 107m and becomes world vice-champion. His training has three main components:

Stephane Tourreau diving at -107m in Dean’s Blue Hole, Bahamas (photo credit: Daan Verhoeven)

1. The fundamentals take up 70% of his time: lots of swimming and diving, plus a good amount of body workouts to build muscles and habits. This helps him move effortlessly.

2. Perfecting the techniques takes up another 25%: perfecting fin kicking, adjusting his posture to minimize friction and optimize hydrodynamics, and perfectly timing the equalizing steps to counter the crushing pressure from the tons of water above him.

3. Mental preparation is the critical tip of the iceberg with 5% to 10% — Mindfulness and relaxation help him find extra resources when his body is telling him he’s reached his limits.

The reliability roadmap is similarly structured:

  1. The fundamentals of base availability, in blue, are the basics done right and take up the lion’s share. Every piece of our stack (from infrastructure, to network, to third-party applications, to our own applications) needs to be healthy and in good shape.
  2. The technique of knowing our business comes next in yellow and orange. We perfected resiliency by diligently forcing resilient retries for every type of payment.
  3. The brain in stand-in is the last resort in green below. It does the math in real-time and decides how far to stretch when everything else is failing.
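
One way to reason about how such layers stack up is sketched below. It is a simplified model, not PayPal's actual calculation: it assumes that each layer only sees the failures left over by the previous one and that the layers fail independently. The function name and the recovery rates in the example are hypothetical.

```python
def layered_availability(base: float, retry_recovery: float, stand_in_recovery: float) -> float:
    """Combine three layers into one overall availability.

    All values are fractions between 0 and 1:
    base              -- share of payments that succeed on the first attempt
    retry_recovery    -- share of first-attempt failures recovered by a retry
    stand_in_recovery -- share of the remaining failures recovered by stand-in
    """
    for value in (base, retry_recovery, stand_in_recovery):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be fractions between 0 and 1")
    # Each layer shrinks the pool of failures left by the layer before it.
    remaining_failures = (1.0 - base) * (1.0 - retry_recovery) * (1.0 - stand_in_recovery)
    return 1.0 - remaining_failures


if __name__ == "__main__":
    # Hypothetical recovery rates, chosen only to illustrate the arithmetic.
    overall = layered_availability(base=0.9997, retry_recovery=0.5, stand_in_recovery=0.7)
    print(f"overall availability: {overall:.4%}")
```

The point of the model is that the upper layers multiply the remaining failure rate down, so even modest recovery rates on top of a strong base add extra nines.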

In the following image, you can see how each layer contributes to our availability:


2. Coaching a team, not a player

“I don’t coach Ronaldo on how to do a free-kick, I do not coach players on how to play soccer, I coach them on how to play with the team.” Jose Mourinho, winner of league titles in four different countries.

In retrospect, when I look at the KPI tree and the roles and responsibilities we put together a year ago, it does look like a soccer team formation. Not only did everyone excel at their roles, but the team played well together, with everyone being clear on their own objectives.

The defenders holding the fort and getting the basics right:

  • Partner availability — Occupied by the BizOps team. They actively managed our partners and elevated their availability to 99.98%.
  • Base availability — Occupied by the Payments Platform Planning team and Site Reliability Engineering team. The objective was to stabilize it at 99.97% and then raise it to 99.98%.
KPI Tree and team formation

The midfielders and second line of defense:

  • Redundancy — When a partner is down, we retry with an alternate route. The Payments Authorization and Capture team closes redundancy gaps as they are identified.
  • Resiliency — When an application or a supporting piece of our infrastructure is down, we diligently try a different path a second time. That role is occupied by the Payments Switch In-Bound team.
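
The retry-on-an-alternate-route idea behind both midfield roles can be sketched as follows. This is a minimal illustration, not PayPal's implementation: `RouteError`, `process_with_fallback`, and the two example routes are hypothetical names invented for the example.

```python
from typing import Callable, Optional, Sequence


class RouteError(Exception):
    """Raised by a route when it cannot process the payment (hypothetical)."""


def process_with_fallback(payment: dict, routes: Sequence[Callable[[dict], str]]) -> str:
    """Try each route in order and return the first successful result."""
    last_error: Optional[RouteError] = None
    for route in routes:
        try:
            return route(payment)
        except RouteError as error:
            last_error = error  # remember the failure and move on to the next route
    raise RouteError("all routes failed") from last_error


def primary_route(payment: dict) -> str:
    # Simulates a partner outage on the preferred route.
    raise RouteError("partner is down")


def alternate_route(payment: dict) -> str:
    return f"approved {payment['id']} via alternate route"


if __name__ == "__main__":
    print(process_with_fallback({"id": "p-1"}, [primary_route, alternate_route]))
```

A real payment retry would also need to be idempotent, so that a payment that actually succeeded on the first route is not charged twice on the second. The sketch leaves that out.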

Lastly, the striker. The stand-in platform scores to make the difference: it lifts our availability from four 9s to above the goal of 99.995%. The striker's squad is the stand-in product, engineering, and program management team. Stand-in requires building a parallel payments stack that takes over when the main one fails. This parallel stack shares no failure points with the main one, yet it functions in a way that is transparent to our customers and merchants.

3. The 24-hour rule: get at it every day

“Win or lose, after 24 hours you get over it and you move on to the next day.” Dawn Staley, head coach of the United States women's basketball team and Basketball Hall of Famer.

Availability is an everyday game with ups and downs. What made the team successful is that beyond the roadmap and the work planned, the team would religiously look at the numbers every single day and determine how to react to them. Every event is an opportunity for learning, finding gaps, and plugging the holes.

With these hundreds of little improvements, the foundations grew more robust. As our partners grew more reliable, our stack did too. That accumulated value is demonstrated best in our base availability jumping from 99.91% to 99.97% in a year.

