How My Grofers Internship Came To Be

 3 years ago
source link: https://lambda.grofers.com/how-my-grofers-internship-came-to-be-4d8334221b9a

And The Many Things That I Learnt From It

Design by Asif Jamal

In May 2019, I started my internship at Grofers — India’s largest online grocery retailer.

I was joining the infrastructure team that has some of India’s smartest and most capable engineers working on problems of scale.

Me? It was my first time working in a real, professional environment.

I was eager. I was scared. But most importantly, I was uncertain. Where would all this lead? What would come out of it?

But before I dive into what the internship was about, let’s rewind a little.

I had worked hard on my internship assignment, which was to build a simple key-value store server and a client to consume its API.

With a lot of excitement, I submitted my application.

And then I waited, only to realize much later that there was a random bug in my college’s email server. While that isn’t doomsday, it did seem like one for me because my email got stuck in a “queued” state.

I tried a few more times, hoping that the queue would clear. I even tried submitting the assignment from a different account, but that failed too.

So here I was. The assignment for my internship was stuck in a queued state, and I couldn’t do anything about it.

But as with all the problems in the world, there is always a solution. I discovered that I had misspelled the company name in the subject line: a tiny “bug” for all the anxiety it had caused me.

Of course, my college fixed the mail server, but because I had been trying to get my message through, the recruiters’ inbox was bombarded with all my queued messages.

Good start to the internship (or not), I thought.

But thankfully, it wasn’t a deal-breaker for Grofers. After an engaging interview, my application was shortlisted.

I was excited to bid farewell, albeit temporarily, to the monotony of college work and be a part of a fast-paced and challenging environment.

Besides the usual first-job-nervousness, I was going to be working with people who worked with systems I was barely familiar with.

The day I joined, I went through the orientation / on-boarding process — some parts of which weren’t applicable to me. The same day I got my new laptop and formally started my journey as an Infrastructure intern at Grofers.

Right off the bat, the guys in my team were incredibly helpful and understanding. They were also patient with my fumbling and inexperienced self.

I was given a couple of small projects to allow me to get acquainted with the codebase and the tech stack.

I was given the task to make a small deployment to get used to Kubernetes and to make a simple program to tail messages from Kafka — a distributed message queue.

My first couple of weeks were a bit slow. I had to deal with quite a few access issues, which were compounded by the fact that I was in the Infrastructure team.

Since I was an intern on the infrastructure team, there needed to be some restrictions on the potential damage I could do.

I often needed access to the servers but since I was an intern and was relatively inexperienced, it took some time.

But soon, it was all dealt with and I got my actual internship project: to create something to automatically load test Nyala.

Experimenting with Nyala at Grofers

What is Nyala?

Nyala is an asynchronous messaging system built on top of Kafka and is used for inter-service communication.

Its goal is to provide a reliable way to send messages between services, using mechanisms that ensure ordering, retries, and back-offs.
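
To make the idea of retries and back-offs concrete, here is a minimal Python sketch. It is not Nyala’s actual code: the function names, the default delays, and the `send` callable are all assumptions made for illustration. It retries one message with exponentially growing waits and does not move on until that message is delivered, which is one simple way to keep ordering for a single sender.

```python
import time
from typing import Callable, Iterator


def backoff_delays(base: float, factor: float, max_delay: float,
                   attempts: int) -> Iterator[float]:
    """Yield one wait per attempt: base, base*factor, ... capped at max_delay."""
    delay = base
    for _ in range(attempts):
        yield min(delay, max_delay)
        delay *= factor


def send_with_retries(send: Callable[[str], None], message: str,
                      attempts: int = 5, base: float = 0.1,
                      factor: float = 2.0, max_delay: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Try to deliver one message and return the number of attempts used.

    `send` is a hypothetical transport function that raises on failure.
    Retrying in place means a later message is never sent before an
    earlier one succeeds.
    """
    last_error = None
    delays = backoff_delays(base, factor, max_delay, attempts)
    for attempt, delay in enumerate(delays, start=1):
        try:
            send(message)
            return attempt
        except Exception as error:  # assumed: any exception means "retry"
            last_error = error
            if attempt < attempts:
                sleep(delay)  # wait longer after each failure
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error
```

Passing `sleep` as a parameter keeps the sketch easy to test without real waiting. A production system would typically also add jitter to the delays and park messages that keep failing, but those details are outside what this post describes.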

Any words I muster to describe Nyala, the legends of which echo through the walls of the Infrastructure team, will be inadequate. When I first started reading about its architecture and how it works, I was baffled.

To me, it seemed to needlessly complicate what I felt was a task that could be done more simply.

After some minutes of me firing questions at one of the guys in my team, I was jokingly told:

“You know, everyone who joins the team has the same problems you do, as soon as you stop questioning Nyala and accept it into your heart and soul, everything becomes a lot easier for you.”

In hindsight, he was at least partially right. As I started diving deeper into it, ignoring the sometimes over-engineered code, a lot of the design decisions began making sense.

The Nyala of today is the product of 4 years of labor from different engineers working to improve a flawed but functional codebase.

The fact is that despite any problems it might have, it plays a critical role in the workings of the Grofers technology architecture, and it usually does its job with few hiccups.

It is, therefore, essential to ascertain the limits of the stress it can handle to make sure it continues working smoothly, particularly in times like the GOBD, the bi-annual Grofers sale, which sees a massive spike in traffic and makes a batman curve.

To catch up, I began reading up more about load testing and the different tools which are usually used for it today.

It turned out that most of the conventional load testing tools used today, like Locust or Gatling, were unsuited for the task since they were primarily used to test synchronous HTTP services.

They aim to test high-load scenarios by simulating actual user flows and the requests that would result from them. We had no need to emulate user flows, since all a user of Nyala sees is a single database or endpoint through which messages are transmitted.

What we needed was monitoring of each individual Nyala component. As it happened, a lot of the work there was done already.

Nyala has a monitoring system in place: performance statistics are logged and sent to a daemon program, “statsd”, which collects and aggregates them and sends them on to a time-series database, “InfluxDB”.

The statistics could then be easily monitored by creating a dashboard in Grafana, a tool for monitoring and metric analysis.

My job for each load test was as follows:

  • Spawn a new Nyala environment in Kubernetes
  • Configure the setup as needed
  • Connect it to statsd and make sure the metrics it sends are unique and don’t get mixed up with the other existing Nyala setups
  • Send the load and watch statsd do the rest of the job.
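
The last two steps can be sketched in Python as below. This is an illustration under stated assumptions, not the tooling I actually built: the prefix `nyala.loadtest-42`, the metric names, and the `send` callable are hypothetical. What is standard is the statsd line format, `<bucket>:<value>|<type>`, sent over UDP (port 8125 by default), with `c` for counters and `ms` for timers. Giving each environment its own prefix is what keeps its metrics from mixing with those of other setups.

```python
import socket
import time
from typing import Callable


def statsd_line(prefix: str, name: str, value: float, kind: str) -> str:
    """Format one metric in the statsd line protocol: <bucket>:<value>|<type>."""
    return f"{prefix}.{name}:{value}|{kind}"


class MetricsClient:
    """Fire-and-forget UDP sender that namespaces every metric with one prefix."""

    def __init__(self, prefix: str, host: str = "127.0.0.1", port: int = 8125):
        self.prefix = prefix  # assumed: unique per load-test environment
        self.address = (host, port)
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def emit(self, name: str, value: float, kind: str) -> None:
        line = statsd_line(self.prefix, name, value, kind)
        self.sock.sendto(line.encode("ascii"), self.address)


def run_load(send: Callable[[str], None],
             emit: Callable[[str, float, str], None],
             count: int) -> None:
    """Send `count` messages, reporting a counter and a timer for each one."""
    for i in range(count):
        started = time.perf_counter()
        send(f"message-{i}")  # hypothetical publish call
        elapsed_ms = (time.perf_counter() - started) * 1000
        emit("publish.count", 1, "c")
        emit("publish.latency", round(elapsed_ms, 3), "ms")
```

A run would then look something like `run_load(publish, MetricsClient("nyala.loadtest-42").emit, 10000)`, where `publish` stands in for whatever function pushes a message into the setup under test.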

As I was finishing up the project, I was given another task, to see if we could migrate the version of Kafka we were using for Nyala to a newer one.

For that, I had to stream each message to two different Kafka versions: the older one in production and the new one we planned to migrate to. I then made sure that there would be no changes to the behavior of Nyala if we were to migrate.
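
The dual-streaming idea can be sketched in Python as below. The names are hypothetical and the two publish functions stand in for real Kafka producers, so treat this as a rough outline of the approach rather than the code that ran at Grofers. Every message goes to the old cluster first, and a failure on the new cluster is recorded instead of being allowed to disturb the production path.

```python
from typing import Callable, Iterable, List, Tuple

# Assumed shape of a producer call: (topic, payload) -> None, raising on failure.
Publish = Callable[[str, bytes], None]


def mirror(messages: Iterable[Tuple[str, bytes]],
           publish_old: Publish,
           publish_new: Publish) -> List[Tuple[str, bytes, str]]:
    """Send every message to both clusters and return failures on the new one."""
    failures = []
    for topic, payload in messages:
        publish_old(topic, payload)  # production path: errors propagate as usual
        try:
            publish_new(topic, payload)  # candidate cluster: errors are recorded
        except Exception as error:
            failures.append((topic, payload, repr(error)))
    return failures
```

An empty list of failures, together with both clusters having received the same messages in the same order, is the kind of evidence that suggests a migration would not change behavior.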

Unfortunately, when I ran a load test on Nyala setups with each of the different Kafka versions, the performance boost we received was minimal.

However, with the ability to quickly run load tests in isolated environments, we can now test any optimizations we make to Nyala code.

Isn’t that cool?

Conclusion

I wished to avoid the usual platitudes while writing this, but this genuinely was an amazing and rewarding journey.

At Grofers, I got the opportunity to be surrounded by and learn from people smarter and more experienced than me.

I’m incredibly grateful to have had this opportunity, and I’m sure I’ll reminisce about my time here as one of fun and excitement.

