Jan 21 2021

Looking at Parler specs and their architecture

time to read 7 min | 1238 words

I ran into the following tweet, which lists some of Parler’s requirements (using the upper limits specified):

  • Scylla cluster – 40 nodes with 64 cores, 512GB RAM, 14TB NVME drives for each node. For a total of 2,560 cores and 20TB RAM, 560 TB of disks.
  • PostgreSQL cluster – 100 nodes with 96 cores, 768 GB RAM and 4 TB NVME. For a total of 9,600 cores, 75 TB RAM and 400 TB of disks.
  • 400 application instances – 16 cores & 64 GB RAM.

Their internal traffic is about 6.6 GB / sec and their external traffic is about 2 GB / sec. There is a lot of interesting discussion of these numbers in the Twitter thread, but I thought that it would be interesting to see how much it would cost to build that.
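To sanity-check those aggregate figures, here is a minimal Python sketch of the multiplication behind them; the per-node specs are the ones quoted in the tweet, and the grouping into three clusters is just how I laid it out:

```python
# Per-node specs as quoted; the totals below are plain multiplication.
clusters = {
    "Scylla":     {"nodes": 40,  "cores": 64, "ram_gb": 512, "disk_tb": 14},
    "PostgreSQL": {"nodes": 100, "cores": 96, "ram_gb": 768, "disk_tb": 4},
    "App":        {"nodes": 400, "cores": 16, "ram_gb": 64,  "disk_tb": 0},
}

for name, c in clusters.items():
    cores   = c["nodes"] * c["cores"]
    ram_tb  = c["nodes"] * c["ram_gb"] / 1024
    disk_tb = c["nodes"] * c["disk_tb"]
    print(f"{name:<11} {cores:>6,} cores, {ram_tb:>5.1f} TB RAM, {disk_tb:>4,} TB disk")

# Scylla       2,560 cores,  20.0 TB RAM,  560 TB disk
# PostgreSQL   9,600 cores,  75.0 TB RAM,  400 TB disk
# App          6,400 cores,  25.0 TB RAM,    0 TB disk
```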

The 64 Cores & 512 GB RAM can be handled via Gigabyte R282-Z90, the given specs says that a single one would cost 27,000 USD. That means that the Scylla cluster alone would be about a million dollar, but I haven’t even touched on the drives. I couldn’t find a 14 TB NVMe drive in a cursory search, but a 15.36TB drive (Micron 9300 Pro 15.36TB NVMe) costs 2,500 USD per unit. That makes the cost of the hardware alone for the Scylla cluster at 1.15 million USD.

I would expect about twice that much for the PostgreSQL cluster, for what it’s worth. For the application servers, that is a lot less, with about a 4,000 USD cost per instance. That comes to another 1.6 million USD.

Total cost is roughly 5 million USD, and we aren’t talking about the other stuff (power, network, racks, etc). I’m not a hardware guy, mind! I’m probably missing a lot of stuff. At that size, you can likely get volume discounts, but I’m guessing that the stuff I’m missing would cost quite a lot as well. Call it a minimum of 7.5 million USD to set up a data center with those numbers. That does not include labor and licensing costs, I want to add.
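Here is a rough sketch of that back-of-the-envelope cost math. The unit prices are the figures quoted above, not vendor quotes, and treating each node as carrying a single NVMe drive is my own simplification:

```python
# Unit prices as estimated above (USD); one drive per node is an assumption.
scylla_nodes   = 40
scylla_cost    = scylla_nodes * (27_000 + 2_500)   # server + 15.36 TB NVMe drive

postgres_cost  = 2 * scylla_cost                   # "about twice that much"
app_cost       = 400 * 4_000                       # 400 application servers

hardware_total = scylla_cost + postgres_cost + app_cost
print(f"Scylla:   {scylla_cost / 1e6:.2f}M USD")
print(f"Postgres: {postgres_cost / 1e6:.2f}M USD")
print(f"App:      {app_cost / 1e6:.2f}M USD")
print(f"Total:    {hardware_total / 1e6:.2f}M USD")  # ~5M, before power, network, racks...
```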

Also, note that this kind of capacity is likely something you can’t get from anyone but the big cloud providers on a quick turnaround basis. I’ll estimate that it would take multiple months just to order the parts, to be honest.

In other words, they are going to be looking at a major financial commitment and some significant lead time.

Then again… Given their location in Henderson, Nevada, the average developer salary is 77,000 USD per year. That means that the personnel cost, which is typically significantly higher than any other expense, is actually not that big. As of Nov 2020, they had about 30 people working for Parler. Assuming all of them are developers paid 100,000 USD a year (significantly higher than the average salary in their location), the employment costs of the entire company would likely be under half of the cost of the hardware required.

All of that said… what we can really see here is a display of incompetence. At the time it was closed, Parler had roughly 15 – 20 million users. A lot of them were recent registrations, of course, but Parler had already experienced several surges of user registrations in the past. In June of 2020 it saw 500,000 users registering for its services within 3 days, for example.

Let’s take 20 million as the number of users, and assume that all of them are in the States and have the same hours of activity. We’ll further assume high participation and that all of those users are actively viewing. Remember the 1% rule: only a small minority of users actually generate content on most platforms. The vast majority are silent observers. That would give us roughly 200,000 users who generate content, but even then, not all content is made equal. We have posts and comments, basically, and treating them differently is a basic part of building an efficient system.

On Twitter, Katy Perry has just under 110 million followers. Let’s assume that the Parler ecosystem was highly interconnected and most of the high profile accounts would be followed by the majority of the users. That means that the top 20,000 users will be followed by all the other 20 million. The rest of the 180,000 users that actively post will likely do so in reaction, not independently, and have comparatively smaller audiences.

Now, we need to estimate how much these people will post. I looked at Dave Weigel’s account (591.7K followers); he covers politics for the Washington Post. I’m writing this on Jan 20, the day of the Biden inauguration, so I’m assuming that this is a busy time for political correspondents. Looking at his Twitter feed, he has posted 3,220 tweets this month, and Jan 6, which had a lot to report on, had 377 total tweets. Let’s take 500 as a reasonable upper bound for the number of interactions for most of the top users in the system, shall we?

That means that we have:

  • 20,000 high profile users.
  • Each posting a maximum of 500 times a day.
  • Let’s assume that this all happens in 8 hours, instead of over the entire day.
  • That translates to roughly 1,250,000 posts an hour. Expressed in posts per second, that comes to about 348 posts per second (see the sketch below).
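A minimal sketch of that arithmetic, using the assumptions above (20 million users, the 1% rule, 20,000 heavy posters at up to 500 posts a day, all inside an 8 hour window):

```python
# Back-of-the-envelope posting model from the assumptions above.
total_users      = 20_000_000
content_creators = int(total_users * 0.01)   # the 1% rule -> 200,000
heavy_posters    = 20_000                    # the high profile accounts
posts_per_day    = 500                       # upper bound taken from a busy correspondent
active_hours     = 8                         # all activity squeezed into 8 hours

daily_posts    = heavy_posters * posts_per_day   # 10,000,000
posts_per_hour = daily_posts / active_hours      # 1,250,000
posts_per_sec  = posts_per_hour / 3600           # ~348

print(f"{posts_per_hour:,.0f} posts/hour, ~{posts_per_sec:.0f} posts/second")
```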

Go and look at the specs above. Using these metrics, you could dedicate a whole application server to each one of those posts per second. Given the number of cores requested for application instances (400 x 16 = 6,400 cores), this is beyond ridiculous.

Just to give you some context, when we run benchmarks of RavenDB, we run it on a Raspberry Pi 3. That is a $25 machine, with a shady power supply and heating issues. We were able to reach over 1,000 writes / second on a sustained basis. Now, that is for simple writes, sure, but again, that is a Raspberry Pi doing three times as much as we would need to handle Parler’s expected load (which I think I overestimated).

This post is getting a bit long, but I want to point out another social network, Stack Exchange (Stack Overflow), with 1.3 billion page views per month (assuming perfect distribution, roughly 485 page views per second, each generating multiple requests).

  • Their web servers handle 450 req/sec at peak across 9 web servers (Max of 4,050 req/sec) with peak CPU usage of 12%.
  • 2 SQL Server clusters with 4 machines in total. Handling an aggregate of 23,800 queries / sec with peak CPU usage of 15%.
  • Render time across the board of < 20 ms.

The hardware that is used for those servers:

  • 9 Web – 48 cores + 64 GB RAM
  • 4 DB – 32 cores + 768 GB RAM

There are a few other types of servers there, and I recommend looking into the links, because there are a lot of interesting details there.

The key here is that they are running a top-200 site on significantly less hardware, and are able to serve requests and provide a great quality of service.

To be fair, Stack Overflow is a read-heavy site, with under half a million questions and answers in a month. In other words, less than 0.04% of the views generate a write. That said, I’m not certain that the numbers would be meaningfully different on other social media platforms.
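For reference, here is the same per-second reduction of the Stack Overflow numbers quoted above. The monthly totals are the ones from the post; assuming a 31-day month for the conversion is my own choice:

```python
# Stack Overflow figures quoted above, reduced to per-second numbers.
page_views_per_month = 1_300_000_000
qa_per_month         = 500_000            # "under half a million questions and answers"
seconds_per_month    = 31 * 24 * 3600

views_per_sec = page_views_per_month / seconds_per_month   # ~485
write_ratio   = qa_per_month / page_views_per_month        # <0.04% of views are writes
peak_capacity = 9 * 450                                    # 4,050 req/sec across the web tier

print(f"~{views_per_sec:.0f} page views/sec, write ratio {write_ratio:.4%}, "
      f"web tier peak {peak_capacity:,} req/sec at ~12% CPU")
```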

In my next post, I’m going to explore how you can build a social media platform without going bankrupt.
