Ask HN: Where can I see many examples of real companies' software architecture?
source link: https://news.ycombinator.com/item?id=30986893
349 points by PeledYuval 9 hours ago | 89 comments
I want to broaden my horizons regarding how things are solved in the real world. Other than some very high-profile companies (like Netflix, GitHub) and companies that I've worked at, it's hard for me to find easily digestible (20-60 mins) examples of the actual working architecture of differently sized companies from different business verticals.
http://aosabook.org/en/index.html
A good example is Scalable Web Architecture and Distributed Systems by Kate Matsudaira:
At what level of scale might one expect to need what's going on in the "Edge Cluster", as opposed to letting all the requests fly right into the app servers?
I work for a very large organization (~£6bn in revenue, £700m in profit last year) and we suffer from the "mud" problem: nothing about our technology stack is particularly special, it's just a hodgepodge of many different technologies that struggle to work together. That's not entirely fair, since I work on a unique solution inside this firm, but my position is unusual, and I'm sad to say that it took a silly amount of hard work just to be able to not work on legacy applications.
That being said, the companies you mention (Netflix, GitHub) work completely differently: they were designed with tech in mind! They are probably much leaner in a technological sense, and don't suffer from the enterprise architectural issues that large legacy firms do.
I suspect that this inability to move has singlehandedly killed more than one company, though I haven't studied the market to the point that I could really name any. The real kudos has to be given to large companies that existed before the internet and were able to move away from their slow-to-adapt, horribly inefficient legacy systems.
The interesting part is how larger scale makes things fail more often, and the response to increased failure can either be running around with your hair on fire for years, or a solid firefighting team, or actually teaching teams not to build products that catch on fire. The only way to get the last one is by focusing on people, not technology.
People assume that software architecture is like building architecture. In some ways it is, but NO ONE has ever shown up to a construction site that was halfway done and said "Hey guys, the steel framing we ordered has been delayed, so please continue building the rest of the building by replacing anything that was originally designed for steel beams with bamboo."
I think the building construction analogy might be similar to a home which has seen multiple remodels.
Now imagine a scenario where you have an absentee owner with a lot of money, a permanently staffed architect and a bunch of extremely able, slightly competitive contractors all on staff - each trying to prove their annual salary.
The original one-story building would quickly become a ten-story nightmare of a building.
You have a reasonable idea of what you want to achieve, and it looks good on paper, but until you have actually walked through it, touched and felt it, you don't know whether it really does what you wanted.
And of course, it changes as it matures, and what was great at first can become overgrown and resource-hungry.
Some choices by sales, some by engineering, some by management. All doing their best. Each reasonable on a sufficiently small time horizon.
For example, one of the most popular articles on that site (which is part of their book now) is the article on Netflix. A lot of that was cribbed directly from my talks, but they never reached out to me to even check it over, and as such missed a lot of nuance and detail, things I didn't cover in my talks.
Same thing for the article about reddit -- also cribbed a lot from my talks.
It's a fine overview, but light on specifics. I've reached out a few times and some things have been corrected after the fact, but I don't know if the other articles have been reviewed.
So my point is, be warned that the articles on that site are not primary sources but are derived from them.
Their post about Tumblr's architecture [1] focused heavily on JVM-based services, HBase, etc., which in reality were only ever used for a tiny subset of the backend. The huge section on "Cell Design for Dashboard Inbox" was especially ridiculous: the systems described there were literally a mix of complete vaporware and failed/canceled projects that never even got close to production.
As an early Tumblr engineer, I was really upset to read this nonsense. I spent several months of my life working very long hours to successfully scale the existing (PHP/MySQL) dashboard activity feed architecture in 2011-2012. It continued to be used as-is for many years after this interview, with lower latency and much lower cost than the proposed HBase/Scala cell replacement.
And of course, engineering candidates being interviewed would always ask about this HBase cell architecture thing that they read about in High Scalability...
[1] http://highscalability.com/blog/2012/2/13/tumblr-architectur...
It stands out because it is quite hard to find examples with this level of detail about such a large-scale distributed system outside of internet / web tech companies.
https://www.judiciary.uk/wp-content/uploads/2019/12/bates-v-...
So I think this doesn't meet your requirements, but I like Tech Dummies Narendra L's YouTube videos [0]. He introduces big tech companies' systems in 30-60 minute videos and they're not difficult to understand.
[0] https://www.youtube.com/playlist?list=PLkQkbY7JNJuBoTemzQfjy...
I'm not sure everything he says is correct, but at least he uses the target companies' engineering blogs, external articles, and some open-sourced parts of their systems (and lists them in the video's detail section). His main targets are often big tech companies like Twitter, Uber, and Netflix, so I guess such documents are often available.
One thing I don't remember being explicitly called out is that almost all architectures are grown. There are scarily few situations where starting with the complicated idea is a good idea.
> A complex system that works is invariably found to have evolved from a simple system that worked. The inverse proposition also appears to be true: A complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a working simple system.
Something like "IT Architecture for the Forbes 500-thousand"
Unreal is source available too, if game engines are of interest to you
The best I found was the German contact tracing app — Corona Warn App. It was done by a group of consultancies in collaboration with the German govt, and went from inception to launch in around fifty days — largely if not totally open source.
Here’s the repo that has all the architecture in: https://github.com/corona-warn-app/cwa-documentation
It’s got full git history so you can see it evolve over time, along with the implementations (also on Github).
There’s a pretty fascinating short talk by one of the people who led the project on youtube too — more about the process side though: https://youtu.be/5y1sHSkPWRg?t=1770
The purpose of each episode is for anyone to walk away having a reasonable understanding of why and how a company built and deployed their app with XYZ technologies without needing to know anything up front. There's over 100 different companies / individuals who were on the show.
I tried to make it as efficient as possible to get these details. There's a lot more detail than a few bullet points, but it doesn't get lost in the woods with a million low-level details that are specific to one company. It's basically an hour or two of conversation for each episode where we cover everything from building to deploying their app, lessons learned, etc.
only half joking
Like most things in software engineering, it's qualitative and empirical - but also has very strong potential to function as a supporting "first principles" theory for so many things.
Conway - "How committees innovate"
http://www.melconway.com/Home/pdf/committees.pdf
I think this paper has a fantastic corollary in Peter Naur's "Programming as theory building" which triumphantly explores the implications of institutional knowledge in long term software maintenance. https://pages.cs.wisc.edu/~remzi/Naur.pdf
https://docs.sourcegraph.com/dev/background-information/arch...
It has links to many other articles and tech blogs, besides having a lot of great info on system design and architecture.
All the systems design resources I can find are aimed at L4/L5, where the focus is e.g. on how to implement a rate limiter on a single machine, or at best saying you can distribute it by putting the counters on a cache server.
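For context, the single-machine version mentioned above can be sketched roughly as follows. This is a minimal fixed-window counter, not taken from any particular resource; the class name `FixedWindowRateLimiter`, its parameters, and the injectable `clock` are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Hypothetical sketch: allow at most `limit` requests per key per window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        # Maps key -> (window index, requests counted in that window).
        self.counters: Dict[str, Tuple[int, int]] = {}

    def allow(self, key: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        stored_window, count = self.counters.get(key, (window, 0))
        if stored_window != window:
            count = 0  # a new window has started, so the old count no longer applies
        if count >= self.limit:
            return False
        self.counters[key] = (window, count + 1)
        return True


limiter = FixedWindowRateLimiter(limit=2, window_seconds=1.0)
results = [limiter.allow("client-1") for _ in range(3)]
```

The "put the counters on a cache server" variant keeps the same logic but moves `counters` into a shared store, which is where the distributed concerns (hot keys, cross-region latency, redundancy) start to appear.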
I'm trying for L6 and can identify many of the issues with an L5 design (redundancy, sharding, global latency, hot spots, local batching), but it's hard not to miss the obvious, and to offer practical, real-world solutions, when my day job is embedded compilers and not large-scale systems.
This is mostly a rant but I appreciate suggestions.
https://martinfowler.com/ but I'm not sure if he touches the real world sometimes, it all feels very academic rather than pragmatic.
I wonder if you'll find "good" outcomes though; it seems most startups or companies bumble their way to an architecture that works for them. It might not be correct, but it might be the best way to build a company without architecting everything too much up front.
Just curious, last time I happened to read anything from Thoughtworks was quite a long time ago.
If I’m curious how a company did something, searching for their job descriptions can turn up interesting stuff like what languages and frameworks they use, and often from there you can infer what their architecture might look like.
Oh and I’m pretty sure I’ve seen GitHub Enterprise too.
Blobs of code. It's hard to see the systems level.
I think there's a startup idea in there.