How Microsoft rewrote its C# compiler in C# and made it open source

 5 years ago
source link: https://www.tuicool.com/articles/hit/ZZV3Mz2
Project Roslyn

Roslyn is the codename-that-stuck for the open-source compiler for C# and Visual Basic.NET. Here’s how it started in the deepest darkness of last decade’s corporate Microsoft, and became an open source, cross-platform, public language engine for all things C# (and VB, which I’ll take as a given for the rest of this piece).

The first conversations about what would become Roslyn were already ongoing when I joined Microsoft in 2005 — just before .NET 2.0 would ship. That conversation was about rewriting C# in C#. This is a normal practice for programming languages; a proof point of the maturity of the language. But there was a more practical and important motivation: We as the creators of C# were not programming in C# ourselves, we were coding in C++! Working in C# every day makes you think differently about C#: It’s the power of “dogfooding”.

The challenge of rewriting a compiler that has been in the hands of customers for several years is that those customers will depend on the new compiler behaving exactly the same way as the old one. Writing a new compiler for C# means trying to match the old compiler bug-for-bug. And I’m not just talking about known bugs, but those unknown and unintended behaviors that developers have found and come to rely on, often unknowingly.

For years, the sheer magnitude of this challenge kept us from even embarking on the project.

Also, while a new C# compiler written in C# would bring many benefits inside the languages team, the value proposition to customers was more challenging: How would a new compiler help existing customers? Perhaps the only people who would care that C# was written in C# would be the members of the compiler team.

At the same time, though, another problem was growing bigger: the duplication of effort between different tools working over C# code. Alongside our work on the compiler, our sister team was building the IDE support for C# in Visual Studio, and they also had to write reams of code (at that time also in C++) to understand C# syntax and semantics.

Beyond that was a growing number of tools from Microsoft and others, such as StyleCop, CodeRush, etc., all having to implement meaningful code-based tools starting with just flat C# source text. All of these would have subtly different bugs, different levels of understanding, different compromises and trade-offs. And all would expend lots and lots of effort just to get to square one: understanding the code.

Here, finally, was our value proposition: Make it so that there only needs to be one code base in the world that understands C#, shared by everyone who wants to build tools over code! The customer value would follow from the increase in available tools, and especially in the quality of existing tools. We would put all the language correctness and performance demands on a single code base and expend the effort once to make it of stellar quality and massive versatility. We would build a language engine! A unified, public API to C# code: We would redefine the meaning of “compiler”.

Of course, once you are building an API for the broad C# community, it is kind of a slam-dunk that it should be a .NET API, implemented in C#. So, the old dream of “bootstrapping” C# in C# was fulfilled almost as an accidental side benefit.

Roslyn was thus born out of an openness mindset: sharing the inner workings of the C# language for the world to programmatically consume. This in and of itself was a bit of a bold proposition in what was still a pervasively closed culture at Microsoft: We would share this intellectual property for free? We would empower tool builders that weren’t us to better compete with us?

The arguments that won the day for us here were about strengthening the ecosystem and becoming the best-tooled language on the planet. They were about long-term growth of C# and .NET, versus short-term monetization and protection of assets for Microsoft. So even without having mentioned open source, signing up for the cost and risk of the Roslyn project was a big and bold step for Microsoft.

Of course, you don’t just build something like that. The Roslyn vision was highly ambitious and fraught with technical challenges as well, and it took us half a decade to fulfill it. But that’s a story for another day.

For most of the time we were building the initial version, Roslyn was still a closed-source project. Ever since the project began in earnest in 2009 we had visions of making our compilers open source, but Microsoft simply wasn’t ready yet. The culture of developing in private and filing patents around your original code represented how Microsoft had been working since the 1970s, and while change was in the air, it was going to happen more slowly than our team had hoped.

In fact, for a while it felt like the company was going in the totally opposite direction.

The Windows 8 project pretty much took over the whole company. With its new programming model, its tentacles reached deeply into the developer tools and languages teams, and everything was blanketed in extreme secrecy, not just towards the outside but even within the company. As an example, the async feature we were developing at the time was coordinated and enmeshed with the Windows 8 programming model, and I did not dare publish design notes for it even internally, for fear of accidentally leaking information about Windows 8 and getting myself in trouble! This created a terrible climate for innovation, and it certainly did not bode well for our hopes of open sourcing the C# compiler.

Eventually, though, after Windows 8 had run its course, the company started to transform and found its new direction, towards new leadership and a very different core philosophy: the Microsoft we know today. The open source movement now rapidly started to take hold inside Microsoft.

F# had already been released in 2010 with an open source license and its own foundation, the F# Software Foundation. The vibrant community that grew up around it soon became the envy of us all. Our team pushed strongly to have an open source production license for Roslyn, and finally a company-wide infrastructure emerged to make it real.

By 2012, Microsoft had created Microsoft Open Tech, an organization specifically focused on open source projects. Roslyn moved under Microsoft Open Tech and officially became open source. It was a strong candidate: the development resources were all internal and well-known, and the project itself stood on its own without a lot of dependencies that might create licensing conflicts.

In April of 2014, at Microsoft’s “Build” developer event in San Francisco, Anders Hejlsberg showed off Roslyn as an open source project, and Roslyn was published on April 3rd through CodePlex (Microsoft’s since-retired open source hosting platform) under an Apache 2.0 license.

Project Roslyn in CodePlex under Microsoft Open Tech

At the same time, the .NET Foundation was announced as a home for .NET projects including Roslyn.

Being in the open was a magnificent breath of fresh air! Even as we started to reap the benefit of openness in CodePlex, the remaining procedural open source hurdles at Microsoft were straightened out, and today open source is a straightforward and integral part of how we work across many of our teams.

On other fronts, too, the company realized that we didn’t need to control everything. It became clear that there was no good reason for CodePlex to be in the world, and Roslyn joined other projects in migrating from CodePlex to GitHub, by then the de facto home for open source projects. Not only the source code but also the process of building it is on GitHub: We don’t treat it as a publishing venue; it is simply where we work.

Roslyn on GitHub today

C# language design and compiler implementation are now completely open processes, with lots of non-Microsoft participation, including whole language features being built by external contributors. The value to C# is through the roof, not only through the scaling of effort via contribution of features and bug fixes, but also the insight and course correction we get through the instant, daily feedback loop that open source provides.

It’s been a long and wild journey, and one that to me is symbolic of the massive changes that Microsoft has undergone over the last decade. The nugget that was Roslyn started in the dark, grew on an idea of openness and exploded into a million different uses today through the power of open source.

Explore Roslyn for yourself, or pitch in to C# language design:

