
Old Code Gets Younger Every Year

Source: https://medium.com/@bellmar/old-code-gets-younger-every-year-3bd24c7f2262


Jun 8 · 11 min read

The threat of decaying technology looms while we waste time talking about mainframes.

[Photo credit: Simon Claessen. Want an "all software is garbage" sticker? I'll send you one for free.]

Without fail, whenever I get an interview request or an invitation to speak about my work doing legacy modernization, everybody wants to talk about mainframes and COBOL. The assumption is that I will tell some good war stories about the drudgery of old systems for other engineers who don't need to worry about that stuff because their careers are focused on modern technology.

Granted, when I started working with legacy systems, I was also drawn to the Ripley's Believe It or Not factor of the most ancient programs: the thrill of unearthing and dissecting older and older systems, figuring out forgotten languages that most programmers have never heard of, let alone interacted with. I have always been fascinated by low-level languages and systems, the magic that turns changes in voltage into abstractions in math and design. But lately I've become much more interested in the coming legacy apocalypse and how to slow down the rising level of technical debt on newer technologies.

The legacy apocalypse is not the death of the last Baby Boomer COBOL programmer. To be honest, that crisis has come and gone. When people talk about the threat of old systems, they love to trot out a stat about how old COBOL programmers are. For example, in 2006 the average age of a COBOL programmer was 55. That sounds bad. Lots of critical staff are close to retirement! Who will look after their systems when they're gone?

Averages can be misleading. In the same survey, 52% of programmers were 45–55 and 34% were 35–45. But more to the point, eight years later, when all those 55-year-old programmers were supposed to have retired, Micro Focus's survey of COBOL programmers and executives put the average age of a COBOL programmer at 55 again. Their 2019 survey had the average at 50.

In fact, the average age of COBOL programmers has been steady for decades. When my father worked on Y2K bugs he was in his late 40s to early 50s. His colleagues were similar ages. Every time I see people making a big deal about the age of the COBOL community, I think of something American oboist Blair Tindall wrote about the classical music community:

The terror about older listeners was misplaced, ignoring the fact that average audience age has hovered in the late forties for some time. It was logical for people to wait until midlife to begin attending the symphony. With children grown, tuition paid, more leisure time, concerts fit well into mature baby boomers’ rich lifestyles, tastes, and income.

A similar thing could be said about COBOL. Unlike young programmers in the 60s, 70s, and 80s, young programmers of today do not have university mainframes to play around with. If the university still has a mainframe, it's the workhorse of the administration, too critical for student projects. Young programmers do not have the option of learning COBOL. And even if they did, the hundreds (or, some say, thousands) of COBOL jobs are not entry-level ones.

In all likelihood, the average age of COBOL programmers is stable because programmers develop their depth of experience and expertise in other languages before moving over to COBOL later in their careers.

People are worried about old COBOL programmers because they assume that when the last of the COBOL programmers die out, their programs will be unmaintainable. This is a reasonable concern; however, most people would be surprised to learn that the threat of unmaintainable legacy code is a whole lot closer than they think and does not involve a mainframe.

64% of Java Applications Are Stuck on Java 8

If you are keeping score, the most current version of Java is 14. The end of life for Java 8 was supposed to be 2019.

Java 9 introduced structural changes to make Java more modular and therefore more feasible for embedded systems. Moving from Java 8 to Java 9 is not an upgrade; it's a full migration. Among other things, Java 9 made JDK-internal APIs inaccessible, removed several tools and methods, and its shift to a modular structure required changes to dependencies. In other words, moving from Java 8 to Java 9 potentially meant that a lot of code would have to be rewritten.

As a result, more than half of production applications surveyed by Snyk in 2020 were still running on Java 8.

Python 2

Of course, the ultimate in upgrades that are really major migrations is the transition from Python 2 to Python 3. As with Java 8, Python 2 has lingered because migration to Python 3 requires both a rewrite of the code you own and eliminating Python 2 from all of your dependencies. Although tools like Benjamin Peterson's six have made the task much more pleasant, dependencies are more than packages and libraries. The platform the code runs on is also a dependency, and the platforms have been slow to respond. Although Python is an extremely popular scripting tool, AWS Lambda did not support Python 3 until 3.6 in 2017, which was a year after 3.6 was released. That was the same year Salt rolled out Python 3 support. Ansible supported it a year later, roughly a decade after Python 3 was originally announced.

It's hard to say how much Python 2 is left in the world. JetBrains estimates that it's only 10%, and with 24,000 respondents across 150 different countries, that's probably an accurate figure. The Python 2 problem may not be that there is so much of it, but where it still exists. According to JetBrains, the places where Python 2 is still giving Python 3 a run for its money are DevOps/Automation, Testing, and Network Programming. Getting various flavors of Linux to commit fully to Python 3 proved to be a huge challenge. And the fight is not over yet: every Mac-loving Pythonista knows that Apple computers still ship with Python 2.7 as their default Python version because macOS internal tools still depend on it.

Everyone Hates jQuery and Yet It Is Everywhere

On the flip side of dependency hell is jQuery. Migrating away from jQuery is not difficult because of its dependencies; it is difficult because so many other things have come to depend on jQuery.

When Twitter Bootstrap finally removed jQuery as a dependency in 2019, it was only because they copied and pasted source code from jQuery directly into Bootstrap. Even then, the whole project took over two years from beginning to end.

jQuery is a victim of its own success. Its simple syntax proved so popular that other frameworks and even native JS started to adopt it. On top of that, many of the legacy technologies that jQuery provided cross-compatibility with have finally been decommissioned (looking at you, Internet Explorer). Personally I think the concern around jQuery is a little overblown, but I'm not a JavaScript person. The campaign against jQuery seems to have been kicked off by conflicts between the framework and the ascendant MVC JavaScript framework of the moment, React.
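
To make that concrete, here is a minimal sketch, in TypeScript, of the kind of one-to-one translation teams do when they drop jQuery. The ".alert" selector and "/api/status" endpoint are hypothetical, invented purely for illustration; the point is that the native DOM and fetch APIs now cover what addClass and getJSON used to.

```typescript
// Hypothetical example: hide every element with the class "alert", then fetch some JSON.
// The selector and endpoint are made up for illustration.

// jQuery style (assumes the global `$` loaded from a <script> tag):
//   $(".alert").addClass("hidden");
//   $.getJSON("/api/status", (data) => console.log(data));

// Native equivalents that modern browsers now provide out of the box:
document.querySelectorAll(".alert").forEach((el) => {
  el.classList.add("hidden"); // classList covers addClass/removeClass
});

fetch("/api/status") // fetch covers $.ajax / $.getJSON
  .then((res) => res.json())
  .then((data) => console.log(data));
```

None of that makes the thousands of existing $() calls in an old codebase disappear, which is why even a well-resourced project like Bootstrap needed years to finish the migration.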

But like all holy wars in technology, good, sensible arguments against choosing one option over the other become murkier the more often they are repeated. In some ways I think the jQuery story is the most similar to the COBOL story, in that the headlines published about it lead with its omnipresence and imply that because other (newer) technologies can now do the same things, those technologies must be better.

Depth, Not Age

There are many things that make legacy systems difficult to maintain. The age of the programmers doing the maintaining is not one of them. True, loss of institutional memory matters, and when the programmers who knew the system best leave, institutional memory goes with them. But this is not an issue exclusive to older technology. Organizations lose institutional memory to staff being poached just as often as they do to retirement (probably more often).

The fact that the pool of available engineers savvy with COBOL is limited is a problem solved much more cheaply and easily by building pipelines to develop COBOL talent. IBM has been very active in this space with its Master the Mainframe program. It's simply not true that COBOL programmers are a finite resource that is drying up.

I have to say, in my experience, whenever a COBOL system goes down it is almost never the COBOL that took it down. I've seen hardware failures, issues with non-COBOL systems that support or otherwise integrate with COBOL, and delays in adding new features because the COBOL code is poorly documented and engineering needs to figure out how to change it. But I haven't seen many incidents where the fact that the system was in COBOL was a problem in and of itself. That's not to say that there aren't good reasons to get rid of COBOL; there definitely are. I'm just not inclined to agree that civil society can't continue to run on millions of lines of COBOL for another 60 years. It certainly can.

Java 8 and Python 2, on the other hand, are a far more serious threat. When systems can't get off end-of-life technology, they miss security updates, performance enhancements, and new features. The longer systems stay stuck in their own technical debt and the more things are built on top of them, the more entrenched the legacy becomes.

We do programmers a disservice when we act as if the conversation about the growing threat of legacy code begins and ends with COBOL. A whole generation of software engineers is spending their careers making the problem worse by outsourcing all but the most unique aspects of their applications to armies of libraries, plugins, and modules that they are powerless to monitor, let alone update.

The real horseman of the legacy apocalypse is the depth of the dependency tree. Modern software development stacks abstraction on top of abstraction. If the left-pad incident of 2016 proved nothing else, it demonstrated that even experienced engineers will YOLO dependencies onto their applications if given the infrastructure to make installing them easy. Modern developer environments are a veritable candy store of cheap and convenient dependencies.

The Rise of Frameworks

If Wikipedia can be considered an authoritative source, activity around developing brand-new programming languages peaked in the 90s, when computers were accessible to a large number of people but were still relatively low on abstraction. The internet changed that, both by making more complex distributed systems a reality and by swelling the blast radius of security issues. Requiring better performance and better security made the MVP of a new language on modern-day machines fairly complex. No longer can smart computer scientists build proof-of-concept pet languages and expect that applying them to real-world problems will power their evolution. There are a huge number of complex tasks programming languages are expected to handle for the programmer.

So even though the number of professional programmers has grown sharply since the glory days of the 90s, these software experts have shifted away from developing new languages toward developing new frameworks.

And a framework is essentially nothing more than a curated collection of dependencies given a common interface. True, frameworks make software development faster, but they also take away the developer's ability to maintain their code. Advancements in tooling that increase the speed of software development have inevitably deepened the dependency trees of the average software project.

Take, for example, Node.js. Node is an interesting framework that made it possible to run JavaScript on the server side, but it also introduced (as a dependency) a nifty little package manager called npm. There had been package managers before, and npm wasn't necessarily the best one, but it provided a better user experience by learning some lessons from the package managers that came before it. It installed things locally instead of globally by default. The command line was designed to integrate with a package repository from the beginning, so creating and publishing new packages was trivially easy.

As a result, the average depth of the dependency tree on npm is 4.39 packages, while the average depth on a comparable package manager (in this case PyPI) is 1.7. Python developers are not inherently more responsible than JavaScript developers. JavaScript's lack of a good core library and its history as a tool language designed and implemented in a week make it ripe for the development of frameworks to smooth its rough edges. There are many, many npm packages that do small things for which other languages have a built-in function; npm just made them easy to share.
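
For a sense of what those micro-packages look like, here is a hedged sketch in TypeScript. The leftPad function below is hypothetical rather than the actual left-pad source, but the real package was not much bigger, and the language eventually grew a built-in (String.prototype.padStart, added in ES2017) that covers the same ground.

```typescript
// Hypothetical micro-package: pad a string on the left until it reaches a target length.
// Packages roughly this small have been published to npm and downloaded millions of times.
function leftPad(str: string, targetLength: number, fillChar: string = " "): string {
  let padded = str;
  while (padded.length < targetLength) {
    padded = fillChar + padded;
  }
  return padded;
}

console.log(leftPad("42", 5, "0")); // "00042"

// The built-in that now makes a package like this unnecessary (ES2017):
console.log("42".padStart(5, "0")); // "00042"
```

The catch is that adding the built-in does not make the package go away; everything that already depends on it keeps downloading it.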

[Chart: package dependency depth, npm vs. PyPI. Truly scary numbers.]

But what would happen if ECMA decided to fix some of JavaScript's shortcomings the same way Java 9 and Python 3 attempted to resolve structural problems with their languages? Around 60% of packages on npm have not been updated in a year or more. Despite the lack of maintenance, these packages are still downloaded billions of times.

It's a reality that ECMA acknowledges in its One JavaScript policy:

But how can we get rid of versioning? By always being backward-compatible. That means we must give up some of our ambitions w.r.t. cleaning up JavaScript: We can’t introduce breaking changes. Being backward-compatible means not removing features and not changing features. The slogan for this principle is: “don’t break the web”.

We could debate the merits of forever backwards compatibility all day. The point is that the colossal dependency footprint that has always been inherent in JavaScript has grown infinitely worse as frameworks for it have become more popular. So the same tools that are ironing out the numerous structural problems with a language like JavaScript are also making those problems impossible to fix in newer versions of JavaScript.

When we talk about maintaining healthy and secure technical systems long term this is a far greater threat than the age of COBOL programmers. And yet when we talk about legacy, we do not talk about these issues.

In Summary: Strategy Over Speed

Dependencies are a necessary evil, but using them doesn't have to condemn projects to legacy hell. We need to start incorporating long-term maintenance goals into our conversations about technology selection. JavaScript frameworks create deep dependency trees, yes, but even though npm was developed to serve the needs of a backend language, 80% of the activity on it is frontend-related. We throw frontends away and rebuild them all the damn time. The prevailing wisdom in the design community is that websites are redesigned roughly every three years. So a React frontend with a large dependency graph is less of a concern from a legacy modernization standpoint than a Node app with a dependency graph of the same size buried more deeply in the architecture.

In other words, we need to start thinking critically about how long we expect a given piece of technology to last and ask ourselves whether the choices we make in building it will make it harder to remove later. We can no longer afford to wait and see when something better comes along. We have to assume that something better will come along eventually.

Finally, we need to refocus the conversation and stop demonizing technology just for being old and programmed by old people. A huge portion of the world's COBOL is doing just fine. The problems that do exist can also be found in web apps built in 2002. The fact that COBOL is old is beside the point and distracts from the growing ecosystem of code that is past its end of life.


Postscript

One of my fun pandemic projects has been creating slightly trolly engineering stickers. If you'd like a free "all software is garbage" sticker, tell me where to send it here.

