

Getting to Know a Legacy Codebase
source link: https://www.tuicool.com/articles/hit/VJVVJn7
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Written by Harry Roberts on CSS Wizardry .
The other day, Brad dropped me a message asking me about the topic of getting to know a brand new (specifically CSS) codebase. The kind of codebase that no one person truly understands any more; the kind of codebase that’s had a dozen different contributors over just as many years; the kind of codebase that’s never had a full-scale refactor or overhaul, but that’s grown organically over time and changed with new techniques, styles, and trends.
(Un)fortunately, I’ve dealt with a number of these types of project over time, so I have a few tips about getting to know a codebase like this, and how to begin making sense of it.
See What Breaks
Honestly, to start with, I just identify a part of the codebase (it could be a large or important feature, a frequently-occurring component, something pertinent to current workstreams, etc.) and play around with it: bend it, break it, change it, see what happens to it.
Make a note of anything odd or unusual; if that thing happens again when you play around elsewhere in the codebase, there could be some significance to it.
Identify Your Entry Points
Try as much as you can to build up an idea of which files import which other files; who passes what onto who; what inherits from where; what does your dependency tree look like?
Once I can work out where files are bundled up and imported from, I can build a much better mental model of the topology of the codebase.
Get an Idea of What’s Been Changed
It’s often a good idea to see what files were changed last: any files that haven’t been altered in over a year are probably less important to current work. One (rather cumbersome, I admit) way to see that is with the following:
$ git log --name-only --since="365 days" | sort -u | awk '/\.(le|c|sa|sc)ss$/{print}' | nl
N.B.This output will not be listed chronologically.
This will log your codebase, list the filenames associated with any commits made
in the last year, it will then pass them off into sort
where we leave only
unique entries, before passing it into Awk and printing only .less
, .sass
, .scss
, or .css
filepaths, then finally passing it all into nl
to number
the lines of output.
This should give you a list of all the stylesheets changed in the last year.
Identify (Roughly) What is Used
Chrome’s Coverage tool is a great way to see at a very high-level glance what CSS is being used by the current page. This very non-scientific method of checking code usage should give you an approximate idea of what might be dead code, or what you might simply be able to ignore.
For a slightly more robust, but still imperfect method of identifying dead code, try this technique that I wrote upearlier in 2018.
Although very vague data, it should give some idea as to where to start looking.
Learn the Codebase Organically
Don’t go looking for trouble. Work around learning the features you need to work on immediately. If you never need to touch the help and support section, there’s no point going out of your way to research it. Try to remain focused and learn about the parts of the codebase that are most important and timely.
This is general good advice with any tech debt or refactoring problem: don’t increase your (cognitive) workload unnecessarily.
Set Up Regression Tests
You need to make sure you don’t inadvertently mess anything up while you’re looking around. There are two methods I would use here: the first is very low-fidelity and a little coarse; the second is far more involved.
If I’m playing around with Sass, or any other precompiled assets, I take a checksum of the compiled CSS file before I start playing, and then a checksum after I made any changes. If the two fingerprints differ, I know what whatever changes I just made wound up in the compiled assets. This kind of light-touch smoketest allows me to crudely figure out inputs and outputs. It can also help me to figure out whether a particular variable or mixin might be being used in the project: if I change a variable’s value, but the compiled CSS is identical, I could make a note that that variable might be unused.
harryroberts in ~/Sites/csswizardry.com on (master) » shasum _includes/css/csswizardry.min.css 7142689d6f79927f192577002f181aef5721d1bc _includes/css/csswizardry.min.css harryroberts in ~/Sites/csswizardry.com on (master) » vim css/_settings.colors.scss harryroberts in ~/Sites/csswizardry.com on (master) » shasum _includes/css/csswizardry.min.css ae2a4c94c945d61f1c009d4fb14c40fa19f20ddd _includes/css/csswizardry.min.css
Note that the two checksums differ: this means that whatever I changed in css/_settings.colors.scss
did make its way into the complied CSS. Although
I might not know exactly what or how far reaching that change was, I do at last
know that the thing I altered will affect the project.
For longer-term refactoring or auditing, it can be useful to set up a formal regression testing tool such as BackstopJS . This will alert me if I do any severe, customer-facing damage to the UI. Ideally, we’d have a regression testing suite in-place permanently, so this is as good a time as any to start.
Learn How to Use Parker, ack, and ctags
Parker
is a wonderful static analysis
tool from my buddy Katie
. You can use it to grab
useful information about the overall make-up and health of your CSS: how many
IDs, mean specificity, how many !important
s, etc. Learn more about Parker with
my post all about demystifying its
output
.
ack is an incredible little command line utility that serves as a more useful, sensible, and user-friendly alternative to grep. You can use it to quickly search entire codebases for specific strings using regex. As an example:
$ ack --css "\d*\.\d*px"
This will search every CSS file for pixel values that contain decimals (e.g. 24.1px
). ack is incredibly useful, so I’d recommend reading my
Ack for
CSS Developers
post for some more practical examples.
If you’re feeling adventurous, install ctags . Although a little old school, ctags can be incredibly powerful on all manner of projects. Wikipedia defines ctags as:
…a programming tool that generates an index (or tag) file of names found in source and header files of various programming languages. Depending on the language, functions, variables, class members, macros and so on may be indexed. These tags allow definitions to be quickly and easily located by a text editor or other utility.
Basically, ctags will create a kind of of database of all of the function, variable, mixin, ID, and class definitions for your project. Text editors that support ctags can then use this information to begin linking up disparate parts of the codebase for you, showing you where particular code is originally defined.
For me, as a Vim user, I generate my project’s tags file from the project’s root like so:
$ ctags -R
This will -R
ecursively look through my project and make the database of its
references. Then, once in my text editor, hitting Ctrl-]
with my cursor
over—for example—a Sass variable will jump me to the exact place that variable
was defined.
Git on Top of Things
Learn some advanced Git logging, showing, and blaming. The entire project’s history is contained here, and if you know how to interrogate it effectively, you’ll be able to build up a really great context of how things came together.
Reach Out to People
Your extensive detective work in Git will no doubt yield a number of different names: if these people are still in the organisation, use them. Ask them to help clarify things; see if they remember the events or decisions surrounding certain features, etc.
Recommend
-
9
5 Tips to Find Your Way Around A Legacy Codebase Published June 15, 2021 - 0 Comments
-
7
An approach to refactoring a legacy codebaseDesign by Kumkum JainLegacy code clean up and movement is like spring cleaning your home. It eliminates cl...
-
9
Supporting the Elderly: Tips for Working on a Legacy PHP CodebaseJanuary 11th 2022 new story4
-
5
How to Maintain a Legacy CodebaseFebruary 10th 2022 new story8A lot of your day-to-day job...
-
7
I wanted to make a recap of the most common and useful advice one can give to tackle Legacy codebases. I have a few of my own, but I also went out to ask experimented developers about their opinions. I ended up with 7 advice....
-
6
Rogue Legacy 2 Is Finally Coming. Here's What We Know So Far
-
8
Hogwarts Legacy for PS5: Everything you need to know Hog...
-
7
The Subaru Legacy Is Getting Some Serious Upgrades
-
8
Microsoft is getting rid of the Remote Mailslot legacy protocol in Windows 11 soon...
-
5
Whether it's contributing to open source or starting a new job, the first step is familiarizing yourself with the codebase, and it can be daunting. Here are some tips to help you hit the ground running. 1. Don't Panic Even...
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK