
Archiving 26 500 community Q&As from Ask Fedora

 4 years ago
source link: https://www.tuicool.com/articles/ai63Ub6

Ask Fedora is the Fedora Linux community’s questions-and-answers portal, and it recently transitioned from forum software called Askbot to Discourse. Changing the underlying forum software doesn’t have to be destructive, but Ask Fedora went with a nuke-and-pave migration strategy: they decided to start from scratch instead of copying user accounts and user-contributed content to the new software.

I first learned of the migration a few days after it had happened. I’d run into an issue with my Fedora installation and went online looking for solutions. Every useful search result was from the old Ask Fedora site, and every link returned an HTTP 404 Not Found error because those answers hadn’t been migrated to the new Ask Fedora website. None of the pages I was looking for were available in the Internet Archive either.

With Ask Fedora’s software migration, the Fedora Linux community has lost 26 500 questions and the community discussions surrounding them. I don’t have any numbers for how many answers and comments each question had received, but I’d guesstimate an average of more than one answer per question. 25 000 of the questions were in English, 680 in Spanish, and the rest were split between Portuguese, Russian, and a couple of other languages.

Askbot had its issues, especially when it came to performance, and I’m sure there were plenty of good reasons to migrate to Discourse. That still doesn’t excuse not preserving all the work that people had put into answering questions on the old Ask Fedora platform over the years.

Ask Fedora is run and maintained by volunteers, and it was other volunteers still who answered the questions posted by all the novice and experienced Fedora users who’ve needed some help over the years. Many top contributors had amassed thousands of “internet points” and recognition on the old Askbot platform. I’m sure deleting all their hard work and asking them to start from scratch must have pushed at least some of them away from contributing on the new Ask Fedora website.

Whenever you upload or write anything on someone else’s platform, the fate of that work is left up to the operator of the platform. This shouldn’t come as a surprise to anyone, least of all me, but I’m still disappointed to see all the time, effort, and answers people had put into the site just vanish overnight. As you might be able to tell, I’m not happy about seeing my 140 answers and 1400 karma points being discarded.

However, the old Ask Fedora website does actually still exist! There is a read-only archived clone of it on askbot.fedoraproject.org. This doesn’t help much, as you have to know to go to that domain manually (more on that later) instead of ask.fedoraproject.org to find the questions and answers from the old site. The old URLs will probably remain in search results for months to come, and references to them will remain throughout the web and in people’s bookmarks forever.

The old Ask Fedora website explicitly licensed all user-submitted content under the permissive Creative Commons Attribution-ShareAlike 3.0 License. That licensing scheme allows anyone to copy, distribute, and preserve every discussion on the platform in its entirety.

I jumped on the opportunity and submitted all 26 500 questions and answers from the old Ask Fedora website (using the askbot.fedoraproject.org domain) for archival in the Internet Archive’s Wayback Machine. The whole process only took my computer a couple of hours.
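The article does not say how the pages were submitted. As a minimal sketch of one way to do it, assuming the Wayback Machine’s “Save Page Now” endpoint under https://web.archive.org/save/ and a hypothetical question URL layout of /en/question/<id>/ on the archive host, the submission loop could look roughly like this. The function names and the delay between requests are illustrative choices, not the method actually used.

```python
import time
import urllib.request

# Assumption: the Wayback Machine's "Save Page Now" endpoint; prefixing a page
# URL with it asks the Internet Archive to capture that page.
SAVE_ENDPOINT = "https://web.archive.org/save/"

# The read-only clone of the old site, as named in the article.
ARCHIVE_HOST = "https://askbot.fedoraproject.org"


def question_url(question_id: int, lang: str = "en") -> str:
    """Build a question URL. The /<lang>/question/<id>/ layout is an assumption."""
    return f"{ARCHIVE_HOST}/{lang}/question/{question_id}/"


def save_request_url(page_url: str) -> str:
    """Return the URL that requests a capture of page_url."""
    return SAVE_ENDPOINT + page_url


def submit(page_url: str, delay_seconds: float = 5.0) -> int:
    """Request one capture, then pause to stay polite towards the archive."""
    with urllib.request.urlopen(save_request_url(page_url)) as response:
        status = response.status
    time.sleep(delay_seconds)
    return status
```

Calling `submit(question_url(n))` for every known question ID would then queue each page for capture; in practice the list of IDs would have to come from the site itself, for example from its question listing.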

You can access the archived version in the Wayback Machine. Please note that retrieval can take a minute or two after the page has loaded.

I really wish someone had done that before Askbot was moved to a new domain so that the old URLs could have been preserved. At least it should now be possible to dig up all the old questions and answers, provided you know that they can be accessed from a new location. (Which is a big ask without any redirects.)

This archival trick isn’t going to work again in the future. Discourse pages are too dependent on JavaScript for the Internet Archive to properly archive their contents. Many other JavaScript-heavy websites manage to produce HTML versions of their pages for bots and archiving, but this has been a known issue with Discourse for years. The new Ask Fedora portal also doesn’t explicitly license new user contributions under a permissive Creative Commons license.

I understand that migrating content from one platform to another can be difficult and time-consuming. However, honoring the old URL design and redirecting the old URLs to an off-site archive isn’t a huge task. I couldn’t find any URL collisions between the old and new URL designs, so setting up redirects should have been trivial.
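To illustrate why non-colliding URL designs make redirects simple, here is a minimal sketch of the mapping logic. It assumes, hypothetically, that old Askbot paths start with a two-letter language code followed by a known section name, and that new Discourse paths (such as /t/...) never do. The pattern and function name are illustrative and not taken from either site’s configuration.

```python
import re
from typing import Optional

# The read-only clone of the old site, as named in the article.
ARCHIVE_HOST = "https://askbot.fedoraproject.org"

# Assumption: old paths look like /en/question/123/some-slug/ and similar.
OLD_PATH = re.compile(r"^/[a-z]{2}/(?:question|questions|users)/")


def redirect_target(path: str) -> Optional[str]:
    """Return the archive URL for an old-style path, or None for anything else."""
    if OLD_PATH.match(path):
        return ARCHIVE_HOST + path
    return None
```

A web server in front of the new site could answer any request for which `redirect_target` returns a URL with a permanent redirect to that URL, and pass everything else through untouched.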

Unfortunately, Ask Fedora isn’t hosted on Fedora’s own infrastructure and is instead hosted and operated by Discourse. Discourse doesn’t offer good enough tools to forward the old URLs to either the archived version at askbot.fedoraproject.org or the Internet Archive’s copy. Losing the ability to deploy something as routine as a couple of redirects is a huge argument against outsourcing in my book.

