A technical question about the qntm.org URL schema

source link: https://qntm.org/schema
2013-05-03 by qntm

Hey everybody.

So, you may or may not know that I designed qntm.org to have extremely terse, human-readable, human-memorable URLs. The full rationale behind this design decision can be found in my article from a few years ago, On short URLs.

One of the reasons given for this scheme is:

Because each page has a short, unique "slug" (e.g. in "http://qntm.org/destroy", the slug is "destroy"), each page has a unique URL. There is no redundancy in my URL schema - i.e., it is not possible to reach the same resource via multiple URLs. This keeps my site's complexity to a minimum and I'm told it's also good for SEO, although I honestly pay very little attention to the latter.

Here's what happens, then, when you try to load various URLs:

Now, new content on qntm.org is made available via an RSS feed which is in turn accessed by many feed readers. It was recently brought to my attention that the feedly feed reader was beginning to append query strings to the end of URLs when users visit them. For example, although the announced URL for my review of the movie Oblivion was

http://qntm.org/oblivion

, the URL that users of feedly were sent to when they clicked on that link (or however it is that feedly works) was

http://qntm.org/oblivion?utm_source=feedly

, which, as described above, results in a 404 response code. Unfortunately, this casts qntm.org as an unreliable website, which is undesirable. The intended purpose of these extra keys and values in the query string is outlined here. In theory I could gather up this information and use it to carry out analytics on my site usage, but I don't actually care.
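To make the difference between the two URLs concrete, here is a minimal Python sketch using the standard library's urllib.parse module. It is only an illustration and not how qntm.org or feedly actually handle URLs; the helper name strip_query is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs


def strip_query(url: str) -> str:
    """Return the URL with its query string and fragment removed.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


announced = "http://qntm.org/oblivion"
tagged = "http://qntm.org/oblivion?utm_source=feedly"

# The path (and so the slug) is the same in both URLs;
# only the query component differs.
print(urlsplit(tagged).path)             # /oblivion
print(parse_qs(urlsplit(tagged).query))  # {'utm_source': ['feedly']}
print(strip_query(tagged) == announced)  # True
```

The two strings are different URLs as far as a strict one-URL-per-resource scheme is concerned, even though they differ only in the query component.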

This behaviour was brought to feedly's attention by a reader, and feedly added qntm.org to a whitelist overriding this behaviour. But the question remains.

Is it normal to add an arbitrary query string to the end of an existing URL and expect the modified URL to still locate the same resource? Is qntm.org's current behaviour in all four of the above cases technically correct? What are the pros and cons of, for example, serving a 301 Moved Permanently redirection to the canonical URL, instead of the 404?

Update 8 May 2013

I've implemented the suggested change of serving a 301 Moved Permanently to people who append random query strings. I also did some plumbing, so please let me know if you notice problems with the way that qntm.org is working.
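As a rough sketch of what this kind of redirect logic could look like (not the actual qntm.org code, which isn't shown here), the Python below decides on a status code for a requested URL. The KNOWN_SLUGS set and the respond function are hypothetical names, and the handling of anything other than an appended query string is an assumption.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Hypothetical set of known slugs; the real site's page store is not described.
KNOWN_SLUGS = {"destroy", "oblivion"}


def respond(url: str) -> Tuple[int, Optional[str]]:
    """Return (status code, Location header or None) for a requested URL."""
    parts = urlsplit(url)
    slug = parts.path.lstrip("/")

    # Assumption: anything that is not a known slug is simply not found.
    if slug not in KNOWN_SLUGS:
        return 404, None

    # Known page with a non-empty query string appended:
    # redirect permanently to the canonical, query-free URL.
    if parts.query:
        canonical = f"{parts.scheme}://{parts.netloc}{parts.path}"
        return 301, canonical

    # Canonical URL requested directly.
    return 200, None


print(respond("http://qntm.org/oblivion"))                    # (200, None)
print(respond("http://qntm.org/oblivion?utm_source=feedly"))  # (301, 'http://qntm.org/oblivion')
print(respond("http://qntm.org/nosuchpage"))                  # (404, None)
```

With this approach each page still has exactly one URL that returns content directly; the query-string variants just point back to it rather than failing.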

Update 14 May 2013

Some backend changes are coming, none of which should (ideally) change the way that qntm.org functions other than to improve performance. I'm building some nominal testing procedures, but please let me know if you notice any problems.
