Design the perfect URL

This article first appeared in issue 215 of .net magazine - the world's best-selling magazine for web designers and developers.

URL design has become a topic of discussion again over the past year. It started with Twitter’s autumn 2010 redesign, which seemed to validate what was generally considered a poor web design technique for public-facing websites: the ‘hash-bang’ URL.

These are URLs in which the part directly after the domain starts with ‘#!’ – for example, twitter.com/kurafire becomes twitter.com/#!/kurafire. The part of the URL that uniquely identifies the content of the page is then added at the end. The technique is aimed at improving performance: it avoids reloading an entire page when only a small piece of it needs to change. But it doesn’t come without serious downsides.

While the domain part is obvious, it’s worth mentioning that www. is not part of a domain. It’s merely a subdomain that’s commonly used by websites but is technically unnecessary. Many non-technical people think it’s needed, so whether you should use www.yourdomain.com or just yourdomain.com in your marketing or as the primary web address depends on your audience. Regardless, both addresses should get visitors to one and the same website.
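One way to make both addresses reach the same site is to pick one host as canonical and redirect the other to it. The sketch below is only an illustration: the host names and the `canonical_redirect` helper are assumptions, and in practice this is usually a rule in the web server configuration rather than application code.

```python
from typing import Optional

# Assumption: the bare domain is the canonical host; swap the two if your
# audience expects the www. form.
CANONICAL_HOST = "yourdomain.com"
ALIASES = {"www.yourdomain.com"}


def canonical_redirect(host: str, path: str) -> Optional[str]:
    """Return the URL to redirect to, or None if the host is already canonical."""
    host = host.lower().rstrip(".")
    if host in ALIASES:
        return f"https://{CANONICAL_HOST}{path}"
    return None
```

A request for www.yourdomain.com/blog/ would be answered with a permanent redirect to the returned URL, while a request for yourdomain.com/blog/ is served directly.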

Citing the Wikipedia page on URLs: “In computing, a Uniform Resource Locator (URL) is a Uniform Resource Identifier (URI) that specifies where an identified resource is available and the mechanism for retrieving it.” A hash-bang-based URL insufficiently specifies the mechanism for retrieving the content, as it requires a JavaScript round trip to the server after the server has already sent the browser an HTML page – a page that doesn’t have the content associated with the requested URL (yet).

This may all seem pedantic, but the significance becomes clear when you consider the reality of how resources are accessed. A browser loading a URL is obviously the most common way for a web page to get loaded, but it’s not the only method. Any simple wget- or curl-based attempt to pull in content from a hash-bang URL will no longer work, and any piece of software that loads web content now has to include a full JavaScript engine to support such URLs. And that’s all assuming the JavaScript doesn’t get filtered out by some proxy server or firewall, and doesn’t contain any errors anywhere in the page. When users turn JavaScript off in their browser, these sites will stop working.
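To see why simple clients come back empty-handed, it helps to look at what actually goes over the wire. The fragment (everything after the #) is handled by the client and is not part of the HTTP request. The following sketch uses Python's standard `urllib.parse`; the `request_target` helper is a hypothetical name introduced for illustration.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path (plus query) that an HTTP client sends to the server."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


clean = "https://twitter.com/kurafire"
hashbang = "https://twitter.com/#!/kurafire"

print(request_target(clean))         # /kurafire
print(request_target(hashbang))      # /
print(urlsplit(hashbang).fragment)   # !/kurafire
```

For the hash-bang address the server only ever sees a request for /, so the identifying part of the URL is available solely to script running in the browser.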

If breaking the quiet agreement and having the entire site rely on fragile techniques isn’t bad enough, hash-bangs are also a one-way street to permanent maintenance and support. Because the browser never sends the part of the URL after the # to the server, you can’t use server-side rewriting for these URLs, even when you redesign again. Thus, unless you want to break your incoming links and people’s bookmarks, you’ll always have to do some processing on your domain’s primary landing page to support these URLs once you’ve put them out there.

Some (mostly ancient) content management systems or blog engines identify each unique page with a long string of random characters; something like this: 5F0C866C-6DDF-4A9A-9515-531B0CA0C29C.html. If your content management system or site engine generates such URLs, find out how to override or turn off that behaviour immediately; if that’s not possible, you really are better off getting a more modern CMS. There are only downsides to these URLs – for your users and yourself – and countless good, modern systems available to power your site that avoid this terrible technique.
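A common alternative to opaque identifiers is a readable slug derived from the page title. The function below is a minimal sketch of that idea, not the behaviour of any particular CMS; the name `slugify` and the exact rules (lowercase ASCII, hyphens as separators) are assumptions chosen for illustration.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a page title into a lowercase, hyphen-separated ASCII slug."""
    # Decompose accented characters, then drop anything outside ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    return slug.strip("-")
```

With these rules, a title such as "Design the perfect URL" becomes design-the-perfect-url, which tells both people and search engines far more than a 36-character identifier.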

Your URLs should be free of .php, .aspx and so forth. File extensions are not forward-compatible, so if you change backend systems and all your URLs contain .aspx you are forced to do server-side rewriting for every single page on your site. Costly, inefficient and completely unnecessary. The .html extension isn’t really recommended either, but if you’re confident you’ll only ever serve the pages you’re building as static files it’s an acceptable technique.
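If a site already exposes extension-based URLs, the usual remedy is a single pattern-based redirect to the extension-free form rather than one rule per page. Here is a rough sketch of that mapping; the list of extensions and the helper name `extensionless_path` are assumptions, and a real deployment would normally express this as a web server rewrite rule.

```python
import posixpath
from typing import Optional

# Assumption: these are the backend-specific extensions being retired.
LEGACY_EXTENSIONS = {".php", ".aspx", ".asp", ".jsp"}


def extensionless_path(path: str) -> Optional[str]:
    """Return the extension-free path to redirect to, or None if none is needed."""
    root, ext = posixpath.splitext(path)
    if ext.lower() not in LEGACY_EXTENSIONS:
        return None
    # /about/index.php should land on /about/ rather than /about/index.
    if posixpath.basename(root) == "index":
        return posixpath.dirname(root).rstrip("/") + "/"
    return root
```

Paths that never carried a backend extension return None and are served as they are, so the rule can stay in place indefinitely without touching new URLs.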

In a good, hackable URL, a human can adjust or remove parts of the path and get expected results from your site. Hackable URLs give your visitors better orientation around your pages and enable them to move up levels easily. An example is yourdomain.com/blog/2011/05/20/some-article. Trimming that URL back to each forward slash should produce predictable results. For example, yourdomain.com/blog/2011/05/20/ should return all posts published on 20 May 2011. yourdomain.com/blog/2011/05/ will give an overview of May 2011’s posts, while yourdomain.com/blog/2011/ could be used to bring up an overview of 2011’s posts, or, if that’s too granular, just post totals for each month. yourdomain.com/blog/ should return the latest updates, regardless of their actual publication date.
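The blog example can be sketched as a small resolver in which every level of the path maps to its own view. This is an illustration under assumptions, not a specific framework's router: the view names and the `resolve` function are hypothetical.

```python
from typing import Dict, Optional

# One view per path depth below /blog/ (names are illustrative).
VIEWS = ("latest", "year", "month", "day", "article")
KEYS = ("year", "month", "day", "slug")


def resolve(path: str) -> Optional[Dict[str, str]]:
    """Map a blog path to a view name plus the parameters that view needs."""
    segments = [s for s in path.split("/") if s]
    if not segments or segments[0] != "blog":
        return None
    rest = segments[1:]
    if len(rest) > len(KEYS):
        return None
    # The year, month and day segments must be numeric; the slug is free-form.
    if not all(part.isdigit() for part in rest[:3]):
        return None
    route = {"view": VIEWS[len(rest)]}
    route.update(zip(KEYS, rest))
    return route
```

Because each shorter path resolves to a broader view, a visitor who deletes the trailing segment of an article URL lands on that day's posts instead of an error page.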

Because URLs are such an important part of your website or application, they ought to be among the first things you plan and work out with your team. Not just because you don’t want to have to change them over time, but because creating a great structure up front significantly helps with understanding and crystallising your users’ needs and requirements, as well as your own business requirements.

Once you have your URL structure, you can quickly and easily plot out a complete site map. This helps information architects design a great hierarchy and navigation, back-end engineers work efficiently, and front-end developers turn the scope of sections and pages into clean markup and code. From the conceptual design phase onwards, a great URL structure that’s designed up front and collaboratively will help to make your web product better in every way.
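As a rough illustration of how a planned URL list turns into a site map, the sketch below folds a handful of paths into a nested tree and prints it as an indented outline. The example paths and the function names `build_site_map` and `render` are assumptions made for this sketch.

```python
from typing import Dict, Iterable


def build_site_map(paths: Iterable[str]) -> Dict[str, dict]:
    """Fold a flat list of URL paths into a nested dictionary of sections."""
    tree: Dict[str, dict] = {}
    for path in paths:
        node = tree
        for segment in path.strip("/").split("/"):
            if segment:
                node = node.setdefault(segment, {})
    return tree


def render(tree: Dict[str, dict], depth: int = 0) -> str:
    """Return the tree as an indented outline, one section per line."""
    lines = []
    for name in sorted(tree):
        lines.append("  " * depth + "/" + name)
        child = render(tree[name], depth + 1)
        if child:
            lines.append(child)
    return "\n".join(lines)


planned = ["/blog/2011/05/", "/blog/2011/06/", "/about/"]
print(render(build_site_map(planned)))
```

The outline makes gaps in the hierarchy easy to spot before any page is built.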


