More than you ever wanted to know about project-wide reference IDs.

The Urubu documentation discusses project-wide reference IDs in at least three different places. This page tries to merge the information together, rephrasing it so that it's easier to understand for a web development/markdown newbie such as myself. Towards the end of the page I cover what I originally thought was an Urubu bug, but turns out to be a simple case of unexpected name clashes.

Background

As I redid my website with Urubu's assistance, I got the idea of devoting a few new pages to the process itself. (You are currently reading those very pages.) I quickly discovered some issues related to using the string "urubu."

Two things I concluded were: you can't name a directory "urubu", and you can't name a page "urubu". Trying to do so leads to errors when you run make, and, if you do manage to get the website to build, it sends your users to the Urubu Home Page instead of to your own page named "Urubu."

I had work-arounds for these issues. Simply name your directories or pages something else—"urubumigration," for example, or even "urubux".

I knew that the cause of this issue had something to do with the Project-Wide Reference IDs discussed in the Urubu Manual. But, being nearly illiterate as a web developer, I couldn't make sense of what the documentation there was trying to tell me.

Eventually I found other places in the (copious and usually excellent) Urubu documentation that also cover reference IDs, and after some trial and error it has all become clearer to me. For myself, and for anyone else who may find it helpful, I provide my current understanding of what the documentation has to say about this topic.

I use the Urubu Quickstart project as a vehicle for my examples. You may find it helpful to refer to a local copy of the project as you read on.

Sources in the Documentation

I've found three places where the documentation covers the topic of reference IDs, both page- and project-wide.

  • The Reference IDs passage in the Quickstart project gives some bare-bones information on the topic.
  • The Reference Links section of the Authoring page of the Manual provides valuable background on why project-wide reference IDs were created in the first place. It also has a discussion of how to use anchors in reference links.
  • The Project-Wide Reference IDs section provides the most information on the topic, but presumes much background knowledge on the part of the reader.

References in Markdown

In the Creating Links section of my Markdown Tips page I describe the three ways of creating a hyperlink with markdown. The Reference Style defines the reference—that is, says what it links to—on a page, and then uses the reference elsewhere on the same page. Here's an example:

[Urubu Manual]: http://urubu.jandecaluwe.com/manual/

... as mentioned in the [Urubu Manual], 
which you can find on the [Urubu Home page], ...

[Urubu Home page]: http://urubu.jandecaluwe.com

You see that the "Urubu Manual" and "Urubu Home page" references are defined on the same page in which they are used. (Note also that where they are defined on the page is not important.) If we wanted to hyperlink to the Urubu Manual using the reference style approach on some other page, we would have to create exactly the same definition on that page as well.

Project-Wide Ref IDs

Urubu extends the reference style of linking with new functionality called project-wide reference IDs. You use a reference ID in the same way as a reference (with square brackets, as in [Urubu Manual]), but a project-wide reference ID can be defined in one place and used anywhere else in the project. Think of it as a global constant.

Ref ID Creation

Project-wide reference IDs are created in two ways:

  • Explicitly: At the bottom of Quickstart's _site.yml file you will find a "reflinks" section, which defines some external reference IDs. You could add your own reference IDs here. It would make sense, for example, if you found yourself creating links to the same Internet page over and over.

  • Implicitly: A reference ID is created automatically for every page and folder you add to your website.

It should be clear that links to external web pages would be created explicitly, while a link to an internal page on your own website would use an implicitly-created reference ID.
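To make the implicit case concrete, here is a minimal Python sketch of how one reference ID per page and folder could be derived from a project's markdown files. It is an illustration only, not Urubu's actual code: the function name implicit_ref_ids is hypothetical, and the assumption that IDs look like site-absolute paths (such as /manual/intro) is based only on the IDs that appear in Urubu's warning messages.

```python
import posixpath

def implicit_ref_ids(markdown_files):
    """Derive one reference ID per page and per folder (simplified model)."""
    ids = set()
    for path in markdown_files:
        stem, _ext = posixpath.splitext(path)
        if posixpath.basename(stem) == "index":
            # Assumption: an index.md file stands for the folder that contains it.
            folder = posixpath.dirname(stem)
            ids.add("/" + folder if folder else "/")
        else:
            # Any other markdown file becomes a page with its own ID.
            ids.add("/" + stem)
    return sorted(ids)

quickstart_files = [
    "index.md", "customize.md", "advanced.md",
    "manual/index.md", "manual/intro.md",
]
print(implicit_ref_ids(quickstart_files))
```

In this model, the explicit reflinks from _site.yml would sit beside these IDs in the same lookup table, which is why the two kinds can collide.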

Ref ID Usage

You can use these global variables anywhere in your markdown file. That is, in both the "yaml front matter" at the top, and in the page content which follows the divider. (See Parts of a Markdown File.)

Ref ID Examples

The Urubu Quickstart project uses both external, explicitly-created reference IDs and internal, implicitly-created ones.

Explicitly-Created

The "Start" page offers an example of the usage of a project-wide reference ID explicitly defined in the _site.yml file.

_site.yml:

reflinks:
    urubu:
        url: http://urubu.jandecaluwe.com
        title: Urubu

start.md:

Urubu Quickstart is a companion site of [urubu], a tool to develop static websites.

Urubu reads these two sources as it generates the Quickstart website. When finished, the start.html page looks like this in your browser:

Urubu Quickstart is a companion site of Urubu, a tool to develop static websites.

See The Link Text section below to find out how our static website generator decided to make "Urubu" the link text in this example.

Implicitly-Created

Implicitly created on a page. In the project root, the Quickstart project has two files—customize.md and advanced.md. As you probably know by now, each of these .md files will cause a page to be generated for the website. As you also know, the title of the page is determined by the value of the title keyword in that markdown file's yaml front matter. For these two files, the titles are "Customize" and "Advanced features," respectively.

The Urubu build sees to it that a project-wide reference ID is automatically created for both of these pages. That is what this snippet from customize.md leverages.

For more info on customization possibilties, check the [advanced] page.

When this markdown page is generated, the page looks like this in your browser:

For more info on customization possibilties, check the Advanced features page.

Before reading on, take a minute to think about how cool this is. ... Done? Okay, let's move on to the next type of implicitly-generated project-wide reference ID.

Implicitly created on a directory. The Quickstart project has a directory called "manual," so a reference ID is implicitly created on that string when Urubu generates the website. In content.md, the Quickstart project makes use of this reference ID, thus:

As an example, see [manual].

When this markdown page is converted to HTML, the page looks like this in your browser:

As an example, see Manual demo.

The use of the reference ID [manual] in the example above works by beginning at the location in the project of the page that uses the reference ID. This is the content.md page at the root of the project. Then Urubu looks for a folder or a page in the same directory answering to the reference ID of "manual." It finds that folder, and since the reference does not specify a page, Urubu links to the index page in that directory. The index.md file in that directory has its title set to "Manual demo," and that title is used for the link text.

Relative Paths

The examples above cover the situation where the page or directory being linked to is in the same directory as the page specifying the link.

To link to pages and folders not in the same location as the page using the reference ID, you have to specify the relative path of the page—that is, relative to the page where you want to link from. For example, if we had wanted to link to the intro.md page in the manual folder, our use of the reference ID would have had to look like this: [manual/intro].
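The idea of resolving a reference relative to the page that uses it can be sketched in a few lines of Python. This is only a model of the behavior described above, not Urubu's implementation; the function name resolve_relative and the use of site-absolute paths for directories are assumptions made for the example.

```python
import posixpath

def resolve_relative(ref, from_dir):
    """Join a reference onto the directory of the page that uses it."""
    # normpath collapses "./" components left over after the join.
    return posixpath.normpath(posixpath.join(from_dir, ref))

# References written on a page in the project root:
print(resolve_relative("manual", "/"))
print(resolve_relative("manual/intro", "/"))
# A reference written on a page inside the manual folder:
print(resolve_relative("./urubu", "/manual"))
```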

The Link Text

This section restates some Urubu behavior which the examples above have already illustrated. In other words, it's a bit repetitive, but I think the information is worth emphasizing.

Take a look at the highlighted string "Manual demo" in the example above. That is called the link text.

The Inline Style of linking to other pages sets the link text to whatever is inside the square brackets.

Just go to [Google's home page](http://www.google.com) and google it.

The Reference Style also gets its link text from the reference, which is also inside the square brackets.

[Urubu Manual]: http://urubu.jandecaluwe.com/manual/

... as mentioned in the [Urubu Manual], ...

For Urubu's project-wide reference IDs, it's a little different.

As we've seen, in configuring the link text of a hyperlink created via a reference ID, Urubu uses the title keyword.

The different types of project-wide reference IDs have different ways of using the title keyword to create a hyperlink's link text.

  • Reference IDs explicitly created in the _site.yml file have a title keyword whose value is used to populate the so-called title of the link. We see this in the Explicitly Created example above, with the "Urubu" title.
  • Reference IDs automatically created for pages get their title from the title keyword from that page's "yaml front matter".
  • Reference IDs automatically created for directories get their title from the title keyword in the yaml front matter of that directory's index file. We see this in the Implicitly-Created examples above, where "Advanced features" and "Manual demo" appear as the link text.
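The three rules above can be sketched as a single lookup. The dictionaries below are hypothetical stand-ins filled with the Quickstart values quoted on this page; Urubu's real data structures are almost certainly different.

```python
# Hypothetical stand-ins for the three sources of link text.
REFLINKS = {"urubu": {"url": "http://urubu.jandecaluwe.com", "title": "Urubu"}}
PAGE_TITLES = {"advanced": "Advanced features"}   # title in advanced.md front matter
FOLDER_INDEX_TITLES = {"manual": "Manual demo"}   # title in manual/index.md front matter

def link_text(ref_id):
    """Return the link text a reference ID would get under the rules above."""
    if ref_id in REFLINKS:
        # Explicit reference ID: the title keyword in _site.yml.
        return REFLINKS[ref_id]["title"]
    if ref_id in PAGE_TITLES:
        # Page: the title keyword in the page's own front matter.
        return PAGE_TITLES[ref_id]
    # Folder: the title keyword in the front matter of its index file.
    return FOLDER_INDEX_TITLES[ref_id]

for ref in ("urubu", "advanced", "manual"):
    print(ref, "->", link_text(ref))
```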

One final note: there may be times when you do not want the link text of one of your internal hyperlinks to be that page's title. The solution to this requirement is to use the inline style of linking, which allows you to put anything at all inside the square brackets.

Linking to a Bookmark

Sometimes you want to be precise in your linking. That is, sometimes you want to link to a particular part of a page, rather than to the top of a page. In case you haven't noticed, I do that all the time.

When you link to a particular part of a page, you are linking to what is called a bookmark, which is a type of destination anchor (as opposed to the source anchor of a link). Here we're going to talk about internal bookmarks, that is, linking to bookmarks on your own websites. Some of the techniques discussed apply to markdown generally, but others are Urubu-specific.

I cover linking to external bookmarks in a section of my Markdown Tips page called Fragments & Slugification. There are a couple of terms introduced there which you'll need to understand for this discussion.

  • A link fragment is the part of a link that specifies a bookmark on the destination web page. For example, let's consider this piece of markdown text: [Reference links](http://urubu.jandecaluwe.com/manual/authoring.html#reference-links). The link is everything inside the parentheses, and the fragment is the piece after the pound sign—reference-links.

  • A slugified URL is one whose pieces have been constructed or modified to make it more readable. Usually this involves eliminating punctuation, replacing spaces with hyphens, and lowercasing what remains. For example, in https://stackoverflow.com/questions/5574042/string-slugification-in-python, the part after the last forward slash has been slugified.

  • A bookmark must have an HTML tag with an id attribute, and it is the id that you reference when you link to that specific part of a page.
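As a rough illustration of the slugification steps just described (drop punctuation, hyphenate spaces, lowercase the rest), here is a small Python sketch. It is a generic slugifier written for this page, not the function Urubu itself uses, so edge cases may come out differently in a real build.

```python
import re
import unicodedata

def slugify(title):
    """Return a lowercase, hyphen-separated, ASCII-only version of a title."""
    # Fold accented characters to their closest ASCII equivalents.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Drop punctuation: anything that is not a word character, space, or hyphen.
    cleaned = re.sub(r"[^\w\s-]", "", ascii_text).strip().lower()
    # Collapse runs of spaces and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", cleaned)

print(slugify("Reference ids"))
print(slugify("Fragments & Slugification"))
```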

Having dispensed with the preliminaries, I can finally get to the main point of this topic:

In Urubu, fragments can be appended to project-wide reference IDs to link to any header on your own pages. Conveniently, Urubu allows you to do this by using either the original, or the slugified, version of your section title.

We'll use the Urubu Quickstart project as an example. Let's suppose that, at the very top of the Start page, we want to link to the "Reference ids" section of the Add content page. Our markdown would look like this:

Here we're inserting a link to the [content#reference-ids] section of the *Add content* page.

Open start.md, paste into the top of the file the sentence above, run make, and in your browser navigate to the Start page. Here's what you'll see:

Here we're inserting a link to the reference-ids section of the Add content page.

"But wait," you say. "'reference-ids' is not the actual title—it's 'Reference ids.' Besides, 'reference-ids' makes it look like I'm a sloppy proofreader."

Anticipating your concern, Urubu lets you use the non-slugified version of the section title in your markdown. When it generates your pages, it handles all the linking for you and also gives you a more attractive link text.

In other words, write this:

Here we're inserting a link to the [content#Reference ids] section of the *Add content* page.

And after you build the website you'll see this:

Here we're inserting a link to the Reference ids section of the Add content page.

How does Urubu allow you to do this? It programmatically adds a slugified id attribute to all of a web site's headers, and converts fragments in markdown to their slugified versions as it builds the HTML. For example, the HTML for the "Reference ids" header looks like this:

<h2 id="reference-ids">Reference ids</h2>

The header above is the destination anchor of the link. The HTML for the link's source anchor is this:

<p>Here we're inserting a link to the <a href="/content.html#reference-ids" title="Add content">Reference ids</a> section of the <em>Add content</em> page.</p>

When you think about it, this is how the Table of Contents in the right-hand sidebar works. In fact, it's practically the only way it could work.
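The conversion from a reference ID with a fragment to the HTML shown above can be modeled in Python. This is a sketch under stated assumptions, not Urubu's code: the PAGES table, the render_ref function, and the simplified slugify helper are all hypothetical, and only the one Quickstart page used in the example is included.

```python
import re

# Hypothetical page table: reference ID -> (URL, page title).
PAGES = {"content": ("/content.html", "Add content")}

def slugify(text):
    """Simplified slugifier: drop punctuation, lowercase, hyphenate spaces."""
    cleaned = re.sub(r"[^\w\s-]", "", text).strip().lower()
    return re.sub(r"[-\s]+", "-", cleaned)

def render_ref(ref):
    """Render 'page' or 'page#fragment' as an HTML link."""
    ref_id, _, fragment = ref.partition("#")
    url, title = PAGES[ref_id]
    if not fragment:
        # No fragment: link to the page and use its title as the link text.
        return f'<a href="{url}" title="{title}">{title}</a>'
    # With a fragment: the href gets the slug, the link text keeps what was typed.
    return f'<a href="{url}#{slugify(fragment)}" title="{title}">{fragment}</a>'

print(render_ref("content#Reference ids"))
print(render_ref("content#reference-ids"))
```

Both calls produce the same href. Only the link text differs, which matches the two browser results shown earlier.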

A False Alarm

If you try to take one of these shortcuts that Urubu offers you to the next level, you may be disappointed. Suppose you decide to create a bookmark (aka a Destination Anchor) on one of your pages, and you decide you'll be really devilish and try to link to it with a project-wide reference ID.

  • As we've seen, you can use a reference ID ("content" in the example below) with a tag's id attribute (e.g. "reference-ids") to serve as the source anchor when the destination is a header:
Here we're inserting a link to the [content#reference-ids] section of the *Add content* page.
  • And if you read my documentation on destination anchors (which I linked to above) you know that it is completely kosher to write this as a destination anchor:
<p id="myDestination">This paragraph has been bookmarked.</p>
  • So it would be natural for you to assume that you could refer to the destination anchor above with the following:
Here we're inserting a link to the [content#myDestination] section of the *Add content* page.

Alas, you can't. Or rather, you can, as long as you're willing to put up with a false warning when you run make.

(venv) $ make
python -m urubu build
.../python3.4/site-packages/urubu/project.py:436: 
UrubuWarning: in /urubumigration/markdowntips: 
Undefined anchor: '/urubumigration/approach#myDestination'
  urubu_warn(_warning.undef_anchor, msg=ar, fn=info['id'] )

I call it a "false warning" because the link still works.

To avoid the warning, use the following approach instead.

Here we're inserting a link to the [My Destination](./content.html#myDestination) section of the *Add content* page.

Finally, this warning isn't always false. See the next section below.

Renaming a Section Header

It might have crossed your mind, while reading the discussion above on linking to section headers, that you run a risk when you rename a section header. What if, somewhere else in your project, there's a link to that automatically-generated id? Wouldn't you then break the link?

Fortunately, Urubu flags the problem by issuing a warning during the build. Here's a snippet of what you might see if you make this mistake.

.../venv/lib/python3.4/site-packages/urubu/project.py:436: 
UrubuWarning: in /urubumigration/projectwideids: 
Undefined anchor: '/urubumigration/projectwideids#ref-id-examples'
  urubu_warn(_warning.undef_anchor, msg=ar, fn=info['id'] )
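
A check of this kind can be sketched in Python as follows. This is a guess at the general shape of such a check, not Urubu's implementation: the function name undefined_anchors and both data structures are hypothetical. If the check only knows about the ids generated for headers, it would also flag a hand-written id such as myDestination, which would be consistent with the false alarm described earlier; that too is an assumption, not something confirmed from Urubu's source.

```python
def undefined_anchors(header_ids_by_page, links):
    """Return (source page, target) pairs whose fragment matches no known header id."""
    missing = []
    for source, target in links:
        page, _, fragment = target.partition("#")
        # Links without a fragment point at the page itself and are always fine here.
        if fragment and fragment not in header_ids_by_page.get(page, set()):
            missing.append((source, target))
    return missing

# The header was renamed, so its generated id changed (the new id is made up here),
# but a link elsewhere on the page still uses the old one.
header_ids = {"/urubumigration/projectwideids": {"ref-id-usage", "ref-id-samples"}}
links = [
    ("/urubumigration/projectwideids", "/urubumigration/projectwideids#ref-id-examples"),
    ("/urubumigration/projectwideids", "/urubumigration/projectwideids#ref-id-usage"),
]
for source, target in undefined_anchors(header_ids, links):
    print(f"in {source}: Undefined anchor: '{target}'")
```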

Name Clashes

Or, Thou Shalt Not Use the Name of Thy Static Website Generator in Vain

As the documentation makes clear, Urubu puts both explicitly- and implicitly-created reference IDs in a single namespace. This means that you could have a name clash if you try to create a reference ID which has the same name as a reference ID that already exists.

Recall from the Ref ID Creation section above that the implicitly-created reference IDs are created on pages and folders in your project. Although two pages in two different directories could have the same name (as could two subfolders in two different folders), this will not cause a name clash, since Urubu resolves these according to the location of the page using the reference ID. (That is, it uses relative paths.) Name clashes will only occur when a page or folder has the same name as an explicitly-created reference ID.

Below I deal with both Name Clashes on a Page and Name Clashes on a Directory. In my experimentation, it seems that you can resolve the first type with relative paths, but that, for name clashes on a directory, you have to rename either the directory or the reference ID that it is clashing with.
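Before getting into the examples, here is a simplified Python model of how such a clash can arise and why adding a path component avoids it. It reflects my reading of the behavior described on this page, not Urubu's actual code; the resolve function, the AmbiguousReference exception, and the way the site is represented as a set of paths are all hypothetical.

```python
import posixpath

class AmbiguousReference(Exception):
    """Raised when a bare name matches both a reflink and a local page or folder."""

def resolve(ref, from_dir, reflinks, site_paths):
    """Resolve a reference used on a page located in from_dir."""
    local = posixpath.normpath(posixpath.join(from_dir, ref))
    # Assumption: only a bare name (no path components) can match a global reflink.
    matches_global = "/" not in ref and ref in reflinks
    matches_local = local in site_paths
    if matches_global and matches_local:
        raise AmbiguousReference(ref)
    if matches_global:
        return reflinks[ref]
    if matches_local:
        return local
    raise KeyError(ref)

reflinks = {"urubu": "http://urubu.jandecaluwe.com"}
site_paths = {"/manual", "/manual/urubu"}

print(resolve("urubu", "/", reflinks, site_paths))          # no local match: external link
print(resolve("./urubu", "/manual", reflinks, site_paths))  # path component: local page
try:
    resolve("urubu", "/manual", reflinks, site_paths)
except AmbiguousReference as exc:
    print("Ambiguous reference:", exc)
```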

Name Clashes on a Page

This is a fairly long example of how you might end up with a name clash if you give a page on your website the same name as a reference ID created in the reflinks section of _site.yml. It also provides a work-around for this, particularly the specific case where you want to name a page "Urubu".

In the Explicitly-Created section above we saw how the "urubu" reference ID is created in the _site.yml file of the Quickstart project. Let's intentionally create a name clash by creating a page that is also called "urubu". We'll do this in the manual directory, where we modify the index.md file thus:

---
title: Manual demo 
layout: index
content:
    - intro 
    - chapter_1 
    - chapter_2 
    - urubu
---

It doesn't make sense to run make yet, because we haven't defined our new page, but if you do try to build you'll see there is no error.

Why isn't there an error? Because the "urubu" reference ID has already been defined. After running make, start your browser at localhost:8000 and navigate to the Manual page via the dropdown list on the right under More. On the Manual page you'll see a nice link labeled "Urubu." Click on it and you'll end up at the Urubu home page.

But let's go on to create our own "urubu" page. Create a new file in the manual directory, name it "urubu.md," and paste these contents into it.

---
title: Urubu 
layout: page 
pager: true
---

On this page I'm going to write about Urubu.

Try to build now and this is what you see:

(venv) $ make
python -m urubu build
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
[...]
    raise UrubuError(_error.ambig_ref, msg=ref, fn=indexfn)
urubu.UrubuError: in manual/index.md: Ambiguous reference, cannot resolve: 'urubu'
make: *** [build] Error 1

In the Project-Wide Reference IDs section of the Urubu Manual, we find a work-around for a name clash: "In case of a name clash with a global reference id, you will have to disambiguate by adding pathname components."

Let's try this work-around in our revised manual/index.md file:

---
title: Manual demo 
layout: index
content:
    - intro 
    - chapter_1 
    - chapter_2 
    - ./urubu
---

Now the project builds. Even better, when you click on the "Urubu" link on our new manual/index.html page, it takes you to our new "Urubu" page.

Name Clashes on a Directory

In the Name Clashes on a Page example, above, I showed how you can use relative paths to disambiguate a page from a reference ID that has the same name.

Here we cover the similar, but slightly different situation in which you want a directory to have the same name as a reference ID. For our example, we'll again create a name clash on the "urubu" reference ID created in _site.yml, but for the case where the clash occurs with a directory name rather than a page name.

For pages, we found we could have our cake and eat it, too. In other words, we figured out a way to keep "urubu" as a name for a page. Alas, I have not found such a solution to the issue of naming a directory "urubu." It would seem that the two possible work-arounds are either:

  • Don't name a directory "urubu." Give it some other name, like "urubumigration" or "urubux."
  • Alternatively, remove from your website any project-wide references to the global reference ID "urubu" which is defined in _site.yml. If you choose this option, you should probably remove or at least comment out the _site.yml definition.

Here's the situation that brings me to this conclusion.

Your Quickstart project has a directory structure like the following. The original project is unchanged except for a new directory called "urubu".

[Image: the Quickstart directory tree, with the new "urubu" directory added]

Here's the index.md file that lists the new directory.

title: Urubu Quickstart
layout: home 
content:
    - start
    - content
    - customize
    - deploy
    - more 
    - urubu
tagline:
    Set up your new Urubu project quickly

And finally, here's a snippet of the error message this setup produces when I try to build it.

(venv) $ make
python -m urubu build
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
[...]
    raise UrubuError(_error.ambig_ref, msg=ref, fn=indexfn)
urubu.UrubuError: in index.md: Ambiguous reference, cannot resolve: 'urubu'
make: *** [build] Error 1

As we've seen, the Urubu documentation says that the solution here is to "disambiguate by adding pathname components".

Sure enough, we can get make past this error by changing "- urubu" in the index.md file to "- ./urubu". The problem is that the project still won't build. The new error looks like this:

(venv) $ make
python -m urubu build
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
[...]
    raise UrubuError(_error.ambig_ref_md, msg=ref, fn=this['fn'])
urubu.UrubuError: in start.md: Ambiguous reference: 'urubu'
make: *** [build] Error 1

"What is happening in start.md?" you may ask yourself. This is what's happening there:

Urubu Quickstart is a companion site of [urubu], a tool to develop static
websites.

As noted in Ref ID Examples above, the square brackets tell the website generator that the word "urubu" is a global reference to some URL. The generator has created two reference IDs for the word "urubu"—one defined in the _site.yml file, and one created by your giving a directory that name. So in the build, when it tries to generate the start.html page and comes to the reference inside the square brackets, Urubu doesn't know which one to create the hyperlink for.

So let's eliminate the creation of the reference ID in the _site.yml file:

reflinks:
#    urubu:
#        url: http://urubu.jandecaluwe.com
#        title: Urubu
    urubu_manual:
        url: http://urubu.jandecaluwe.com/manual
        title: Urubu Manual

Because there is no name clash, we can now build without error.

There's only one problem remaining: we left the reference to the "urubu" link object in the start.md file—[urubu]. If we inspect the generated HTML in _build/start.html, we can see that the reference now hyperlinks to our Urubu directory:

 <p class="lead">Urubu Quickstart is a companion site of <a href="/urubu/" title="Urubu">Urubu</a>, a tool to develop static
websites.  You can use it set up a new Urubu project quickly.</p>

In our browser we can go to "localhost:8000," which brings us to the index.html page under _build. There we can click on the Start button, which brings us to _build/start.html. There we see the sentence "Urubu Quickstart is a companion site of Urubu, a tool to develop static websites," with the word "Urubu" as a link. And, finally, we can click on that link, and the browser takes us to ... the index page of our newly-created "urubu" directory.

So in start.md, let's eliminate the reference to the project-wide ID, which now points to our directory, replacing it with a good old-fashioned markdown link.

Urubu Quickstart is a companion site of [urubu](http://urubu.jandecaluwe.com), a tool to develop static
websites.

Now we build without error, and all our links are correct.