The easier way to use lunr search with Hugo

It might not be immediately obvious, but my blog is a collection of static pages, generated by the Hugo static site generator and updated automatically whenever I push to the GitHub repository. Back when I started using it, I had to decide on a search solution. I ruled out a third-party service (because privacy) and a server-supported one (because security). Instead, I went with lunr.js, which works entirely on the client side.

Now if you want to do the same, you’d better not waste your time on the solution currently proposed by the Hugo documentation. It relies on updating the search index manually with an external tool whenever you update the content. And that tool will often deduce page addresses incorrectly, because only some Hugo configurations are supported.

Eventually I realized that Hugo is perfectly capable of generating a search index by itself. I recently contributed the necessary code to the MemE theme, so by using this theme you get search capability “for free.” But in case you don’t want to switch to a new theme right now, I’ll walk you through the necessary changes.

Generating the search index

Hugo can generate the search index the same way it generates RSS feeds, for example: it’s just another output format. You merely need to add a template for it, e.g. layouts/index.searchindex.json:

[
  {{- range $index, $page := .Site.RegularPages -}}
    {{- if gt $index 0 -}} , {{- end -}}
    {{- $entry := dict "uri" $page.RelPermalink "title" $page.Title -}}
    {{- $entry = merge $entry (dict "content" ($page.Plain | htmlUnescape)) -}}
    {{- $entry = merge $entry (dict "description" $page.Description) -}}
    {{- $entry = merge $entry (dict "categories" $page.Params.categories) -}}
    {{- $entry | jsonify -}}
  {{- end -}}
]

This will generate a JSON file containing a list of all pages. A page entry contains its address, title, contents, description and categories. You can easily add more fields if you want them to be searchable, for example tags.
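
For orientation, here is a rough sketch of what the parsed search.json might look like, written as JavaScript. The sample values are invented, and tags is the hypothetical extra field mentioned above, so it only exists if you add it to the template. The helper missingFields() is likewise made up for illustration: it lists entries that lack a field you plan to index.

```javascript
// Invented sample data in the shape produced by the template above.
// "tags" is a hypothetical extra field; it is absent unless you add it.
var sampleIndex = [
  {
    uri: "/2020/01/01/first-post/",
    title: "First post",
    content: "Plain text of the first post.",
    description: "A short description",
    categories: ["hugo", "search"]
  },
  {
    // Assumption: a page without categories is likely serialized as null.
    uri: "/2020/02/01/second-post/",
    title: "Second post",
    content: "Plain text of the second post.",
    description: "",
    categories: null
  }
];

// Returns the uri of every entry that does not have the given property.
function missingFields(entries, field)
{
  return entries
    .filter(function(entry) { return !(field in entry); })
    .map(function(entry) { return entry.uri; });
}

console.log(missingFields(sampleIndex, "title")); // no entries listed
console.log(missingFields(sampleIndex, "tags"));  // both entries listed
```

A check like this can be handy after changing the template, before you list the new field in the lunr configuration.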

Now you have to make sure the search index is actually generated: the output format needs to be added to the site’s configuration. The following assumes a YAML-formatted configuration and the default existing outputs for the home page:

outputFormats:
  SearchIndex:
    baseName: search
    mediaType: application/json

outputs:
  home:
    - HTML
    - RSS
    - SearchIndex

After rebuilding the website you should have a search.json file in the root directory of the generated site. It’s not going to be tiny, but with gzip compression enabled the download size should be acceptable for most websites.
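
If you want a rough idea of how much gzip helps, a small Node.js script can compare the two sizes. This is only a sketch: compareSizes() is a hypothetical helper, the sample data is a made-up stand-in for a real index, and the public/search.json path in the trailing comment assumes Hugo’s default output directory.

```javascript
var zlib = require("zlib");

// Returns the raw and gzip-compressed byte sizes of a string.
function compareSizes(text)
{
  var raw = Buffer.from(text, "utf8");
  var compressed = zlib.gzipSync(raw);
  return { raw: raw.length, gzipped: compressed.length };
}

// Stand-in for a real index: repetitive text compresses very well,
// so the ratio for your actual content will differ.
var sample = JSON.stringify(new Array(200).fill({
  uri: "/example/",
  title: "Example page",
  content: "Static site search with a client-side index. ".repeat(20)
}));

var sizes = compareSizes(sample);
console.log("raw bytes:", sizes.raw, "gzipped bytes:", sizes.gzipped);

// To measure a real build (the path is an assumption):
// var fs = require("fs");
// console.log(compareSizes(fs.readFileSync("public/search.json", "utf8")));
```

The number your visitors actually download depends on the compression settings of your web server, so treat the result as an estimate.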

Adding the necessary elements

Now you need a search form on your page. For me it looks like this:

<form id="search" class="search" role="search">
  <label for="search-input">
    <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512" class="icon search-icon"><path d="M505 442.7L405.3 343c-4.5-4.5-10.6-7-17-7H372c27.6-35.3 44-79.7 44-128C416 93.1 322.9 0 208 0S0 93.1 0 208s93.1 208 208 208c48.3 0 92.7-16.4 128-44v16.3c0 6.4 2.5 12.5 7 17l99.7 99.7c9.4 9.4 24.6 9.4 33.9 0l28.3-28.3c9.4-9.4 9.4-24.6.1-34zM208 336c-70.7 0-128-57.2-128-128 0-70.7 57.2-128 128-128 70.7 0 128 57.2 128 128 0 70.7-57.2 128-128 128z"/></svg>
  </label>
  <input type="search" id="search-input" class="search-input">
</form>

That’s an SVG icon from Font Awesome being used as the search label. I style this form in such a way that the text field only occupies space when it is focused. In addition, there is an animation to make the icon spin when a search operation is in progress:

@keyframes spin {
  100% {
    transform: rotateY(360deg);
  }
}

.search {
  display: flex;
  justify-content: center;
  border: 1px solid black;
  min-width: 1em;
  height: 1em;
  line-height: 1;
  border-radius: 0.75em;
  padding: 0.25em;
}

.search-icon {
  color: black;
  cursor: pointer;
  width: 1em;
  height: 1em;
  margin: 0;
  vertical-align: bottom;
}

.search[data-running] .search-icon {
  animation: spin 1.5s linear infinite;
}

.search-input {
  border-width: 0;
  padding: 0;
  margin: 0;
  width: 0;
  outline: none;
  background: transparent;
  transition: width 0.5s;
}

.search-input:focus {
  margin-left: 0.5em;
  width: 10em;
}

Finally, we need a template for search results. This element is hidden but will be cloned and filled with data for any page found by the search. Mine looks like this:

<template id="search-result" hidden>
  <article class="content post">
    <h2 class="post-title"><a class="summary-title-link"></a></h2>
    <summary class="summary"></summary>
    <div class="read-more-container">
      <a class="read-more-link">Read More »</a>
    </div>
  </article>
</template>

The JavaScript code

And then you need some JavaScript code to make all of this work. Obviously, you will need the lunr.js script itself. If you have non-English texts on your website, you will also need lunr.stemmer.support.js and the right language pack from the lunr-languages package. And you need some code to connect all of this to the search field. In order to conserve bandwidth, my code only loads the search index when it is needed: the first time a search is performed.

window.addEventListener("DOMContentLoaded", function(event)
{
  var index = null;
  var lookup = null;
  var queuedTerm = null;

  var form = document.getElementById("search");
  var input = document.getElementById("search-input");

  form.addEventListener("submit", function(event)
  {
    event.preventDefault();

    var term = input.value.trim();
    if (!term)
      return;

    startSearch(term);
  }, false);

  function startSearch(term)
  {
    // Start icon animation.
    form.setAttribute("data-running", "true");

    if (index)
    {
      // Index already present, search directly.
      search(term);
    }
    else if (queuedTerm)
    {
      // Index is being loaded, replace the term we want to search for.
      queuedTerm = term;
    }
    else
    {
      // Start loading index, perform the search when done.
      queuedTerm = term;
      initIndex();
    }
  }

  function searchDone()
  {
    // Stop icon animation.
    form.removeAttribute("data-running");

    queuedTerm = null;
  }

  function initIndex()
  {
    var request = new XMLHttpRequest();
    request.open("GET", "/search.json");
    request.responseType = "json";
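    request.addEventListener("load", function(event)
    {
      // A failed request (e.g. 404) or invalid JSON leaves response as null.
      if (!Array.isArray(request.response))
      {
        searchDone();
        return;
      }

      lookup = {};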
      index = lunr(function()
      {
        // Uncomment the following line and replace de by the right language
        // code to use a lunr language pack.

        // this.use(lunr.de);

        this.ref("uri");

        // If you added more searchable fields to the search index, list them here.
        this.field("title");
        this.field("content");
        this.field("description");
        this.field("categories");

        for (var doc of request.response)
        {
          this.add(doc);
          lookup[doc.uri] = doc;
        }
      });

      // Search index is ready, perform the search now
      search(queuedTerm);
    }, false);
    request.addEventListener("error", searchDone, false);
    request.send(null);
  }

  function search(term)
  {
    var results = index.search(term);

    // The element where search results should be displayed, adjust as needed.
    var target = document.querySelector(".main-inner");

    while (target.firstChild)
      target.removeChild(target.firstChild);

    var title = document.createElement("h1");
    title.id = "search-results";
    title.className = "list-title";

    if (results.length == 0)
      title.textContent = `No results found for “${term}”`;
    else if (results.length == 1)
      title.textContent = `Found one result for “${term}”`;
    else
      title.textContent = `Found ${results.length} results for “${term}”`;
    target.appendChild(title);
    document.title = title.textContent;

    var template = document.getElementById("search-result");
    for (var result of results)
    {
      var doc = lookup[result.ref];

      // Fill out search result template, adjust as needed.
      var element = template.content.cloneNode(true);
      element.querySelector(".summary-title-link").href =
          element.querySelector(".read-more-link").href = doc.uri;
      element.querySelector(".summary-title-link").textContent = doc.title;
      element.querySelector(".summary").textContent = truncate(doc.content, 70);
      target.appendChild(element);
    }
    title.scrollIntoView(true);

    searchDone();
  }

  // This matches Hugo's own summary logic:
  // https://github.com/gohugoio/hugo/blob/b5f39d23b8/helpers/content.go#L543
  function truncate(text, minWords)
  {
    var match;
    var result = "";
    var wordCount = 0;
    var regexp = /(\S+)(\s*)/g;
    while (match = regexp.exec(text))
    {
      wordCount++;
      if (wordCount <= minWords)
        result += match[0];
      else
      {
        var char1 = match[1][match[1].length - 1];
        var char2 = match[2][0];
        if (/[.?!"]/.test(char1) || char2 == "\n")
        {
          result += match[1];
          break;
        }
        else
          result += match[0];
      }
    }
    return result;
  }
}, false);
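
To see what the minWords argument of truncate() does, the function can be run on its own outside the browser. The sketch below restates the same logic a little more compactly; the sample sentences are made up.

```javascript
// Same logic as truncate() above: keep at least minWords words, then
// continue until a word ends a sentence or is followed by a line break.
function truncate(text, minWords)
{
  var match;
  var result = "";
  var wordCount = 0;
  var regexp = /(\S+)(\s*)/g;
  while ((match = regexp.exec(text)))
  {
    wordCount++;
    if (wordCount <= minWords)
    {
      result += match[0];
      continue;
    }

    var lastChar = match[1][match[1].length - 1];
    if (/[.?!"]/.test(lastChar) || match[2][0] == "\n")
    {
      // End of sentence or line reached, drop the trailing whitespace.
      result += match[1];
      break;
    }
    result += match[0];
  }
  return result;
}

// The limit falls mid-sentence, so the sentence is completed.
console.log(truncate("One two three four. Five six.", 2));
// prints: One two three four.

// Text shorter than the limit is returned unchanged.
console.log(truncate("Short text", 70));
// prints: Short text
```

So the second parameter is a lower bound rather than a hard limit: a long sentence without punctuation can make the summary considerably longer than 70 words.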

This glue code might require a few changes depending on your setup. You need to adjust the initIndex() function if you use a non-English language (uncomment the this.use() call) or have additional fields in your search index. You also need to adjust the search() function if your search result template is different from mine listed above or if you want the search title to have a different class name.
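
The load-on-first-search idea itself does not depend on lunr or XMLHttpRequest. Here is a minimal sketch, assuming a Promise-based loader: the hypothetical helper createLazyLoader() wraps any loading function so that it runs at most once, no matter how many searches are started while the index is still loading. This is an alternative to the queuedTerm bookkeeping, not what the code above does, and it behaves slightly differently: every pending search runs once the index arrives, not only the most recent one.

```javascript
// Hypothetical helper: wraps a loader so that it is called on first use only.
function createLazyLoader(load)
{
  var pending = null;
  return function()
  {
    if (!pending)
    {
      pending = new Promise(function(resolve)
      {
        resolve(load());
      }).catch(function(error)
      {
        // Forget the failed attempt so that a later search can retry.
        pending = null;
        throw error;
      });
    }
    return pending;
  };
}

// Stand-in loader returning invented data; in a browser this could be
// fetch("/search.json").then(function(response) { return response.json(); })
var loadCount = 0;
var getIndex = createLazyLoader(function()
{
  loadCount++;
  return [{ uri: "/example/", title: "Example page" }];
});

// Two searches started back to back share a single load.
var first = getIndex();
var second = getIndex();
first.then(function(entries)
{
  console.log("entries:", entries.length, "loads:", loadCount);
});
```

Whether this is preferable is a matter of taste. The callback-based version above works without Promise support, which may matter if you care about very old browsers.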

The complete code

The code given above has been mildly simplified; the actual code used by the MemE theme considers a bunch more scenarios. If you want to take a look at the “real thing,” here it is:
