

Adding search to a middleman blog
source link: https://willschenk.com/articles/2015/adding-search-to-a-middleman-blog/

Published June 7, 2015 #middleman #howto #ruby
We’re going to build a simple, naive search for Middleman blogs. We’ll generate a search index at build time, and then use that index to perform the search itself on the client side.
Building the index
When you type something into Google, it doesn’t go and hit every page on the internet to check for a match. It doesn’t even look at every page that it has squirreled away somewhere in the Googleplex. What it consults is an index of the documents out there, and the index points to the page information. (We all know that it’s a lot more complicated than that really, but run with it.)
The first thing we’re going to do is create a very simple version of this index for your site. This goes in a file called source/articles.index.json.erb, which Middleman builds to /articles.index.json, the URL the client code requests later on. The template will:
- Go through all of the articles.
- Add meta data for the article into the master map.
- Find all of the words in the article, by stripping out all of the html tags, making things lowercase, and breaking it apart by white space.
- Insert all of those words into our index.
- Convert the whole sucker to JSON.
<%
  map = { articles: {} }
  index = {}

  blog.articles.each do |article|
    # Metadata shown in the search results, keyed by URL
    map[:articles][article.url] = {
      title: article.title,
      date: article.date,
      tags: article.tags
    }

    words = "#{article.title} #{article.body}"
      .downcase                  # make lowercase
      .gsub( /<.*?>/m, " " )     # replace tags with a space so neighbouring words stay separate
      .gsub( /[^\w\s]/, "" )     # get rid of punctuation, keeping all whitespace
      .split( /\s+/ )            # split into words
      .reject( &:empty? )        # drop any empty string left by leading whitespace
      .sort.uniq

    # Point each word at every article it appears in
    words.each do |w|
      index[w] ||= []
      index[w] << article.url
    end
  end

  map[:index] = index
%>
<%= map.to_json %>
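To make the shape of the generated JSON concrete, here is a rough JavaScript sketch of the same steps run over two made-up articles. The function name buildIndex, the sample URLs, and the sample bodies are assumptions for illustration only; the real index is produced by the ERB template at build time.

```javascript
// Illustrative sketch only: mirrors the template's steps in plain JavaScript.
// buildIndex and the sample articles are hypothetical.
function buildIndex(articles) {
  // Object.create(null) avoids clashes with inherited keys such as "constructor"
  var map = { articles: {}, index: Object.create(null) };
  articles.forEach(function (article) {
    map.articles[article.url] = { title: article.title };
    var text = (article.title + " " + article.body)
      .toLowerCase()
      .replace(/<[^>]*>/g, " ")        // drop HTML tags
      .replace(/[^a-z0-9_\s]/g, "");   // drop punctuation
    var seen = Object.create(null);    // words already recorded for this article
    text.split(/\s+/).forEach(function (word) {
      if (word === "" || seen[word]) { return; }
      seen[word] = true;
      if (!map.index[word]) { map.index[word] = []; }
      map.index[word].push(article.url);
    });
  });
  return map;
}

var sample = buildIndex([
  { url: "/a/", title: "Adding search", body: "<p>Search a Middleman blog.</p>" },
  { url: "/b/", title: "Static sites", body: "<p>Middleman builds static sites.</p>" }
]);
console.log(JSON.stringify(sample.index.middleman)); // ["/a/","/b/"]
console.log(JSON.stringify(sample.index.search));    // ["/a/"]
```

Each word maps to the list of article URLs that contain it, and the articles map holds whatever metadata the results list needs.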
Now let’s add some markup to the blog
I’m sticking this in the header, as you see above:
%form.navbar-form.navbar-right#search{ role: "search" }
  .form-group.dropdown
    .input-group
      %input.form-control#search_box{ type: "text", placeholder: "Search", autocomplete: "off" }
      %span.input-group-btn
        %button.btn.btn-default
          %span.glyphicon.glyphicon-search
    %ul.dropdown-menu.dropdown-menu-left.results
      = link_to "Title", "/url"
Loading and Querying the index
OK, let’s build this from the ground up. All of this goes into application.js.
First we create a function that loads the index if we need it. We’re going to use a promise here, so if multiple requests come in at the same time, only one will go to the server:
var article_index = function() {
  if( window.article_promise === undefined ) {
    console.log( "Loading article index" );
    window.article_promise = $.Deferred();
    $.get( "/articles.index.json" ).done( function(data) {
      console.log( "Loaded article index" );
      window.article_promise.resolve( data );
    } );
  }
  return window.article_promise;
};
This is called like article_index().done( function( index ) { console.log( index ) } ).
The second time it is called, it returns the first promise again, so everything is nice and fast.
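The caching idea does not depend on jQuery. As a rough sketch, the same "load once, share the promise" pattern with a native Promise might look like this. The names makeCachedLoader, fakeFetch, and cachedIndex are hypothetical, and fakeFetch only stands in for the real request.

```javascript
// Sketch of "load once, share the promise". All names here are hypothetical.
function makeCachedLoader(load) {
  var cached;                       // undefined until the first call
  return function () {
    if (cached === undefined) {
      cached = load();              // only the first caller triggers the load
    }
    return cached;                  // every later caller gets the same promise
  };
}

var loadCount = 0;
function fakeFetch() {              // stand-in for the real request
  loadCount += 1;
  return Promise.resolve({ articles: {}, index: {} });
}

var cachedIndex = makeCachedLoader(fakeFetch);
var first = cachedIndex();
var second = cachedIndex();
console.log(first === second, loadCount); // true 1
```

Because callers share one promise, anything attached while the request is still in flight simply waits for the same response.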
To query the index itself we need to look through all of the words and return a list of urls that match:
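var match_index = function( word, index ) {
  var result = [];
  // Escape regex metacharacters so input like "c++" or "(" cannot throw
  var matcher = new RegExp( word.replace( /[.*+?^${}()|[\]\\]/g, "\\$&" ) );
  // i is each indexed word; a match anywhere inside the word counts
  $.each( index.index, function( i ) {
    if( matcher.test( i ) ) {
      $.each( index.index[i], function( idx, url ) {
        // Only add each URL once
        if( result.indexOf( url ) < 0 ) {
          result.push( url );
        }
      } );
    }
  } );
  return result;
};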
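One consequence worth seeing on sample data: because the query is tested against every indexed word, a partial word matches too. The sketch below is a jQuery-free approximation of that lookup for plain-text queries; matchWords and sampleIndex are made-up names and data, not part of the site's code.

```javascript
// Plain-JavaScript sketch of the lookup; names and data are hypothetical.
function matchWords(query, wordIndex) {
  var urls = [];
  Object.keys(wordIndex).forEach(function (word) {
    if (word.indexOf(query) === -1) { return; }   // substring match
    wordIndex[word].forEach(function (url) {
      if (urls.indexOf(url) < 0) { urls.push(url); }   // keep URLs unique
    });
  });
  return urls;
}

var sampleIndex = {
  middleman: ["/a/", "/b/"],
  middle: ["/c/"],
  search: ["/a/"]
};
var midMatches = matchWords("mid", sampleIndex);
var searchMatches = matchWords("search", sampleIndex);
console.log(JSON.stringify(midMatches));    // ["/a/","/b/","/c/"]
console.log(JSON.stringify(searchMatches)); // ["/a/"]
```

This gives search-as-you-type behaviour for free, at the cost of scanning every word in the index on each keystroke.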
Now let’s build a simple search. This is a little complicated, since we need to compute the intersection of the results if the user types in multiple words. Here’s what’s happening:
- We create a promise, since we may need to wait for the index to load.
- We split the search term into multiple words.
- Collect the results of the match_index function for each word.
- Compute the intersection of all the results.
- Look up the metadata based on the URL.
- Resolve the promise with the results.
var find_article = function( search ) {
  var search_results = $.Deferred();
  article_index().done( function( index ) {
    // Trim both ends, then split the search by whitespace
    var words = $.trim( search.toLowerCase() ).split( /\s+/ );
    // Look up the matches for each word
    // Note: using $.map seems to flatten the result, so build the list by hand.
    var full_results = [];
    $.each( words, function( i, word ) {
      full_results.push( match_index( word, index ) );
    } );
    var urls = full_results[0];
    // If there are multiple words, compute the intersection:
    // keep only the URLs that show up in every word's result list
    if( full_results.length > 1 ) {
      var url_counts = {};
      $.each( full_results, function( i, set ) {
        $.each( set, function( j, url ) {
          url_counts[url] = (url_counts[url]||0)+1;
        } );
      } );
      urls = [];
      $.each( url_counts, function( url, count ) {
        if( count === full_results.length ) {
          urls.push( url );
        }
      } );
    }
    // Pull in the metadata
    var results = {};
    $.each( urls, function( i, url ) {
      results[url] = index.articles[url];
    } );
    search_results.resolve( results );
  } );
  return search_results;
};
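The intersection step is the least obvious part, so here it is in isolation as a small sketch. It relies on each per-word list containing a URL at most once, which the lookup function guarantees; intersectAll is a hypothetical name and the URLs are made up.

```javascript
// Sketch of the counting-based intersection; intersectAll is a hypothetical name.
// Assumes no list contains the same URL twice.
function intersectAll(lists) {
  if (lists.length === 0) { return []; }
  var counts = Object.create(null);
  lists.forEach(function (list) {
    list.forEach(function (url) {
      counts[url] = (counts[url] || 0) + 1;
    });
  });
  // A URL is in the intersection only if every list contributed it
  return Object.keys(counts).filter(function (url) {
    return counts[url] === lists.length;
  });
}

var both = intersectAll([["/a/", "/b/"], ["/b/", "/c/"]]);
var none = intersectAll([["/a/"], ["/c/"]]);
console.log(JSON.stringify(both)); // ["/b/"]
console.log(JSON.stringify(none)); // []
```

Counting touches each URL once per list, which is plenty for a blog-sized index.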
Wiring it up
First we need to call our code when the user types something into the search box:
$(function() {
  $( "#search_box" ).on( 'input', search_typing );
});
Then we wire everything together:
- If the field is empty, hide the dropdown.
- Otherwise, show the dropdown and a loading message.
- Call find_article.
- When it returns, put the results in the results dropdown.
var search_typing = function() {
  var val = $(this).val();
  var menu = $( "#search .dropdown-menu" );
  // Build a disabled entry; .text() keeps the typed value from being parsed as HTML
  var message = function( text ) {
    return $( "<li class='disabled'>" ).append( $( "<a href='#'>" ).text( text ) );
  };
  if( val == "" ) {
    $( "#search .dropdown" ).removeClass( "open" );
  } else {
    $( "#search .dropdown" ).addClass( "open" );
    menu.empty().append( message( "Loading " + val ) );
    find_article( val ).done( function( results ) {
      menu.empty();
      if( $.isEmptyObject( results ) ) {
        menu.append( message( "No results for: " + val ) );
      }
      $.each( results, function( url, info ) {
        menu.append( $( "<li>" ).append( $( "<a>" ).attr( "href", url ).text( info.title ) ) );
      } );
    } );
  }
};
Next steps
- Language stemming
- Logical operations
- Showing more metadata in the search results.