6.3 KiB
layout | title | date | permalink | categories | author |
---|---|---|---|---|---|
post | Autocomplete using Tries | 2015-07-24 09:44:00 | autocomplete-predict-trie/ | programming | Mahdi |
In this article, I'm going over creating an autocompletion/prediction system using a data-structure called Trie, it's fast and easy to customize.
Trie
Trie is a simple data-structure most commonly used as a dictionary, it looks like so:
As you see, it's just a tree, a set of nodes connected to other [child] nodes, but the nodes have a special relationship:
Each child node extends it's parent with one extra character.
{% highlight javascript %} // Something like this child.value = parent.value + 'c'; {% endhighlight %}
It's pretty easy to traverse this tree and predict the next possible words.
Implementation
We're going to use ES6 classes to create our Trie
and Node
classes.
Let's start with our simple Node class: {% highlight javascript %} class Node { constructor(value = '') { this.value = value; this.children = []; } } {% endhighlight %}
Unlike binary trees where each node has a left and right child, Trie nodes don't necessarily have a limit on how many children they can have.
Trie class: {% highlight javascript %} class Trie { constructor() { this.root = new Node(); }
add(value, parent = this.root) { for (let i = 0, len = value.length; i < len; i++) { let node = parent.children.find(child => child.value[i] === value[i]);
if (!node) {
node = new Node(value.slice(0, i + 1));
parent.children.push(node);
}
parent = node;
}
return parent;
}
find(value, parent = this.root) { for (let i = 0, len = value.length; i < len; i++) { parent = parent.children.find(child => child.value[i] === value[i]);
if (!parent) return null;
}
return parent;
} } {% endhighlight %} Every Trie must have a root node with empty value, that's how our single-character nodes follow the rule of Tries.
Ok, our first method, add
handles adding a value to the trie, creating
necessary parent nodes for our value. At each iteration, we compare the i
th
character of our value, with i
th character of current node's children's value,
if we find one, we continue to search the next branch, else, we create a node
with value.slice(0, i + 1)
and move onto the created node.
It might be a little hard to grasp at first, so I created a visualization of this method to help you understand it easier, take a look: Trie Visualization
Then we have our find method, which searches for the given value in the trie. The algorithm for searching is the same, comparing by index and moving to the next branch.
Example
That's it for our simple Trie class, now let's create an actual input with autocomplete functionality using our Trie.
{% highlight html %}
I put some random names and stuff into three categories, results: data.json
Now we have to create a Trie of our data: {% highlight javascript %} const trie = new Trie();
let data = {...}; // read from data.json
for (let category in data) { for (let item of data[category]) { let node = trie.add(item); node.category = category; } } {% endhighlight %} As simple as that, our trie is made, it looks like this: Data
Now, let's actually show results: {% highlight javascript %} const input = document.querySelector('input'); const results = document.querySelector('#results');
input.addEventListener('keyup', () => { results.innerHTML = '';
const nodes = trie.find(input.value);
if (!nodes) return;
for (let node of nodes.children) {
const category = node.category ? - ${node.category}
: '';
results.innerHTML += `<li>${node.value} ${category}</li>`;
} }); {% endhighlight %} Autocomplete 1 This will only show the instant-childs of the word entered, but that's not what we want, we want to show complete words, how do we do that?
First, we need a way to detect complete words, we can have a flag to recognize
complete words, we can modify our add
method to automatically flag whole words
or we can manually add the flag after adding the node, as we did by setting a
category for our words, so we already have a flag to recognize whole words,
that's our category
property, now let's add a new method to our Trie class to
find whole words.
{% highlight javascript %} ... findWords(value, parent = this.root) { let top = this.find(value, parent); if (!top) return [];
let words = [];
top.children.forEach(function getWords(node) {
if (node.category) words.push(node);
node.children.forEach(getWords);
});
return words;
} ... {% endhighlight %}
And change our event listener like so: {% highlight javascript %} const input = document.querySelector('input'); const results = document.querySelector('#results');
input.addEventListener('keyup', () => { results.innerHTML = '';
const nodes = trie.findWords(input.value); // << Change
if (!nodes.length) return; // << Change
for (let node of nodes) { // << Change
const category = node.category ? - ${node.category}
: '';
results.innerHTML += `<li>${node.value} ${category}</li>`;
} }); {% endhighlight %} Autocomplete 2 Ta-daa!
We have our autocomplete working! Let's add zsh-like-tab-to-next-char functionality.
{% highlight javascript %} input.addEventListener('keydown', e => { // Tab Key if (e.keyCode === 9) { e.preventDefault(); const current = trie.find(input.value);
if (!current.children.length) return;
input.value = current.children[0].value;
} }); {% endhighlight %}
That's it! We have an input with autocomplete and tab-to-next-char. Isn't it awesome?
Pst! I have a repository of algorithm implementations in ES6, you might want to take a look! mdibaiee/harmony-algorithms