<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Autocomplete using Tries</title>
<meta name="description" content="In this article, I'm going over how to build an autocompletion/prediction system using a data structure called a Trie. It's fast and easy to customize.">
<link href="https://fonts.googleapis.com/css?family=Secular+One|Nunito|Mononoki" rel="stylesheet">
<link rel="stylesheet" href="/css/main.css">
<link rel="canonical" href="http://localhost:4000/autocomplete-predict-trie/">
<link rel="alternate" type="application/rss+xml" title="mahdi" href="http://localhost:4000/feed.xml" />
<!--<script src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script>-->
<script>
var channel = new BroadcastChannel('egg');
channel.addEventListener('message', message => {
alert('Got a message from the other tab:\n' + message.data);
});
</script>
</head>
<body>
<header class="site-header">
<h1>
<a class='site-title' href='/'>
mahdi
</a>
</h1>
<nav>
<p>
<a href="/snippets">snippets</a>
<a href="/art">pictures</a>
</p>
<!--<p class='categories'>-->
<!---->
<!---->
<!--<a href="">art</a>-->
<!---->
<!---->
<!---->
<!---->
<!--</p>-->
<p>
<a href='mailto:mdibaiee@pm.me'>email</a>
<a href='https://git.mahdi.blog/mahdi'>git</a>
<a href='https://www.librarything.com/profile/mdibaiee'>librarything</a>
<a href="http://localhost:4000/feed.xml">feed</a>
</p>
</nav>
</header>
<div class="page-content">
<div class="wrapper">
<h1 class="page-heading"></h1>
<div class='post lang-en'>
<div class="post-header">
<h1 class="post-title"><p>Autocomplete using Tries</p>
</h1>
<p class="post-meta">
<span>Jul 24, 2015</span>
<span>Reading time: 8 minutes</span>
</p>
</div>
<article class="post-content">
<p>In this article, I'm going over how to build an autocompletion/prediction system using a data structure called a Trie. It's fast and easy to customize.</p>
<h1 id="trie">Trie</h1>
<p>A <a href="https://en.wikipedia.org/wiki/Trie">Trie</a> is a simple data structure most commonly used as a dictionary. It looks like this:</p>
<p><img src="/img/trie.jpg" alt="Trie" /></p>
<p>As you can see, it's just a <em>tree</em>: a set of nodes connected to other (child) nodes. The nodes have a special relationship, though:</p>
<p>Each child node extends its parent with one extra character.</p>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="c1">// Something like this</span>
<span class="nx">child</span><span class="p">.</span><span class="nx">value</span> <span class="o">=</span> <span class="nx">parent</span><span class="p">.</span><span class="nx">value</span> <span class="o">+</span> <span class="dl">'</span><span class="s1">c</span><span class="dl">'</span><span class="p">;</span></code></pre></figure>
<p>It's pretty easy to traverse this tree and predict the next possible words.</p>
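<p>To make the rule concrete, here's a minimal sketch that lists the value of every node on the path from the root to a word. The <code class="language-plaintext highlighter-rouge">prefixes</code> helper is a hypothetical name used only for illustration; it isn't part of the Trie we build below.</p>

```javascript
// Hypothetical helper: list the value of every node on the path to a word.
function prefixes(word) {
  const chain = [];
  let value = ''; // the root node holds the empty string
  for (const char of word) {
    value = value + char; // child.value = parent.value + one character
    chain.push(value);
  }
  return chain;
}

console.log(prefixes('tea')); // [ 't', 'te', 'tea' ]
```

<p>Every entry in the chain is a valid prefix, which is why walking down the tree is enough to predict what may come next.</p>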
<h2 id="implementation">Implementation</h2>
<p>We're going to use ES6 classes to create our <code class="language-plaintext highlighter-rouge">Trie</code> and <code class="language-plaintext highlighter-rouge">Node</code> classes.</p>
<p>Let's start with our simple Node class:</p>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="kd">class</span> <span class="nx">Node</span> <span class="p">{</span>
<span class="kd">constructor</span><span class="p">(</span><span class="nx">value</span> <span class="o">=</span> <span class="dl">''</span><span class="p">)</span> <span class="p">{</span>
<span class="k">this</span><span class="p">.</span><span class="nx">value</span> <span class="o">=</span> <span class="nx">value</span><span class="p">;</span>
<span class="k">this</span><span class="p">.</span><span class="nx">children</span> <span class="o">=</span> <span class="p">[];</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>Unlike <a href="https://en.wikipedia.org/wiki/Binary_tree">binary trees</a>, where each node has at most a left and a right child, Trie nodes have no fixed limit on how many children they can have.</p>
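<p>As a quick illustration, here's a small sketch with the same Node class holding four children at once. The letters are arbitrary sample values.</p>

```javascript
class Node {
  constructor(value = '') {
    this.value = value;
    this.children = [];
  }
}

const root = new Node();
// Each child extends the root's empty value with one character.
for (const char of ['a', 'b', 'c', 'd']) {
  root.children.push(new Node(root.value + char));
}

console.log(root.children.length); // 4
console.log(root.children.map(child => child.value)); // [ 'a', 'b', 'c', 'd' ]
```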
<p>Trie class:</p>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="kd">class</span> <span class="nx">Trie</span> <span class="p">{</span>
<span class="kd">constructor</span><span class="p">()</span> <span class="p">{</span>
<span class="k">this</span><span class="p">.</span><span class="nx">root</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">Node</span><span class="p">();</span>
<span class="p">}</span>
<span class="nx">add</span><span class="p">(</span><span class="nx">value</span><span class="p">,</span> <span class="nx">parent</span> <span class="o">=</span> <span class="k">this</span><span class="p">.</span><span class="nx">root</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="nx">len</span> <span class="o">=</span> <span class="nx">value</span><span class="p">.</span><span class="nx">length</span><span class="p">;</span> <span class="nx">i</span> <span class="o">&lt;</span> <span class="nx">len</span><span class="p">;</span> <span class="nx">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
<span class="kd">let</span> <span class="nx">node</span> <span class="o">=</span> <span class="nx">parent</span><span class="p">.</span><span class="nx">children</span><span class="p">.</span><span class="nx">find</span><span class="p">(</span><span class="nx">child</span> <span class="o">=&gt;</span> <span class="nx">child</span><span class="p">.</span><span class="nx">value</span><span class="p">[</span><span class="nx">i</span><span class="p">]</span> <span class="o">===</span> <span class="nx">value</span><span class="p">[</span><span class="nx">i</span><span class="p">]);</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="nx">node</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">node</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">Node</span><span class="p">(</span><span class="nx">value</span><span class="p">.</span><span class="nx">slice</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nx">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">));</span>
<span class="nx">parent</span><span class="p">.</span><span class="nx">children</span><span class="p">.</span><span class="nx">push</span><span class="p">(</span><span class="nx">node</span><span class="p">);</span>
<span class="p">}</span>
<span class="nx">parent</span> <span class="o">=</span> <span class="nx">node</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">return</span> <span class="nx">parent</span><span class="p">;</span>
<span class="p">}</span>
<span class="nx">find</span><span class="p">(</span><span class="nx">value</span><span class="p">,</span> <span class="nx">parent</span> <span class="o">=</span> <span class="k">this</span><span class="p">.</span><span class="nx">root</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="nx">len</span> <span class="o">=</span> <span class="nx">value</span><span class="p">.</span><span class="nx">length</span><span class="p">;</span> <span class="nx">i</span> <span class="o">&lt;</span> <span class="nx">len</span><span class="p">;</span> <span class="nx">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">parent</span> <span class="o">=</span> <span class="nx">parent</span><span class="p">.</span><span class="nx">children</span><span class="p">.</span><span class="nx">find</span><span class="p">(</span><span class="nx">child</span> <span class="o">=&gt;</span> <span class="nx">child</span><span class="p">.</span><span class="nx">value</span><span class="p">[</span><span class="nx">i</span><span class="p">]</span> <span class="o">===</span> <span class="nx">value</span><span class="p">[</span><span class="nx">i</span><span class="p">]);</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="nx">parent</span><span class="p">)</span> <span class="k">return</span> <span class="kc">null</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">return</span> <span class="nx">parent</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>Every Trie must have a root node with an empty value. That's what lets our single-character nodes follow the rule of Tries: each one extends the root's empty value by one character.</p>
<p>Our first method, <code class="language-plaintext highlighter-rouge">add</code>, adds a value to the trie, creating any missing parent nodes along the way.
At each iteration, we compare the <code class="language-plaintext highlighter-rouge">i</code>th character of our value with the <code class="language-plaintext highlighter-rouge">i</code>th character of each child of the current node.
If one matches, we continue down that branch; otherwise, we create a node with <code class="language-plaintext highlighter-rouge">value.slice(0, i + 1)</code> and move on to the newly created node.</p>
<p>It might be a little hard to grasp at first, so I created a visualization of this method to make it easier to understand. Take a look:
<a href="https://mdibaiee.github.io/autocomplete-trie/demo/add.html">Trie Visualization</a></p>
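<p>Then we have our <code class="language-plaintext highlighter-rouge">find</code> method, which searches for the given value in the trie. The algorithm is the same: compare by index and move to the matching branch. If no child matches at some step, it returns <code class="language-plaintext highlighter-rouge">null</code>.</p>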
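<p>Here's a small, self-contained sketch of both methods in action. The classes mirror the ones above, except that the loop is written with <code class="language-plaintext highlighter-rouge">entries()</code> instead of a counter (just a stylistic choice for this sketch), and the sample words are made up.</p>

```javascript
class Node {
  constructor(value = '') {
    this.value = value;
    this.children = [];
  }
}

class Trie {
  constructor() {
    this.root = new Node();
  }

  add(value, parent = this.root) {
    for (const [i, char] of value.split('').entries()) {
      // Look for a child whose i-th character matches the i-th character of value.
      let node = parent.children.find(child => child.value[i] === char);
      if (!node) {
        // No matching branch yet: create the prefix node and attach it.
        node = new Node(value.slice(0, i + 1));
        parent.children.push(node);
      }
      parent = node;
    }
    return parent;
  }

  find(value, parent = this.root) {
    for (const [i, char] of value.split('').entries()) {
      parent = parent.children.find(child => child.value[i] === char);
      if (!parent) return null; // the prefix is not in the trie
    }
    return parent;
  }
}

const trie = new Trie();
trie.add('tea');
trie.add('ten');

const node = trie.find('te');
console.log(node.value); // te
console.log(node.children.map(child => child.value)); // [ 'tea', 'ten' ]
console.log(trie.find('x')); // null
```

<p>Note that both words share the <code class="language-plaintext highlighter-rouge">t</code> and <code class="language-plaintext highlighter-rouge">te</code> nodes; only the last character creates a new branch.</p>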
<h1 id="example">Example</h1>
<p>That's it for our simple Trie class. Now let's create an actual input with autocomplete functionality using our Trie.</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt">&lt;input&gt;</span>
<span class="nt">&lt;div</span> <span class="na">id=</span><span class="s">'results'</span><span class="nt">&gt;</span>
<span class="nt">&lt;/div&gt;</span></code></pre></figure>
<p>I put some random names and other items into three categories. Here's the result: <a href="https://mdibaiee.github.io/autocomplete-trie/demo/data.json">data.json</a></p>
<p>Now we have to create a Trie of our data:</p>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="kd">const</span> <span class="nx">trie</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">Trie</span><span class="p">();</span>
<span class="kd">let</span> <span class="nx">data</span> <span class="o">=</span> <span class="p">{...};</span> <span class="c1">// read from data.json</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">category</span> <span class="k">in</span> <span class="nx">data</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">item</span> <span class="k">of</span> <span class="nx">data</span><span class="p">[</span><span class="nx">category</span><span class="p">])</span> <span class="p">{</span>
<span class="kd">let</span> <span class="nx">node</span> <span class="o">=</span> <span class="nx">trie</span><span class="p">.</span><span class="nx">add</span><span class="p">(</span><span class="nx">item</span><span class="p">);</span>
<span class="nx">node</span><span class="p">.</span><span class="nx">category</span> <span class="o">=</span> <span class="nx">category</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>It's as simple as that: our trie is built. It looks like this: <a href="https://mdibaiee.github.io/autocomplete-trie/demo/data.html">Data</a></p>
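<p>The loop above implies that the data is an object mapping each category name to an array of strings. Here's a sketch under that assumption; the categories and words are hypothetical stand-ins, not the actual contents of data.json.</p>

```javascript
class Node {
  constructor(value = '') {
    this.value = value;
    this.children = [];
  }
}

class Trie {
  constructor() {
    this.root = new Node();
  }

  add(value, parent = this.root) {
    for (const [i, char] of value.split('').entries()) {
      let node = parent.children.find(child => child.value[i] === char);
      if (!node) {
        node = new Node(value.slice(0, i + 1));
        parent.children.push(node);
      }
      parent = node;
    }
    return parent;
  }
}

// Hypothetical sample; the real data.json may contain different entries.
const data = {
  fruits: ['apple', 'apricot'],
  animals: ['ape']
};

const trie = new Trie();
for (const category in data) {
  for (const item of data[category]) {
    const node = trie.add(item); // add() returns the node of the full word
    node.category = category;
  }
}

const [a] = trie.root.children;
const [ap] = a.children;
console.log(ap.category); // undefined, because "ap" is only a prefix

const ape = ap.children.find(child => child.value === 'ape');
console.log(ape.category); // animals
```

<p>Only the node that ends a word gets a category; the prefix nodes on the way there stay untagged.</p>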
<p>Now, let's actually show results:</p>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="kd">const</span> <span class="nx">input</span> <span class="o">=</span> <span class="nb">document</span><span class="p">.</span><span class="nx">querySelector</span><span class="p">(</span><span class="dl">'</span><span class="s1">input</span><span class="dl">'</span><span class="p">);</span>
<span class="kd">const</span> <span class="nx">results</span> <span class="o">=</span> <span class="nb">document</span><span class="p">.</span><span class="nx">querySelector</span><span class="p">(</span><span class="dl">'</span><span class="s1">#results</span><span class="dl">'</span><span class="p">);</span>
<span class="nx">input</span><span class="p">.</span><span class="nx">addEventListener</span><span class="p">(</span><span class="dl">'</span><span class="s1">keyup</span><span class="dl">'</span><span class="p">,</span> <span class="p">()</span> <span class="o">=&gt;</span> <span class="p">{</span>
<span class="nx">results</span><span class="p">.</span><span class="nx">innerHTML</span> <span class="o">=</span> <span class="dl">''</span><span class="p">;</span>
<span class="kd">const</span> <span class="nx">nodes</span> <span class="o">=</span> <span class="nx">trie</span><span class="p">.</span><span class="nx">find</span><span class="p">(</span><span class="nx">input</span><span class="p">.</span><span class="nx">value</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="nx">nodes</span><span class="p">)</span> <span class="k">return</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">node</span> <span class="k">of</span> <span class="nx">nodes</span><span class="p">.</span><span class="nx">children</span><span class="p">)</span> <span class="p">{</span>
<span class="kd">const</span> <span class="nx">category</span> <span class="o">=</span> <span class="nx">node</span><span class="p">.</span><span class="nx">category</span> <span class="p">?</span> <span class="s2">`- </span><span class="p">${</span><span class="nx">node</span><span class="p">.</span><span class="nx">category</span><span class="p">}</span><span class="s2">`</span> <span class="p">:</span> <span class="dl">''</span><span class="p">;</span>
<span class="nx">results</span><span class="p">.</span><span class="nx">innerHTML</span> <span class="o">+=</span> <span class="s2">`&lt;li&gt;</span><span class="p">${</span><span class="nx">node</span><span class="p">.</span><span class="nx">value</span><span class="p">}</span><span class="s2"> </span><span class="p">${</span><span class="nx">category</span><span class="p">}</span><span class="s2">&lt;/li&gt;`</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">});</span></code></pre></figure>
<p><a href="https://mdibaiee.github.io/autocomplete-trie/1.html">Autocomplete 1</a>
<img src="/img/autocomplete-1.png" alt="Autocomplete 1" />
This only shows the immediate children of the text entered, but that's not what we want. We want to show <em>complete</em> words. How do we do that?</p>
<p>First, we need a way to detect complete words, and a flag on the node will do. We could modify our <code class="language-plaintext highlighter-rouge">add</code> method to
flag whole words automatically, or we can set the flag manually after adding the node. We already did the latter when we set a category on each word,
so the <code class="language-plaintext highlighter-rouge">category</code> property doubles as our whole-word flag. Now let's add a new method to our Trie class to find
whole words.</p>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="p">...</span>
<span class="nx">findWords</span><span class="p">(</span><span class="nx">value</span><span class="p">,</span> <span class="nx">parent</span> <span class="o">=</span> <span class="k">this</span><span class="p">.</span><span class="nx">root</span><span class="p">)</span> <span class="p">{</span>
<span class="kd">let</span> <span class="nx">top</span> <span class="o">=</span> <span class="k">this</span><span class="p">.</span><span class="nx">find</span><span class="p">(</span><span class="nx">value</span><span class="p">,</span> <span class="nx">parent</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="nx">top</span><span class="p">)</span> <span class="k">return</span> <span class="p">[];</span>
<span class="kd">let</span> <span class="nx">words</span> <span class="o">=</span> <span class="p">[];</span>
<span class="nx">top</span><span class="p">.</span><span class="nx">children</span><span class="p">.</span><span class="nx">forEach</span><span class="p">(</span><span class="kd">function</span> <span class="nx">getWords</span><span class="p">(</span><span class="nx">node</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">node</span><span class="p">.</span><span class="nx">category</span><span class="p">)</span> <span class="nx">words</span><span class="p">.</span><span class="nx">push</span><span class="p">(</span><span class="nx">node</span><span class="p">);</span>
<span class="nx">node</span><span class="p">.</span><span class="nx">children</span><span class="p">.</span><span class="nx">forEach</span><span class="p">(</span><span class="nx">getWords</span><span class="p">);</span>
<span class="p">});</span>
<span class="k">return</span> <span class="nx">words</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">...</span></code></pre></figure>
<p>And change our event listener like so:</p>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="kd">const</span> <span class="nx">input</span> <span class="o">=</span> <span class="nb">document</span><span class="p">.</span><span class="nx">querySelector</span><span class="p">(</span><span class="dl">'</span><span class="s1">input</span><span class="dl">'</span><span class="p">);</span>
<span class="kd">const</span> <span class="nx">results</span> <span class="o">=</span> <span class="nb">document</span><span class="p">.</span><span class="nx">querySelector</span><span class="p">(</span><span class="dl">'</span><span class="s1">#results</span><span class="dl">'</span><span class="p">);</span>
<span class="nx">input</span><span class="p">.</span><span class="nx">addEventListener</span><span class="p">(</span><span class="dl">'</span><span class="s1">keyup</span><span class="dl">'</span><span class="p">,</span> <span class="p">()</span> <span class="o">=&gt;</span> <span class="p">{</span>
<span class="nx">results</span><span class="p">.</span><span class="nx">innerHTML</span> <span class="o">=</span> <span class="dl">''</span><span class="p">;</span>
<span class="kd">const</span> <span class="nx">nodes</span> <span class="o">=</span> <span class="nx">trie</span><span class="p">.</span><span class="nx">findWords</span><span class="p">(</span><span class="nx">input</span><span class="p">.</span><span class="nx">value</span><span class="p">);</span> <span class="c1">// &lt;&lt; Change</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="nx">nodes</span><span class="p">.</span><span class="nx">length</span><span class="p">)</span> <span class="k">return</span><span class="p">;</span> <span class="c1">// &lt;&lt; Change</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">node</span> <span class="k">of</span> <span class="nx">nodes</span><span class="p">)</span> <span class="p">{</span> <span class="c1">// &lt;&lt; Change</span>
<span class="kd">const</span> <span class="nx">category</span> <span class="o">=</span> <span class="nx">node</span><span class="p">.</span><span class="nx">category</span> <span class="p">?</span> <span class="s2">`- </span><span class="p">${</span><span class="nx">node</span><span class="p">.</span><span class="nx">category</span><span class="p">}</span><span class="s2">`</span> <span class="p">:</span> <span class="dl">''</span><span class="p">;</span>
<span class="nx">results</span><span class="p">.</span><span class="nx">innerHTML</span> <span class="o">+=</span> <span class="s2">`&lt;li&gt;</span><span class="p">${</span><span class="nx">node</span><span class="p">.</span><span class="nx">value</span><span class="p">}</span><span class="s2"> </span><span class="p">${</span><span class="nx">category</span><span class="p">}</span><span class="s2">&lt;/li&gt;`</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">});</span></code></pre></figure>
<p><a href="https://mdibaiee.github.io/autocomplete-trie/2.html">Autocomplete 2</a>
<img src="/img/autocomplete-2.png" alt="Autocomplete 2" />
Ta-daa!</p>
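<p>Earlier I mentioned that <code class="language-plaintext highlighter-rouge">add</code> could flag whole words automatically instead of relying on <code class="language-plaintext highlighter-rouge">category</code>. Here's a rough sketch of that alternative. The <code class="language-plaintext highlighter-rouge">isWord</code> property is a hypothetical name chosen for this example, and the sample words are made up.</p>

```javascript
class Node {
  constructor(value = '') {
    this.value = value;
    this.children = [];
    this.isWord = false; // hypothetical flag, not part of the Node class above
  }
}

class Trie {
  constructor() {
    this.root = new Node();
  }

  add(value, parent = this.root) {
    for (const [i, char] of value.split('').entries()) {
      let node = parent.children.find(child => child.value[i] === char);
      if (!node) {
        node = new Node(value.slice(0, i + 1));
        parent.children.push(node);
      }
      parent = node;
    }
    parent.isWord = true; // the last node on the path spells the whole word
    return parent;
  }

  find(value, parent = this.root) {
    for (const [i, char] of value.split('').entries()) {
      parent = parent.children.find(child => child.value[i] === char);
      if (!parent) return null;
    }
    return parent;
  }

  findWords(value, parent = this.root) {
    const top = this.find(value, parent);
    if (!top) return [];

    const words = [];
    // Depth-first walk below the prefix node, collecting flagged nodes.
    top.children.forEach(function getWords(node) {
      if (node.isWord) words.push(node);
      node.children.forEach(getWords);
    });
    return words;
  }
}

const trie = new Trie();
['tea', 'team', 'ten', 'to'].forEach(word => trie.add(word));

console.log(trie.findWords('te').map(node => node.value)); // [ 'tea', 'team', 'ten' ]
```

<p>With this variant, words don't need a category to show up in the results, which may be handy if your data isn't categorized.</p>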
<p>We have our autocomplete working! Let's add zsh-like tab-to-next-char functionality.</p>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="nx">input</span><span class="p">.</span><span class="nx">addEventListener</span><span class="p">(</span><span class="dl">'</span><span class="s1">keydown</span><span class="dl">'</span><span class="p">,</span> <span class="nx">e</span> <span class="o">=&gt;</span> <span class="p">{</span>
<span class="c1">// Tab Key</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">e</span><span class="p">.</span><span class="nx">keyCode</span> <span class="o">===</span> <span class="mi">9</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">e</span><span class="p">.</span><span class="nx">preventDefault</span><span class="p">();</span>
<span class="kd">const</span> <span class="nx">current</span> <span class="o">=</span> <span class="nx">trie</span><span class="p">.</span><span class="nx">find</span><span class="p">(</span><span class="nx">input</span><span class="p">.</span><span class="nx">value</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="nx">current</span> <span class="o">||</span> <span class="o">!</span><span class="nx">current</span><span class="p">.</span><span class="nx">children</span><span class="p">.</span><span class="nx">length</span><span class="p">)</span> <span class="k">return</span><span class="p">;</span>
<span class="nx">input</span><span class="p">.</span><span class="nx">value</span> <span class="o">=</span> <span class="nx">current</span><span class="p">.</span><span class="nx">children</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="nx">value</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">});</span></code></pre></figure>
<p>That's it! We have an input with autocomplete and tab-to-next-char. Isn't it awesome?</p>
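<p>If you'd like to try the tab logic without a browser, here's a sketch that pulls it out into a plain function. <code class="language-plaintext highlighter-rouge">completeNext</code> is a hypothetical helper name, and the sketch assumes you want the text left unchanged when there's nothing to complete.</p>

```javascript
class Node {
  constructor(value = '') {
    this.value = value;
    this.children = [];
  }
}

class Trie {
  constructor() {
    this.root = new Node();
  }

  add(value, parent = this.root) {
    for (const [i, char] of value.split('').entries()) {
      let node = parent.children.find(child => child.value[i] === char);
      if (!node) {
        node = new Node(value.slice(0, i + 1));
        parent.children.push(node);
      }
      parent = node;
    }
    return parent;
  }

  find(value, parent = this.root) {
    for (const [i, char] of value.split('').entries()) {
      parent = parent.children.find(child => child.value[i] === char);
      if (!parent) return null;
    }
    return parent;
  }
}

// Hypothetical helper: the text the input should hold after pressing Tab.
function completeNext(trie, text) {
  const current = trie.find(text);
  // find() returns null for an unknown prefix, so check before reading children.
  if (!current || !current.children.length) return text;
  return current.children[0].value; // first child = text plus one character
}

const trie = new Trie();
trie.add('tea');
trie.add('ten');

console.log(completeNext(trie, 't'));   // te
console.log(completeNext(trie, 'te'));  // tea
console.log(completeNext(trie, 'tea')); // tea (no children, nothing to add)
console.log(completeNext(trie, 'x'));   // x (unknown prefix, left unchanged)
```

<p>In the listener, you could then assign the returned string to <code class="language-plaintext highlighter-rouge">input.value</code>.</p>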
<p><a href="https://mdibaiee.github.io/autocomplete-trie/2.html">Final Result</a></p>
<p><em>Pst! I have a repository of algorithm implementations in ES6, you might want to take a look! <a href="https://github.com/mdibaiee/harmony-algorithms">mdibaiee/harmony-algorithms</a></em></p>
</article>
<div id="commento"></div>
<script defer
src="//commento.mahdi.blog/js/commento.js">
</script>
<script src="/js/heading-links.js"></script>
</div>
</div>
</div>
</body>
</html>