Growing a Trie in JS for Some Slick Word Searches

A trie is nifty little data structure that can be used to very quickly and efficiently check for the spelling of a word.

A little mind exercise: Imagine you're playing a very high stakes game of Scrabble against Grandma and Aunt Helen and you just can't bear to be beaten again. So, you do what any other good Dev would do, and write a computer program to do the heavy lifting for you.

Assuming you have all the permutations of the letters passed into your program, how would you check which ones were valid super high scoring words? Comparing each against a list of dictionary words would be prohibitively slow and Aunt Helen would quickly become suspicious.

That's where the trie structure comes in. A trie is composed of linked nodes which each hold a value, in this case a letter, and links to any subsequent letters that would continue to from a valid word. In this case, the nodes will also hold a property letting us know if that node represents the end of a valid word, so we know when to stop looking.

Essence of a Trie

For example, the word tea (a paltry 3 points, I know) would be composed of four nodes. The root node up at the top is the point of all departure. It hold references to all the possible first letters of any words (in the diagrams case, t, A and i). To check if the word tea exists in the list, we would first look at the root node for a reference to any t child nodes. In this case there is one, so we check the t node for a reference to e, the second letter in our word. That exists, as does a subsequent reference to a. Once we reach the last letter in our word, we will have to check to see if the final node corresponds to an entire word. If it does, we know we have a match!

Trie diagram courtesy of Wikipedia

Show Me the Code, Bro

OK, that's some lovely CS101 fluffiness, but how do we do something with it. JS can nicely model a Trie structure using Objects.

The basic object will have a letter property, corresponding to the letter that node represents, and an isWord property which will be set to true if that node represents the end of a complete word. Letter also defaults to and empty string to represent the root node in the Trie.

var Trie = function(letter) {  
  this.letter = letter || '';
  this.isWord = false;
};

Where it really gets good is inserting new words into our Trie structure. To do that we'll add an insert method to the Trie.prototype.

// Adds a new word into the prefix tree
Trie.prototype.insert = function(word) {  
  var currentNode = this;
  var newNode;
  var ch;
  // Loop through all the letters of the word to be inserted.
  for (var i = 0; i < word.length; i++) {
    ch = word.charAt(i);
    // Check if the current node already has a property corresponding to the next letter
    if (currentNode[ch]) {
      // If it does have that letter already
      // Move current node on to the next child node corresponding to next letter
      currentNode = currentNode[ch];
      // If Next letter node DOESN'T exists
    } else {
      // create a new child node for the next letter
      // move current node on to the new node
      newNode = new Trie(ch);
      currentNode[ch] = newNode;
      currentNode = newNode;
    }
    // When the last letter is reached, set the current node's 
    // isWord property to true to represent the end of a complete
    // word.
    if (i === word.length - 1) {
      currentNode.isWord = true;
    }
  }
};

This method will take in a word parameter and will insert it into our Trie. To do that, it will loop over each letter in the new word, represented as ch, moving down the trie as it goes. To model the connections, we simply use properties on the Trie object with the letter as the key name and the next node as the value.

Using the diagram above as an example, that root node would look like this:

{ 
  t: Trie,
  A: Trie,
  i: Trie,
  isWord: false,
  letter: ''
}

Starting at the root node as the first value for currentNode, it will check if there is a property corresponding to the first letter in the word. If that property already exists, we'll set it as currentNode and move on to the next letter. If it doesn't exist, we'll create a new Trie object with its letter property set to ch, set it as currentNode and move on to the next letter.

    if (currentNode[ch]) {
      // If it does have that letter already
      // Move current node on to the next child node corresponding to next letter
      currentNode = currentNode[ch];
      // If Next letter node DOESN'T exists
    } else {
      // create a new child node for the next letter
      // move current node on to the new node
      newNode = new Trie(ch);
      currentNode[ch] = newNode;
      currentNode = newNode;
    }

Finally, we'll add a check to set the isWord property to true when we've reached the last letter.

// When the last letter is reached, set the current node's 
    // isWord property to true to represent the end of a complete
    // word.
    if (i === word.length - 1) {
      currentNode.isWord = true;
    }

To make use of the insert function, we'll create a new instance of a Trie, which will represent our root node, and then we can drop in words to our heart's content.

var trie = new Trie();  
trie.insert('crepuscular');  

The insert method will allow us to build out a structure similar to the diagram above.

What Goes in Must Come Out

Great, we can add words, but how do we know what's in there?

To search through the existing words, we'll need another method. We'll create a contains method which will take a string as an argument. It will search through the trie and return a boolean value representing if the string is a word in the trie or not.

Trie.prototype.contains = function(str) {  
  var currentNode = this;
  var ch;

  // Loop through each letter in the string
  for (var i = 0; i < str.length; i++) {
    ch = str.charAt(i);
    // Check if the current node has a link to that letter
    if (!currentNode[ch]) {
      // If it doesn't, str isn't a valid word so return false
      return false;
    } else {
      // If it is a valid letter, move on to the next node.
      currentNode = currentNode[ch];
    }
    // If we reach the end of the string and the current node
    // represents a complete word, return true.
    if (i === str.length - 1 && currentNode.isWord) {
      return true;
    }
  }
  // If we reach the end of the string, but it's not a complete
  // word, return false.
  return false;
};

Now we can check if our trie structure contains a given word.

var trie = new Trie();  
trie.insert('crepuscular');  
trie.insert('hirsute');  
trie.insert('cheese');

trie.contains('cheese') // ---> true  
trie.contains('chee'); // ---> false  
trie.contains('askldf'); // ---> really false  
trie.contains('tubular'); // ---> false because we didn't add it yet  

What Next

So we have trie, but we're not that much closer to having something that can help us school Grandma in Scrabble. The trie is a necessary step though. In order to find all possible scrabble words, we have to test all permutations of seven letters. That's a lot of look ups! Although a trie needs a considerable amount of time to preprocess all the words in the dictionary, when that's complete we have the ability to check if a string is a valid word with constant time look up! Instead of looping through the entire dictionary to check each word, we only have to look at as many nodes as letters in the string.

Next up, I'll go over an algorithm that will find and test all combinations of letters in a Scrabble hand.