Sunday, October 30, 2011

Using Roistr to establish meaning

At its heart, Roistr's semantic relevance engine is a very powerful tool. Here, I'll talk about how even the limited web-based demonstration can be used to extract meaning.

Let's take this well-known sentence: The quick brown fox jumps over the lazy dog.

Our question is this: how can we work out which words in this sentence are the key words - the ones that best indicate what it's about?

A word's meaning is established in several ways, but primarily through its own context: the words that surround it are the best guide to what it means. Other things matter too, such as the user's own context (their pre-existing knowledge), but for now we'll concentrate just on each word's own context.

So what do we do? We take each word and compare it against the sentence. The first screenshot shows how this is done: the entire sentence is the category definition, and each meaningful word is compared against it. You can try this yourself.



Then we compare and look at the similarity scores.

The second figure shows the similarities of each word to its parent sentence, all ranked in descending order. What do we see?

The most striking thing is that the word 'fox' has the highest similarity (0.62), closely followed by 'dog' (0.58). Both scores are quite high; they would be lower for longer documents.

These similarity scores tell us that the two words most important to the sentence's meaning are 'fox' and 'dog', in that order. If we were to summarise this sentence in just one word, we would use 'fox', with 'dog' coming a close second.
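For readers who prefer code to screenshots, here is a minimal sketch of the procedure described above: compare each meaningful word against the whole sentence, then rank the words by similarity. It does not use Roistr's engine. The stop word list, the tiny made-up word vectors, and the use of cosine similarity against an averaged sentence vector are all assumptions chosen purely for illustration, so the scores it produces will not match the figures above.

```python
import math
import re

# Hypothetical stop word list: function words we skip when picking
# the "meaningful" words of the sentence.
STOP_WORDS = {"the", "over"}

# Made-up 3-dimensional word vectors, for illustration only.
# A real semantic engine would supply its own representation.
TOY_VECTORS = {
    "quick": (0.2, 0.7, 0.1),
    "brown": (0.1, 0.2, 0.8),
    "fox": (0.9, 0.4, 0.3),
    "jumps": (0.3, 0.9, 0.0),
    "lazy": (0.2, 0.1, 0.4),
    "dog": (0.9, 0.2, 0.3),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def content_words(sentence):
    """Lower-case the sentence and return its unique non-stop words, in order."""
    words = []
    for word in re.findall(r"[a-z']+", sentence.lower()):
        if word not in STOP_WORDS and word not in words:
            words.append(word)
    return words


def sentence_vector(words):
    """Stand-in for the 'category definition': the mean of the word vectors."""
    vectors = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def rank_words(sentence):
    """Score each content word against the sentence, highest similarity first."""
    words = content_words(sentence)
    target = sentence_vector(words)
    scores = [(w, cosine(TOY_VECTORS[w], target)) for w in words if w in TOY_VECTORS]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for word, score in rank_words("The quick brown fox jumps over the lazy dog."):
        print(f"{word:6s} {score:.2f}")
```

The word at the top of the printed list plays the role that 'fox' plays in the demonstration: it is the single word closest to the sentence as a whole under whatever similarity measure is plugged in.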

This is just a toy test, but it does illustrate one use to which Roistr's semantic relevance engine can be put.