Wednesday, January 16, 2013

Prolog and NLP

Prolog has been a fun language to use. It's how I imagined programming computers would be before I ever programmed: you tell it things (facts and rules) and then ask it questions to find answers.

Prolog used to be very common in natural language processing, but it is much less so now, with languages such as Python and Java taking most of the attention.

But, of course, if the only tool you have is a hammer, then every problem looks like a nail. What Prolog and other declarative languages can do is exceptional, but I don't see it as the entire solution. At Roistr, we've always said that Python is our mainstay language for creating statistical models of text. Computational linguistics has fallen somewhat out of favour lately, but I wondered how to mix the two together.

As an introduction, Prolog takes a query and returns true / yes (it follows from what it knows) or false / no (it doesn't). There are other possible responses, but we can worry about those some other time.

The concern I have is that Prolog doesn't have the grey areas that naturally occur in real life. My thought was this: what if we created a Prolog engine and supplied it with explicit facts and rules? Queries that rely solely on those facts and rules would be handled as normal, but the engine would also be able to deal with implicit knowledge by consulting a semantic relevance engine.

So if it were given facts about the things John owns but no mention of a car, the semantic relevance engine could step in and recognise that 'car' is almost synonymous with the 'automobile' fact that does appear.
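A rough Python sketch of that fallback might look like this. The `similarity` function and its scores are hard-coded placeholders standing in for a real semantic relevance engine, and the fact format is invented for illustration:

```python
# Sketch: matching a queried term against known facts via semantic similarity.
# The similarity scores below are placeholders, not real engine output.

PLACEHOLDER_SIMILARITY = {
    frozenset(["car", "automobile"]): 0.95,
    frozenset(["car", "slippers"]): 0.05,
}

def similarity(a, b):
    """Stand-in for a semantic relevance engine's word similarity."""
    if a == b:
        return 1.0
    return PLACEHOLDER_SIMILARITY.get(frozenset([a, b]), 0.0)

def owns(owner, thing, facts, threshold=0.8):
    """Return (verdict, score): an explicit match, an implicit
    near-synonym match above the threshold, or unknown."""
    for fact_owner, fact_thing in facts:
        if fact_owner != owner:
            continue
        score = similarity(thing, fact_thing)
        if score == 1.0:
            return ("explicit", 1.0)
        if score >= threshold:
            return ("implicit", score)
    return ("unknown", 0.0)

facts = [("john", "automobile"), ("john", "house")]
print(owns("john", "car", facts))    # ('implicit', 0.95)
print(owns("john", "house", facts))  # ('explicit', 1.0)
print(owns("john", "boat", facts))   # ('unknown', 0.0)
```

The threshold is the interesting design choice: set it too low and everything becomes a synonym of everything else; too high and you're back to plain Prolog.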

It could also handle unknown facts and infer rules from the semantic associations the engine provides. Say it's given a fact about John that says he's in France. When asked if he's in Europe, it would search its explicit knowledge base, which returns false (or, more correctly, unknown, because nothing explicitly states or rules that France is in Europe). The engine, though, would note the close association between the concepts 'France' and 'Europe' and allow the program to infer a likelihood.

So let's work this into an example:

?- in(john, europe).
Explicit: false
Implicit: true (0.87)
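A minimal Python sketch of that two-stage answer is below. The 0.87 association score is a made-up placeholder, as are the fact and query formats; a real implementation would call out to the semantic relevance engine instead:

```python
# Sketch: answer a location query with an explicit pass over the knowledge
# base, then an implicit pass via concept association. The association
# score is an invented placeholder for a semantic relevance engine.

EXPLICIT_FACTS = {("john", "france")}  # "John is in France"

PLACEHOLDER_ASSOCIATION = {("france", "europe"): 0.87}

def associated(a, b):
    """Stand-in for the semantic relevance engine."""
    return PLACEHOLDER_ASSOCIATION.get((a, b), 0.0)

def query(subject, place):
    # Stage 1: explicit knowledge base, Prolog-style closed world.
    if (subject, place) in EXPLICIT_FACTS:
        return {"explicit": True, "implicit": 1.0}
    # Stage 2: is the queried place closely associated with any
    # place the subject is explicitly known to be in?
    best = max(
        (associated(known_place, place)
         for subj, known_place in EXPLICIT_FACTS if subj == subject),
        default=0.0,
    )
    return {"explicit": False, "implicit": best}

print(query("john", "europe"))  # {'explicit': False, 'implicit': 0.87}
```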

So the semantic relevance engine would act as a back-up knowledge base, used to provide factual knowledge and inference. It's an odd thing to think about, and it needs a lot of work before the validity of the notion can be tested.

SEO Link Checking with Semantic Relevance

How can you tell if inbound or outbound links point to sites that Google will penalise you for? Is there a way to compare the content of sites that you link to, or sites that link to you, to see if they're damaging your rankings?

When a site links to you, it's easy to check manually to see if it's a decent site or not. After all, you just have to go into your SEO software and click on the link...

But this doesn't scale well. If you're an SEO consultant or agency with hundreds or thousands of sites, it's impossible to check them all.

Roistr's semantic relevance engine is a good candidate for this task. It can analyse your site and work out its overall 'gist'. It can even compare each page against this gist to make sure your content is reasonably consistent.

But it can also take other sites' content and compare it against yours. This way, you can see if your blog about immigration is being linked to by sites concerned with immigration - or if they're just trying to sell slippers.
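As a toy illustration of the comparison (Roistr's actual engine is more sophisticated; this uses a simple bag-of-words cosine similarity, and the site texts are invented):

```python
# Sketch: compare a linking site's text against your own site's text with
# bag-of-words cosine similarity. Placeholder texts, not real site content.
import math
import re
from collections import Counter

def bag_of_words(text):
    """Lowercase word counts; a crude stand-in for a document model."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine_similarity(a, b):
    common = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in common)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

my_site = "immigration policy visa asylum border immigration law"
linker_1 = "immigration law visa applications and asylum claims"
linker_2 = "buy cosy slippers cheap slippers free slipper shipping"

# The on-topic site scores well above the off-topic one.
for name, text in [("linker_1", linker_1), ("linker_2", linker_2)]:
    score = cosine_similarity(bag_of_words(my_site), bag_of_words(text))
    print(name, round(score, 2))
```

In practice you'd replace the raw word counts with a proper semantic model, so that 'immigration' and 'asylum seekers' register as related even without shared vocabulary.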

We're currently working on a nice demonstration: you'll be able to enter a URL and we'll compare it against some other sites so you can see whether they're on-topic for you or not.

And if it's any good, you can contact us to automate a link-checking solution for all your sites!