28 Mar 2017 Fifty shades of python

If I say “python”, what is the first image that comes to your mind? Do you imagine the sinuous shape of a giant snake or do you think of a programming language? Did the image of a luxury bag pop up on your retina or did you grin recalling the caustic humour of a surreal British comedy group?

The concept we associate first with a word clearly gives some indication of our background and interests. If your first thought was the programming language, you probably work in information technology, most likely close to software development. Because you live in an environment where the word “python” is repeatedly used in the context of programming, your mental semantic map has stored more utterances for this context than for any other. This is why the software sense is the first to come to your mind. If you are given more information in the form of a sentence like “This company is manufacturing truly chic, genuine python handbags”, another, less populated area of your semantic map will light up (and probably spark some kind of disapproving reaction).

Thanks to the semantic map stored in our brain, human beings are able to disambiguate word senses very easily. But what about computers? Most of them are just looking for keywords in texts, which means that they cannot differentiate between different contexts. You have to enter dozens of keywords in your system in order to refine search results. That’s not only tedious but also unreliable: you can never be sure that you have typed all necessary keywords to retrieve all relevant documents. Not to mention the flood of irrelevant hits you are drowning in.

What computers need to process the meaning of natural language is a kind of semantic map. At Cortical.io, this is exactly what we do every day: we build semantic maps for computers so that they can disambiguate word senses, identify contexts and understand the meaning hidden in unstructured text data.

The best thing is, you don’t need to be a software developer to see how it works. I cannot program a single line of code, yet I disambiguated the many contexts of the word “python” in just one click, using the Cortical.io online Topic Explorer demo.

Here are the results:

Click on the image to see the demo results.


With a single click, the Topic Explorer identifies a total of 7 contexts and lists the associated similar terms:

  • “Software”: functionality, implementations, mac os (software implementation context)

  • “Syntax”: perl, programmers (programming language context)

  • “Species”: mammals, lizards, monkeys (zoology context)

  • “Open source”: plugin, gui (open source context)

  • “Targets”: capability, long-range (military context)

  • “Shorts”: gag, monologue, nickelodeon (comedy context)

  • “Numbers”: integers, notation (variable types context)

I refined the results by playing around with expressions, using operators like “+” to add or “-” to subtract meaning. Think of it as a playful journey of semantic exploration: you remove or add a sense and see what happens. Will any new context appear at some point? It actually did: I discovered the context of amusement parks (roller coasters named Python) by using the “-” operator in the expression “python” - “species” - “series” - “targets”, and even the fashion context with “python” - “software” - “syntax” - “function” - “show” - “browser” - “target” - “numbers”. Note that the engine behind the Topic Explorer is a demo version, trained on a static edition of Wikipedia. The results would be different (better) with a commercial version, trained on a different corpus.
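For readers who do like code, here is a toy sketch of why subtracting senses can make new contexts appear. It is not Cortical.io's engine or API: the `SENSES` table, its weights, the `visible_contexts` function and the display limit of 7 are all assumptions made up for illustration. The idea is simply that if only the strongest contexts are shown, removing dominant ones leaves room for weaker ones to surface.

```python
# Hypothetical strengths for each sense of "python".
# The labels echo the article; the numbers are invented for illustration only.
SENSES = {
    "software": 30,
    "syntax": 25,
    "species": 20,
    "open source": 15,
    "targets": 10,
    "shorts": 8,
    "numbers": 6,
    "roller coasters": 3,
    "fashion": 2,
}


def visible_contexts(senses, subtract=(), limit=7):
    """Return up to `limit` strongest senses after removing the subtracted ones."""
    removed = set(subtract)
    remaining = {name: weight for name, weight in senses.items() if name not in removed}
    # Rank the remaining senses from strongest to weakest and keep the top ones.
    ranked = sorted(remaining, key=remaining.get, reverse=True)
    return ranked[:limit]


# Plain "python": the weaker senses stay hidden behind the dominant ones.
print(visible_contexts(SENSES))

# "python" - "species" - "shorts" - "targets": weaker senses now fit in the list.
print(visible_contexts(SENSES, subtract=["species", "shorts", "targets"]))
```

In this toy version, “roller coasters” and “fashion” only show up in the second call, which mirrors the experience of watching a new context appear in the demo after subtracting a few senses.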

The Topic Explorer demo gives a glimpse of the infinite possibilities offered by Cortical.io’s technology to explore the semantic shades of natural language. Some of the companies we work with use it to find information they could not access before; others to extract topics and, for example, better react to customers’ requests. Many of our customers had tried other approaches before – with little or no success.

If you, too, are convinced that there is more than black and white in your big text data but don’t know yet how to see the full color palette, then just get in touch!

Author: Marie-Pierre Garnier,
VP Marketing & Communications