Contexts

Natural Language Processing beyond statistics and linguistics

cortical.io’s approach is inspired by the latest findings on the way the human cortex works. “By mimicking the understanding process of the brain, we benefit from millions of years of evolutionary engineering to help us solve the hottest NLP challenges today,” explains Francisco Webber, inventor and co-founder.

Our technology, the Cortical Engine for Processing Text, breaks with traditional methods based on pure word-count statistics or linguistic rule engines.

Semantic Fingerprinting

The central component of our technology is called the Retina. It encodes words the same way sensory information is fed into the brain: as an SDR (sparse distributed representation). While traditional systems are based on counting words, cortical.io’s Retina uses a substantially finer-grained representation: 16,000 semantic features are captured for every term. The Retina then generates semantic fingerprints of language elements such as words, sentences, or whole documents. These semantic fingerprints help to identify the meaning behind natural language and allow direct computational comparison between any two pieces of text, as the sketch below illustrates.
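To make the comparison step concrete, here is a minimal sketch. It is an illustration, not cortical.io’s implementation: the toy fingerprints, the 16,384-position grid, the Jaccard overlap metric, and the aggregation rule are all assumptions made for demonstration purposes.

```python
from collections import Counter

def similarity(fp_a, fp_b):
    """Jaccard overlap: shared active positions / total active positions."""
    return len(fp_a & fp_b) / len(fp_a | fp_b)

# Toy fingerprints on an assumed 16,384-position grid: related words
# share many active positions, unrelated words share almost none.
dog = frozenset({17, 402, 901, 3288, 7141, 9005, 12330})
cat = frozenset({17, 402, 955, 3288, 7141, 9610, 12330})
car = frozenset({88, 560, 1204, 4470, 8032, 10977, 15001})

def text_fingerprint(word_fps, keep=7):
    """One way a text fingerprint could be aggregated from word
    fingerprints: keep the positions shared by the most words."""
    counts = Counter(p for fp in word_fps for p in fp)
    return frozenset(p for p, _ in counts.most_common(keep))

print(similarity(dog, cat))  # high overlap -> similar meaning
print(similarity(dog, car))  # zero overlap -> unrelated
print(similarity(text_fingerprint([dog, cat]), dog))  # texts compare the same way
```

Because a fingerprint is just a set of active positions, words, sentences, and whole documents can all be compared with the same overlap operation, whatever their length.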

Background

In recent years, a new branch of brain research has received more and more attention. “Computational Neuroscience” is concerned not so much with the biological and biochemical characteristics of neurons as with their computational logic. Highly ambitious projects like Henry Markram’s Blue Brain Project try to reveal the brain’s functioning by reconstructing it at the synapse level in the form of a computer model.

The Blue Brain Project tries to model what is called a cortical column, a repetitive structure observed in all mammalian brains. These columns were first described by Vernon B. Mountcastle as “hypercolumns”. A good starting point for a deeper look is Pasko Rakic’s paper “Confusions about cortical columns”.

During its evolution, the neocortex was heavily shaped by the needs of the visual system. The impact of this influence was so strong that one could think of the visual system as a guiding model (“Leitbild”) for the rest of the cortex. Tracing the visual system backwards reveals the role of structures like the thalamus, a gateway and switchboard for sensory inputs, and the hippocampus, possibly the highest level of a hierarchical memory system. The signals that represent information along the brain’s pathways also merit a closer look: the brain processes information using Sparse Distributed Representations. These SDRs are in fact the key to a better understanding of the brain’s computational approach, and their essential properties can be seen in the sketch below.
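The snippet below is a minimal sketch of those properties, not brain data or Numenta code; the 2,048-bit width and 2% sparsity are assumed values borrowed from the SDR literature.

```python
import random

N_BITS, N_ACTIVE = 2048, 40  # ~2% sparsity, a common choice in the SDR literature

def random_sdr(rng):
    """A random SDR: a small set of active bits out of a large space."""
    return frozenset(rng.sample(range(N_BITS), N_ACTIVE))

rng = random.Random(42)
a, b = random_sdr(rng), random_sdr(rng)

# Unrelated SDRs share almost no active bits, so any substantial
# overlap is strong evidence of a shared cause...
print("overlap of unrelated SDRs:", len(a & b))        # typically 0-2 bits

# ...and a pattern stays recognizable even when part of it is lost.
noisy_a = frozenset(rng.sample(sorted(a), 30))         # drop 10 of 40 bits
print("overlap with a noisy copy:", len(noisy_a & a))  # still 30 bits
```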


Jeff Hawkins is an essential source of inspiration for our work. In his book “On Intelligence”, he explains how the brain represents sensory input as Sparse Distributed Representations (SDRs).

We are active members of the NuPIC (Numenta Platform for Intelligent Computing) community, an open-source project that makes the Cortical Learning Algorithm (CLA) available to software developers.

The CLA is an online learning system modeled on how the neocortex performs tasks such as visual pattern recognition or understanding spoken language. Unlike traditional machine-learning systems, it does not require a conventional cycle of training and testing on fixed data sets: it learns continuously from streaming input. Like the brain, the CLA uses SDRs to store information; a conceptual sketch of this idea follows.
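The sketch below is loosely in the spirit of the CLA’s spatial pooling step. It is not the NuPIC API; every name and parameter in it is made up for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_columns, n_winners = 256, 128, 8

# Each column starts with random connection strengths ("permanences")
# to the input bits; learning nudges these strengths up or down.
permanences = rng.random((n_columns, n_inputs))

def learn_one_step(input_bits, inc=0.05, dec=0.02):
    """Process one input online: pick the best-matching columns and
    adjust only their synapses -- no separate training phase."""
    connected = (permanences > 0.5).astype(float)  # binary connectivity
    overlaps = connected @ input_bits              # match score per column
    winners = np.argsort(overlaps)[-n_winners:]    # sparse set of active columns
    # Hebbian-style update: strengthen synapses to active input bits,
    # weaken synapses to inactive ones.
    permanences[winners] += np.where(input_bits > 0, inc, -dec)
    np.clip(permanences, 0.0, 1.0, out=permanences)
    return winners                                 # the SDR for this input

x = (rng.random(n_inputs) < 0.05).astype(float)    # a sparse input pattern
print("active columns:", sorted(learn_one_step(x)))
```

Each input is processed exactly once, and learning happens as a side effect of processing: there is no separate pass over a fixed training set, which is the point made above.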

Related Sites:

Numenta