Sparse Distributed Representations

Semantic fingerprints are word-SDRs.

What is an SDR?

According to recent findings in neuroscience, the brain uses Sparse Distributed Representations (SDRs) to process information. This is true for all mammals, from mice to humans. An SDR represents the information processed by the brain at a given moment, with each active cell carrying some semantic aspect of the overall message.

Sparse means that only a few of the many (thousands of) neurons are active at the same time, in contrast to the typical “dense” representation used in computers, which packs information into a few bits of 0s and 1s.

Distributed means that not only are the active cells spread across the representation, but the significance of the pattern is too. This makes the SDR resilient to the failure of single neurons and allows sub-sampling.
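The resilience described above can be illustrated with a minimal sketch. It assumes an SDR is stored as the set of positions of its active bits; the pattern, the failed cells and the helper name `surviving_fraction` are made up for illustration and are not part of any Cortical.io API.

```python
from typing import AbstractSet


def surviving_fraction(sdr: AbstractSet[int], failed: AbstractSet[int]) -> float:
    """Fraction of the SDR's active bits that remain when some cells fail."""
    return len(sdr - failed) / len(sdr)


# Hypothetical SDR: 40 active bits spread over positions 0..1950.
pattern = frozenset(range(0, 2000, 50))

# Hypothetical failure of four cells that were part of the pattern.
failed = frozenset({0, 50, 100, 150})

# A second, unrelated hypothetical SDR that shares no active bit with the first.
unrelated = frozenset(range(25, 2000, 50))

print(surviving_fraction(pattern, failed))       # 0.9
print(len((pattern - failed) & pattern))         # 36 bits still match the original
print(len((pattern - failed) & unrelated))       # 0 bits match the unrelated SDR
```

Even with four cells lost, the damaged pattern still matches the original in 36 positions and the unrelated pattern in none, so it remains easy to tell the two apart. The same reasoning applies to sub-sampling, where only part of the active bits is looked at on purpose.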

Because each bit or neuron has a meaning, two SDRs in which the same bit is active share that semantic aspect; the more active bits they have in common, the more semantically similar they are. That is the key to our computational approach.
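As a rough sketch of this idea, the shared active bits of two SDRs can be counted directly. The three fingerprints below are tiny, invented examples (real fingerprints are far larger), and the function name `overlap` is an assumption chosen for illustration.

```python
from typing import AbstractSet


def overlap(a: AbstractSet[int], b: AbstractSet[int]) -> int:
    """Number of bit positions that are active in both SDRs."""
    return len(a & b)


# Hypothetical toy fingerprints, each given as the positions of its active bits.
dog = {3, 17, 42, 99, 256}
wolf = {3, 17, 42, 300, 512}
car = {7, 88, 99, 1024, 2048}

print(overlap(dog, wolf))  # 3
print(overlap(dog, car))   # 1
```

In this toy data "dog" and "wolf" share three active bits while "dog" and "car" share only one, so the first pair would be treated as the more closely related one.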

How Cortical.io uses SDRs

The researcher Jeff Hawkins uses SDRs to train a neuro-computational model called Hierarchical Temporal Memory or HTM. This HTM network recognises and memorises sequences of sparse bit-patterns, enabling the trained net to make predictions on what to expect next after a given input sequence.

The fundamental concept behind Cortical.io’s Retina is to convert the symbolic representation of language into an SDR form so that it becomes numerically computable. Language is first decomposed into its atoms, the words. The terms are then converted into SDRs by the Retina.

Every word is characterised by a bundle of 16,000 semantic features that capture the lexical semantics of the word in its natural context. Two words become similar in their SDR form if they are conceptually related. Semantic similarity can then be computed directly, even with a simple measure such as the Euclidean distance.
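The following sketch shows how such a distance could be computed for binary SDRs. It assumes each word is stored as the set of positions of its active bits within a vector of 16,000 features; the word fingerprints and function names are invented for illustration and do not reflect the Retina's actual API. For vectors of 0s and 1s, the Euclidean distance is simply the square root of the number of positions in which the two vectors differ.

```python
import math
from typing import AbstractSet, List

NUM_FEATURES = 16_000  # size taken from the text above


def euclidean_distance(a: AbstractSet[int], b: AbstractSet[int]) -> float:
    """Euclidean distance between two binary SDRs given as sets of active bits.

    Each position that is active in exactly one of the two SDRs
    contributes 1 to the squared distance.
    """
    return math.sqrt(len(a ^ b))


def to_dense(sdr: AbstractSet[int]) -> List[int]:
    """Expand a set of active positions into a full 0/1 vector."""
    return [1 if i in sdr else 0 for i in range(NUM_FEATURES)]


# Hypothetical toy fingerprints (positions of active bits).
apple = {3, 17, 42, 99, 256}
pear = {3, 17, 42, 300, 512}
truck = {7, 88, 99, 1024, 2048}

print(euclidean_distance(apple, pear))   # 2.0
print(euclidean_distance(apple, truck))  # about 2.83

# The sparse shortcut agrees with the distance on the full dense vectors.
assert math.isclose(
    euclidean_distance(apple, pear),
    math.dist(to_dense(apple), to_dense(pear)),
)
```

Working on the sets of active bits avoids touching the thousands of inactive positions, which is one practical benefit of the representation being sparse.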

An equivalent to the semantic vector model built by Cortical.io’s Retina can be found in Information Retrieval literature under the name of Word Space. This was first described by Hinrich Schütze; also see Distributional Semantics.

An excellent overview of the potential of the word-space model is given in Magnus Sahlgren’s dissertation, The Word-Space Model.

Background

Recently, a new aspect of brain research has received more and more attention. “Computational Neuroscience” is not so much about the biological and biochemical characteristics of neurons as about their computational logistics. Very ambitious projects like Henry Markram’s Blue Brain Project try to reveal the brain’s functioning by reconstructing it at the synapse level in the form of a computer model. The Blue Brain Project tries to model what is called a cortical column, a repetitive structure observed in all mammalian brains. These columns were first described by Vernon B. Mountcastle as “Hypercolumns”. A good starting point for a deeper look is Pasko Rakic’s paper Confusing cortical columns.

During its evolution, the neocortex was heavily driven by the needs of the visual system. This influence was so strong that one could think of the visual system as playing the role of a “Leitbild” (guiding model). Reverse-thinking the visual system shows the role of structures like the Thalamus as a gateway and switchboard for sensory inputs, or the Hippocampus as possibly the highest level of a hierarchical memory system. The signals representing information in the brain’s pathways also merit a closer look: the brain processes information using Sparse Distributed Representations. These SDRs are in fact the key to a better understanding of the brain’s computational approach.

We provide a growing list of articles and videos of interest related to the field of Computational Neuroscience.