
Sparse Distributed Representations (SDRs)

Semantic fingerprints are word-SDRs

What is an SDR? (Sparse Distributed Representation)

According to recent findings in neuroscience, the brain uses Sparse Distributed Representations (SDRs) to process information. This holds for all mammals, from mice to humans. An SDR encodes the information the brain is processing at a given moment, with each active cell carrying some semantic aspect of the overall message.

Sparse means that only a few of the many (thousands of) neurons are active at the same time, in contrast to the dense representations typical of computers, where information is packed into a few bits of 0s and 1s.
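As a minimal sketch of this idea (the vector size of 2,048 bits and the 2% density are illustrative assumptions, not Cortical.io's actual parameters), an SDR can be modeled as a long binary vector with only a handful of active positions:

```python
import numpy as np

rng = np.random.default_rng(42)

N = 2048  # total number of bits (illustrative size)
W = 40    # number of active bits, about 2% of N

# An SDR as a binary vector: mostly zeros, a few ones.
sdr = np.zeros(N, dtype=np.uint8)
active_indices = rng.choice(N, size=W, replace=False)
sdr[active_indices] = 1

print(f"sparsity: {sdr.sum() / N:.1%}")  # prints "sparsity: 2.0%"
```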

Distributed means that not only are the active cells spread across the representation, but the significance of the pattern is too. This makes the SDR resilient to the failure of single neurons and allows sub-sampling.
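A small illustration of sub-sampling (all sizes and the random seed are assumptions made for this sketch): even after discarding most of an SDR's active bits, the remainder still overlaps far more with the original pattern than with an unrelated one:

```python
import numpy as np

rng = np.random.default_rng(7)
N, W = 2048, 40  # illustrative vector size and number of active bits

def random_sdr():
    v = np.zeros(N, dtype=np.uint8)
    v[rng.choice(N, size=W, replace=False)] = 1
    return v

original = random_sdr()
unrelated = random_sdr()

# Sub-sample: keep only 15 of the original 40 active bits.
kept = rng.choice(np.flatnonzero(original), size=15, replace=False)
subsample = np.zeros(N, dtype=np.uint8)
subsample[kept] = 1

print("overlap with original: ", int(subsample @ original))   # 15
print("overlap with unrelated:", int(subsample @ unrelated))  # almost always 0 or 1
```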

Because each bit or neuron carries a meaning, two SDRs that share an active bit are semantically similar to that extent: this is the key to our computational approach.
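In code, this shared-bit notion of similarity is simply the overlap, i.e. the number of positions where both vectors have a 1. The word vectors below are hypothetical stand-ins, not actual Retina output:

```python
import numpy as np

N = 2048  # illustrative vector size

def sdr_from_bits(bits):
    v = np.zeros(N, dtype=np.uint8)
    v[list(bits)] = 1
    return v

# Hypothetical word SDRs: "dog" and "cat" share many active bits,
# while "carburetor" shares none with either.
dog        = sdr_from_bits(range(0, 40))
cat        = sdr_from_bits(list(range(10, 40)) + list(range(300, 310)))
carburetor = sdr_from_bits(range(1000, 1040))

def overlap(a, b):
    # Count the semantic features both SDRs have active.
    return int(np.count_nonzero(a & b))

print(overlap(dog, cat))         # 30 -> semantically related
print(overlap(dog, carburetor))  # 0  -> unrelated
```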

A new model for intelligent text processing

How Cortical.io uses SDRs

The researcher Jeff Hawkins uses SDRs to train a neuro-computational model called Hierarchical Temporal Memory (HTM). An HTM network recognizes and memorizes sequences of sparse bit patterns, enabling the trained network to predict what to expect next after a given input sequence.
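The sketch below is emphatically not the HTM algorithm; it is only a toy first-order sequence memory that conveys the basic idea of memorizing which sparse pattern followed which, and predicting the successor of a given input:

```python
# Toy sequence memory (NOT the HTM algorithm, just an illustration):
# memorize transitions between sparse patterns, then predict successors.
from typing import Dict, FrozenSet, List, Optional, Set

Pattern = FrozenSet[int]  # an SDR as the set of its active bit indices

class ToySequenceMemory:
    def __init__(self) -> None:
        self.transitions: Dict[Pattern, Pattern] = {}

    def train(self, sequence: List[Set[int]]) -> None:
        for current, nxt in zip(sequence, sequence[1:]):
            self.transitions[frozenset(current)] = frozenset(nxt)

    def predict(self, pattern: Set[int]) -> Optional[Pattern]:
        # Return the memorized successor, or None if the pattern is unknown.
        return self.transitions.get(frozenset(pattern))

memory = ToySequenceMemory()
A, B, C = {1, 7, 42}, {3, 9, 88}, {5, 11, 64}  # three tiny sparse patterns
memory.train([A, B, C])

print(memory.predict(A) == frozenset(B))  # True: after A, expect B
```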

The fundamental concept behind Cortical.io's Retina is to convert the symbolic representation of language into SDR form so that it becomes numerically computable. Language is first decomposed into its atoms, the words. These terms are then converted into SDRs by the Retina Engine.

Every word is characterized by a bundle of 16,000 semantic features that capture the lexical semantics of the word in its natural context. Two words are similar in their SDR form if they are conceptually related. Semantic similarity thus becomes directly measurable, even with a simple metric like the Euclidean distance.
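For binary vectors, the Euclidean distance is just a function of the overlap: with |A| and |B| active bits and an overlap of |A ∩ B|, the distance is d(A, B) = √(|A| + |B| − 2·|A ∩ B|). A minimal check (the word vectors here are hypothetical; real ones come from the Retina Engine):

```python
import numpy as np

N = 16_000  # the number of semantic features mentioned above

def word_sdr(active_bits):
    v = np.zeros(N, dtype=np.uint8)
    v[list(active_bits)] = 1
    return v

# Hypothetical word SDRs: "tiger" and "lion" share 120 of their
# 160 active features; "piano" shares none with "tiger".
tiger = word_sdr(range(0, 160))
lion  = word_sdr(list(range(40, 160)) + list(range(500, 540)))
piano = word_sdr(range(2000, 2160))

def euclidean(a, b):
    return float(np.linalg.norm(a.astype(np.int32) - b.astype(np.int32)))

print(euclidean(tiger, lion))   # sqrt(160 + 160 - 2*120) = sqrt(80) ~ 8.94
print(euclidean(tiger, piano))  # sqrt(320) ~ 17.89, much farther apart
```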

An equivalent of the semantic vector model built by Cortical.io's Retina can be found in the Information Retrieval literature under the name Word Space, first described by Hinrich Schütze; see also Distributional Semantics.

An excellent overview of the potential of the word-space model is given in Magnus Sahlgren’s dissertation entitled The Word-Space Model.

Background

Recently, a new aspect of brain research has been receiving more and more attention. “Computational Neuroscience” is concerned not so much with the biological and biochemical characteristics of neurons as with their computational logistics. Very ambitious projects like Henry Markram’s Blue Brain Project try to reveal the brain’s functioning by reconstructing it at the synapse level in the form of a computer model. The Blue Brain tries to model what is called a cortical column, a repetitive structure observed in all mammalian brains. These columns were first described by Vernon B. Mountcastle as “hypercolumns”. A good starting point for a deeper understanding is a paper by Pasko Rakic entitled Confusing cortical columns.

During its evolution, the neocortex was heavily shaped by the needs of the visual system. The impact of this influence was so strong that the visual system can be thought of as a “Leitbild” (a guiding model). Reverse-engineering the visual system reveals the role of structures like the thalamus, which acts as a gateway and switchboard for sensory inputs, or the hippocampus, possibly the highest level of a hierarchical memory system. The signals representing information in the brain’s pathways also merit a closer look: the brain processes information using Sparse Distributed Representations. These SDRs are in fact the key to a better understanding of the brain’s computational approach.

We provide a growing list of articles and videos on topics related to the field of Computational Neuroscience: go to our media resources.