#Sparse Distributed Representations
####Semantic Fingerprints are word-SDRs.

What is an SDR?

According to recent findings in neuroscience, the brain uses Sparse Distributed Representations (SDRs) to process information. This is true for all mammals, from mice to humans. An SDR represents the information the brain is processing at a given moment, with each active cell carrying some semantic aspect of the overall message.

Sparse means that only a few of the many (thousands of) neurons are active at the same time. This contrasts with the “dense” representation typical in computers, where information is packed into a small number of bits, each of which may be 0 or 1.

Distributed means that not only are the active cells spread across the representation, but the significance of the pattern is too. This makes the SDR resilient to the failure of single neurons and allows sub-sampling.
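
The resilience described above can be illustrated with a minimal sketch. The representation size (2048 bits), the number of active bits (40) and the helper names are assumptions chosen for illustration only; they are not taken from cortical.io or from any particular implementation. An SDR is modelled here as the set of positions of its active bits.

```python
import random

N_BITS = 2048    # assumed size of the representation
N_ACTIVE = 40    # assumed number of active bits (about 2%)

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def random_sdr():
    """Return a hypothetical SDR as the set of its active bit positions."""
    return set(rng.sample(range(N_BITS), N_ACTIVE))

def drop_bits(sdr, n_lost):
    """Simulate failed neurons (or sub-sampling) by keeping only some active bits."""
    return set(rng.sample(sorted(sdr), len(sdr) - n_lost))

original = random_sdr()
unrelated = random_sdr()
degraded = drop_bits(original, 10)  # a quarter of the active bits are lost

# Every surviving bit still belongs to the original pattern.
shared_with_original = len(degraded & original)
# Two unrelated sparse patterns share few or no bits purely by chance.
shared_with_unrelated = len(degraded & unrelated)

print(shared_with_original, shared_with_unrelated)
```

Even after losing a quarter of its active bits, the degraded pattern shares all 30 remaining bits with the original, while its chance overlap with an unrelated pattern stays small. This is why a sub-sample is usually enough to recognise the pattern.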

Because each bit or neuron has a meaning, a bit that is active in two SDRs indicates that they share a semantic aspect. The more active bits two SDRs have in common, the more semantically similar they are: that is the key to our computational approach.
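
As a small sketch of this idea, the overlap between two SDRs can be counted as the number of bit positions that are active in both. The three SDRs below are invented for illustration and are far smaller than real ones; the function name `overlap` is likewise an assumption, not part of any cortical.io API.

```python
def overlap(sdr_a, sdr_b):
    """Count the bit positions that are active in both SDRs."""
    return len(set(sdr_a) & set(sdr_b))

# Hypothetical SDRs, written as the positions of their active bits.
cat = {3, 17, 42, 98, 250}
dog = {3, 17, 42, 101, 377}
car = {5, 60, 98, 412, 900}

print(overlap(cat, dog))  # 3 shared bits: 3, 17 and 42
print(overlap(cat, car))  # 1 shared bit: 98
```

In this toy data, "cat" and "dog" share more active bits than "cat" and "car", so they would be treated as the more closely related pair.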

How cortical.io uses SDRs

The researcher Jeff Hawkins uses SDRs to train a neuro-computational model called the Cortical Learning Algorithm (CLA), formerly called “Hierarchical Temporal Memory” (HTM). A CLA network recognises and memorises sequences of sparse bit patterns, enabling the trained network to predict what to expect next after a given input sequence.

The fundamental concept behind cortical.io’s Retina is to convert the symbolic representation of language into SDR form so that it becomes numerically computable. Language is first decomposed into its atoms, the words. The Retina then converts these words into SDRs.

Every word is characterised by a bundle of 16,000 semantic features that capture the lexical semantics of the word in its natural context. Two words are similar in their SDR form if they are conceptually related. Semantic similarity thus becomes directly measurable, even with a simple measure such as the Euclidean distance.
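
A minimal sketch of such a comparison follows. It assumes that each word-SDR is a binary vector with one position per semantic feature; the active positions given for the three words are invented for illustration, and the helper names `to_vector` and `euclidean` are not part of the Retina. For binary vectors, the squared Euclidean distance is simply the number of positions in which the two vectors differ.

```python
import math

N_FEATURES = 16_000  # number of semantic features per word, as stated in the text

def to_vector(active_bits, size=N_FEATURES):
    """Expand a set of active bit positions into a binary vector of 0s and 1s."""
    vec = [0] * size
    for i in active_bits:
        vec[i] = 1
    return vec

def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical word-SDRs (real ones would have far more active bits).
cat = to_vector({3, 17, 42, 98, 250})
dog = to_vector({3, 17, 42, 101, 377})
car = to_vector({5, 60, 98, 412, 900})

print(euclidean(cat, dog))  # 2.0, because the vectors differ in 4 positions
print(euclidean(cat, dog) < euclidean(cat, car))  # True
```

A smaller distance means more shared features, so in this toy data "dog" lies closer to "cat" than "car" does.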

An equivalent to the semantic vector model built by cortical.io’s Retina can be found in the Information Retrieval literature under the name Word Space. It was first described by Hinrich Schütze; see also his Distributional Semantics.

An excellent overview of the potential of the word-space model is given in Magnus Sahlgren’s dissertation, The Word-Space Model.
