Cortical.io has wrapped its Retina Engine into an easy-to-use, powerful platform for fast semantic search, semantic classification and semantic filtering.
The Retina Platform is an ecosystem that offers access to the core Retina Engine via the Retina Library (formerly called Retina Spark) and the Retina API. Both enable fundamental operations on text like text comparison, keyword extraction, segmentation or context identification. The Retina Library includes additional modules for semantic search, dynamic generation of classifiers and Big Data semantic processing.
Both Retina API and Retina Library are provided with a default database for general English which can be replaced with any other language or domain-specific semantic space.
Retina API and Retina Library serve as a central resource for a multitude of different applications that elegantly solve difficult NLU tasks where other approaches struggle.
The latest IDC Innovator research report, IDC Innovators: Machine Learning-Based Text Analytics, 2016 (Doc # US41312116, May 2016) recognizes Cortical.io as an IDC Innovator.
Cortical.io’s approach to understanding language, Semantic Folding, is inspired by the latest findings on the way the brain processes information. It proposes a statistics-free processing model based on semantic fingerprints, a new data representation that encodes meaning explicitly, including all senses and contexts. Cortical.io’s Retina engine learns any language by ingesting relevant text content via unsupervised learning.
Cortical.io’s Retina engine reduces complex text analytics to the application of a simple similarity function. This makes the system both highly scalable and very intuitive to use. With 16,000 semantic features, the Retina engine performs fine-grained semantic analysis of any unstructured text and is completely language independent. Because it is highly efficient, the Retina engine drastically reduces the computing power needed to process terabytes of data. It enables high-performance semantic processing applications such as on-the-fly classification, streaming text filtering or semantic search by analogy, and opens up a new range of applications for social media monitoring, enterprise search, forensic text analytics, information discovery, compliance monitoring and much more.
IDC Innovators reports present a set of vendors chosen by an IDC analyst within a specific market that offer an innovative new technology, a groundbreaking approach to an existing issue, and/or an interesting new business model. It is not an exhaustive evaluation of all companies in a segment or a comparative ranking of the companies. IDC INNOVATOR and IDC INNOVATORS are trademarks of International Data Group, Inc.
A recent academic study conducted by researchers from Leiden, Ben-Gurion and Toulouse Universities examined the performance of Cortical.io’s Semantic Folding approach for content analysis in a finance setting. Compared to the commonly used word-list method, Semantic Folding proved to have greater predictive power. Its other advantages were speed and ease of use.
“Like the human brain, our Semantic Folding engine learns a language and understands the meaning of text by making analogies. Like the brain, it is both efficient and accurate. We are thrilled to see these compelling results confirmed by an independent academic study”, comments Francisco Webber, inventor and co-founder of Cortical.io.
The research team used Cortical.io’s Retina API to create semantic fingerprints of the 30 Dow Jones Industrial Average constituents, based on business description sections of the companies’ annual reports. For each pair of companies, the similarity of their semantic fingerprints was compared to predict correlations between their stock returns over the following year.
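The pairwise comparison described above can be sketched as follows. This is a minimal illustration and not the study's own code: the tickers and bit positions are invented, fingerprints are modelled as sets of active bit positions, and cosine similarity on binary vectors is an assumed stand-in for whatever similarity measure the Retina API returns.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, FrozenSet, Tuple

Fingerprint = FrozenSet[int]  # positions of the active bits in a sparse binary vector


def cosine(a: Fingerprint, b: Fingerprint) -> float:
    """Cosine similarity of two binary vectors given as sets of active positions."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def pairwise_similarity(
    fingerprints: Dict[str, Fingerprint],
) -> Dict[Tuple[str, str], float]:
    """Similarity for every unordered pair of companies, keyed by a sorted ticker pair."""
    return {
        (x, y): cosine(fingerprints[x], fingerprints[y])
        for x, y in combinations(sorted(fingerprints), 2)
    }


# Hypothetical tickers and bit positions, for illustration only.
toy = {
    "AAA": frozenset({1, 2, 3, 4}),
    "BBB": frozenset({3, 4, 5, 6}),
    "CCC": frozenset({7, 8, 9, 10}),
}
scores = pairwise_similarity(toy)
```

With 30 constituents this produces 435 pairs, each of which can then be set against the realised return correlation of the same pair.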
The study found Semantic Folding to have greater predictive power than the traditional word-list based approach. Moreover, fingerprint similarity continued to significantly predict stock return correlations even when other measures of company similarity were controlled for.
The authors contend that Semantic Folding is simpler to use, has lower setup costs, and runs faster than the standard word-list based method. In addition, semantic fingerprints were considered to have an appealing visual interpretation. The authors argue that Semantic Folding significantly lowers the entry barriers for investigators interested in applying content analysis to financial data. To this end, the study includes sample code and suggests possible applications of Cortical.io’s Semantic Folding engine in several finance contexts.
The study, entitled “Using Semantic Fingerprinting in Finance”, is available here.
Cortical.io presents Retina Spark 2.0, an NLP tool specially designed for high performance semantic text processing in an Apache Spark environment. Similar to the Retina API, it operates on the semantic rather than the keyword level and measures the similarity in meaning between text passages in order to classify, filter and search large document repositories.
Retina Spark 2.0 enables the creation of:
Retina Spark 2.0 is a library that augments Spark MLlib with high-performance semantic text processing capabilities. It is Cloudera certified and can be used with on-premise or in-the-cloud Spark clusters, including those based on the Cloudera and Amazon EMR distributions. Retina Spark 2.0 supports the latest Apache Spark releases and features a Java and Scala API.
Apache Spark is an open source framework and runtime environment for distributed and parallel computing.
It is common scientific practice to apply statistical methods to phenomena that cannot be explained by an existing set of theories. This is how medical research has led to coherent treatment procedures that have been of great use to patients. By observing many cases of a disease and by identifying and accounting for its various cause-and-effect relationships, researchers could evaluate these records statistically, make thoughtful predictions and find adequate treatments as countermeasures. Nevertheless, since the rise of molecular biology and genetics, we can observe how medical science is moving from the time-consuming trial-and-error strategy to a much more efficient, deterministic procedure that is grounded in solid theories and will eventually lead to fully personalized medicine.
The science of language has developed in a very similar way. In the beginning, extensive statistical analyses led to a good analytical understanding of the nature and the functioning of human language and culminated in the discipline of linguistics. With the increasing involvement of computer science in the field of linguistics, it turned out that the observed linguistic rules were extremely hard to use for the computational interpretation of language. In order to allow computer systems to perform language-based tasks at a level comparable to humans, a computational theory of language was needed. As no such theory was available, research turned again towards a statistical approach by creating various computational language models derived from simple word count statistics. Although there were initial successes, statistical Natural Language Processing (NLP) suffers from two main flaws: the achievable precision is always lower than that of humans, and the algorithmic frameworks are chronically inefficient.
The Semantic Folding Theory (SFT) is an attempt to develop an alternative computational theory for the processing of language data. While nearly all current methods of processing natural language based on its meaning use word statistics in some form or other, Semantic Folding uses a neuroscience-rooted mechanism of distributional semantics. After capturing a given semantic universe of a reference set of documents by means of a fully unsupervised mechanism, the resulting semantic space is folded into each and every word-representation vector. These vectors are large, sparsely filled binary vectors. Every feature bit in this vector not only corresponds to but also equals a specific semantic feature of the folded-in semantic space and is therefore semantically grounded. The resulting word vectors fully conform to the requirements for valid word-SDRs (Sparse Distributed Representations) in the context of the Hierarchical Temporal Memory (HTM) theory by Jeff Hawkins. While the HTM theory focuses on the cortical mechanism for identifying, memorizing and predicting recurring sequences of SDR patterns, the Semantic Folding theory describes the encoding mechanism that converts semantic input data into a valid SDR format, directly usable by HTM networks.
The main advantage of using the SDR format is that it allows any data-items to be directly compared. In fact, it turns out that by applying Boolean operators and a similarity function, many Natural Language Processing operations can be implemented in a very elegant and efficient way.
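As a minimal sketch of how Boolean operators and a similarity function combine on sparse binary vectors, a fingerprint can be modelled as the set of its active bit positions. The words, bit positions and the choice of the Jaccard measure below are assumptions made for illustration; they are not taken from the Retina engine.

```python
from typing import FrozenSet

Fingerprint = FrozenSet[int]  # indices of the active bits in a sparse binary vector


def overlap(a: Fingerprint, b: Fingerprint) -> int:
    """Number of bits active in both fingerprints (Boolean AND, then count)."""
    return len(a & b)


def jaccard(a: Fingerprint, b: Fingerprint) -> float:
    """Shared active bits divided by all active bits; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy fingerprints; the bit positions are invented for illustration only.
jaguar = frozenset({3, 17, 42, 108, 256, 900})
car = frozenset({17, 42, 300, 512})
cat = frozenset({3, 108, 256, 777})

animal_sense = jaguar & cat        # AND: bits shared with "cat"
remaining = jaguar - animal_sense  # AND NOT: what is left once that sense is removed
merged = car | cat                 # OR: one combined fingerprint for several terms
```

In this toy data, removing the bits shared with "cat" leaves a fingerprint that still overlaps with "car" but no longer with "cat", which is the kind of sense separation the set operations make cheap to express.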
Douglas R. Hofstadter’s Analogy as the Core of Cognition is a rich source for theoretical background on mental computation by analogy. In order to allow the brain to make sense of the world by identifying and applying analogies, all input data must be presented to the neo-cortex as a representation that is suited for the application of a distance measure.
The two faculties - making analogies and making predictions based on previous experiences - seem to be essential and could even be sufficient for the emergence of human-like intelligence.
Cortical.io, an innovator in natural language processing (NLP), announces its next venture capital round. In this third
round, Cortical.io opens its capital to a new investor from the US, a fund affiliated with Open Field Capital (OFC), an
investment manager with a focus on emerging technology markets. After Numenta, OFC is the second US-based investor taking
an ownership position in Cortical.io. Reventon (NL) confirms its interest in the machine intelligence start-up with an
additional participation, bringing the capital increase to a total of USD 1.8 million.
“Cortical.io’s approach of using similarity as a foundation for intelligence should enable an NLP technology that not
only outperforms legacy systems in traditional text processing, but also opens a new range of applications that were not
possible before, because it has the potential to eliminate constraints related to processing speed, data volume and the
diversity of natural languages. This is exactly the kind of market disruption potential that we seek in an investment”,
says Marc Weiss, Principal at OFC.
Together with the third capital round, Cortical.io announces the opening of an office in the San Francisco Bay Area,
where its sales and business development activities will be based. “North America is a core market for intelligent text
analytics”, explains Francisco Webber, CEO and co-founder of Cortical.io. “There is a lot of value still hidden in Big
Text Data. While our technology can be applied to many different business cases, its algorithmic efficiency has triggered
strong interest from the financial industry. In the context of compliance monitoring, for example, the high precision and
recall scores help to substantially reduce the associated workload”, adds Webber before concluding: “Our solution could
help banks save billions in legal bills.”
Cortical.io, an innovator in natural language processing (NLP), announces the availability of its Retina engine
in the Microsoft Azure Marketplace. Based on an advanced proprietary machine intelligence algorithm, the Cortical.io
Retina engine encompasses a wide range of highly efficient NLP tools for text filtering, classification, clustering or
searching, that work across languages and in real time. It enables direct semantic comparisons of text, making complex NLP
operations stunningly simple. Now the Cortical.io Retina engine is available to developers as a REST API on the
Microsoft Cloud Computing Platform.
“Our customers typically look for a flexible, highly scalable platform to integrate the Cortical.io Retina engine into
their own application”, explains Francisco Webber, co-founder of Cortical.io. “Whether they have developed the ultimate
news filter that must analyze terabytes of data in real time, or a tool that automatically categorizes product descriptions
independently of the language used, they can now handle any workload, anywhere in the world.”
“Because it is highly efficient, the Cortical.io Retina engine makes it possible to tame the Big Text Data challenges
in terms of volume and speed. We are excited to see what kind of intelligent applications for text analytics the Azure
community will come up with,” said Nicole Herskowitz, Senior Director of Product Marketing, Microsoft Azure.
The Cortical.io Retina engine offers a fundamentally new approach to handling Big Data originating from unstructured text
sources. It can be easily integrated to develop the next generation of business applications for social media monitoring,
enterprise search, information discovery or profile matching, with unmatched quality.
Numenta, Inc., a leader in machine intelligence,
and Cortical.io, an innovator in natural language processing (NLP), are pleased to announce a strategic
partnership to create a new computing approach to understanding text. As part of the strategic relationship,
Cortical.io has taken a broad general license to Numenta’s Hierarchical Temporal Memory (HTM) technology, and
Numenta has taken an ownership position in Cortical.io. The combination of Cortical.io’s Semantic Folding
technology and Numenta’s HTM technology enables a host of exciting applications that have challenged computer
scientists for decades, including sentiment analysis, automatic summarization, semantic search, and
conversational dialogue systems.
“Cortical.io’s Semantic Folding technology is a clever and elegant way to feed natural language into our HTM
technology”, said Jeff Hawkins, founder of Numenta. “Cortical.io takes advantage of the semantic encoding and
predictive modeling of HTM systems in a way that will lead to significant advances in natural language processing.”
“Natural language understanding is one of the central problems of artificial intelligence,” said Francisco
Webber, founder and CEO of Cortical.io. “We aim to build the next generation of NLP, Language Intelligence, and
in so doing, show the path to broadly applied machine intelligence.”
Building on their existing commercial product, the Retina API, Cortical.io will make the combined technologies
available through their industrial-grade cloud service for customers ranging from innovative startups to
The Austrian science start-up Cortical.io has just secured a new venture round of USD 1.25 million in growth
capital from Reventon (NL). This will help to bring Cortical.io’s portfolio of language intelligence products to the
global app-builder and enterprise market.
Based on the breakthrough neuroscience theory of Jeff Hawkins, Cortical.io’s semantic fingerprinting technology
represents language the way the human brain does.
Cortical.io’s Retina API makes it possible to create semantic fingerprints of any piece of text in any language. Fingerprints of
product descriptions in English can be compared to LinkedIn profiles in German, documents in Spanish can be compared to the reading
preferences of French readers, multi-language Twitter messages can be filtered by their content, and job profiles can be matched to CVs.
With Cortical.io’s Retina, everything that can be described in words can be intelligently matched based on its meaning.
We have developed a technology that enables developers to perform natural language processing in an intuitive and precise
manner. “Our Semantic Fingerprinting method enables the creation of a unique semantic fingerprint for any word, any
document, and in the near future even for any entity that can be described with natural language”, explains Francisco
Webber, co-founder of Cortical.io. The big difference from conventional semantic systems is that the conversion of words
into their semantic fingerprints is automated. There is no need for costly, time-consuming manual intervention anymore.
The core component of Cortical.io, the Retina, learns about the
essence of any language by reading text material about the world and is capable of semantically interpreting and computing
any textual content. It encodes words in the same way as information is fed into the brain and generates semantic
fingerprints of words and documents using a fine-grained representation of 16,000 semantic features for every term.
The invention of Cortical.io’s Retina could revolutionize the search
and analysis of text-based information, not only because of its transparency and simplicity of use, but also because of its
small footprint: huge amounts of text, structured and unstructured, can be processed with moderate computational power.
By converting any piece of text into a semantic fingerprint, tasks such as similarity comparison, contextual keyword
generation, sense disambiguation, and document classification are made simple. Cortical.io’s Semantic Fingerprinting method can be applied to messages, news, web content, document
collections and even real-time text streams from social networks.
The Cortical.io service is accessible through a REST API. Currently,
With the new API release, you can easily select different Retinas and get the most adequate results, whether you want to focus on context-similarity or synonym-similarity. If your goal is to disambiguate terms, i.e. identify which meanings are contained within a specific term, you will want to use an associatively focused Retina (“en_associative” in our API). If you prefer identifying synonymous items for terms or texts, then a synonymously focused Retina (“en_synonymous” in our API) will deliver better results.
More details about the two Retinas are available in our FAQs.
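The choice between the two Retinas can be sketched as a small helper that maps a goal to a Retina name and builds a request URL. Only the names "en_associative" and "en_synonymous" come from the text above; the host, the path, the `retina_name` query parameter and the goal labels are hypothetical placeholders, so the actual API documentation should be consulted for the real request format.

```python
from urllib.parse import urlencode

# Hypothetical placeholder, not the real service address.
BASE_URL = "https://retina.example.invalid/rest"

# Goal labels are invented; the Retina names are the ones quoted in the text.
RETINAS = {
    "disambiguate": "en_associative",  # context-similarity
    "synonyms": "en_synonymous",       # synonym-similarity
}


def retina_for(goal: str) -> str:
    """Return the Retina name suited to the given goal."""
    try:
        return RETINAS[goal]
    except KeyError:
        raise ValueError(f"unknown goal {goal!r}; expected one of {sorted(RETINAS)}")


def build_url(endpoint: str, goal: str, **params: str) -> str:
    """Build a request URL; the 'retina_name' parameter name is an assumption."""
    query = urlencode({"retina_name": retina_for(goal), **params})
    return f"{BASE_URL}/{endpoint}?{query}"


url = build_url("terms", "disambiguate", term="jaguar")
```

Keeping the mapping in one place means an application can switch between context-focused and synonym-focused results by changing a single argument.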