Lexicographers have a hard time these days. With the rise of social media, they are confronted with the appearance of new terms almost in real time. As they observe search trends in online dictionaries, they must react quickly. Some of these new terms are easy to define – bitcoin, blockchain, phishing: articles that deliver contextual information surface across the net like mushrooms on the damp soil of a forest, helping lexicographers capture the various semantic flavors in a set of definitions. But what about words coined by a pundit in the frenzy of a tweet, words that leave you wondering whether they are just a misspelling or a genuine word creation? Should lexicographers ignore them, despite the tens of thousands of lookups?
What makes a word worthy of being added to a dictionary?
Its frequency of use or the interest it generates in people? The lexical experts behind Dictionary.com lean towards the latter criterion.
In a business context, the fact that language evolves rapidly – new vocabulary appearing, meanings shifting, words becoming obsolete – constitutes a technical challenge that has significant economic implications.
Chief information officers want to give their companies a competitive edge by mining the mountains of data they have accumulated like squirrels before the winter. To this end, computers are patiently trained, at huge cost, to recognize patterns and provide insights. The problem is that the training material captures the state of language at a given instant. Soon after the system has been trained, parts of that material may already be obsolete, and the system will perform poorly until it is retrained.
In state-of-the-art machine learning, batch learning techniques predominate: when new vocabulary is to be added, machines are retrained on completely new data sets. This is mainly because attempts to adapt to dynamic environments through online machine learning often suffer from a phenomenon called catastrophic forgetting: when trained on new data, the system loses much of what it had already learned. The Retina Engine needs only a few hours to incorporate new training material, but other machine learning approaches take days or even weeks to process the whole corpus again. Consequently, one of the big questions data scientists struggle with is:
When and how should my model be adapted?
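To make the contrast concrete, here is a minimal sketch of online learning in Python. The components – scikit-learn's HashingVectorizer and SGDClassifier with partial_fit, plus invented ticket categories – are my own illustrative assumptions, not parts of the Retina Engine: the model is updated mini-batch by mini-batch instead of being retrained on the full corpus, and the stateless hashing vectorizer means unseen words never force the feature extractor to be refitted.

```python
# Illustrative only: generic online learning with scikit-learn, not the Retina Engine.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

vectorizer = HashingVectorizer(n_features=2**18)  # stateless: new words need no refitting
model = SGDClassifier()                           # supports incremental updates
CLASSES = ["billing", "technical"]                # hypothetical ticket categories

def update(texts, labels):
    """Fold a new mini-batch of documents into the existing model."""
    X = vectorizer.transform(texts)
    model.partial_fit(X, labels, classes=CLASSES)

# Initial training batch
update(["invoice is wrong", "cannot log in"], ["billing", "technical"])
# Weeks later, documents with new vocabulary arrive – no retraining from scratch needed
update(["bitcoin payment failed", "phishing email reported"], ["billing", "technical"])
```

Even a simple incremental scheme like this, however, can gradually overwrite what was learned from earlier batches – which is exactly the catastrophic forgetting problem described above.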
Now, think about it for a second. When you come across a new piece of information, do you ask yourself whether you should assimilate this new knowledge now, later, or not at all? I guess not. You probably just take it in and move on. It is an intuitive process, a process that characterizes natural intelligence.
What if machines could learn in the same elegant, intuitive way as our brain?
What sounds like a dream might not be impossible, as we discovered in a recent project. Our client, a Fortune 100 company with support call centers all over the world, employs hundreds of agents who receive thousands of support requests from customers every day. The agents’ main goal is to solve each support request as quickly as possible to ensure the highest customer satisfaction. They try to leverage the information contained in previous support cases to resolve new requests quickly, but their system often delivers inaccurate results: the search engine is confused by differences between the terminology customers use to describe problems and the terminology engineers use to describe solutions, and it does not understand terms it has never seen before. The IT department makes a huge effort to update the system, retraining it regularly to adapt it to new vocabulary, but this approach not only swallows millions of dollars, it is also of limited benefit: new terms appear very soon after the system has been retrained.
When this company incorporated the Cortical.io Retina technology into its workflow, its engineers discovered a system that identifies new vocabulary automatically, even at runtime, without the whole model having to be retrained. Adding new vocabulary is very simple and mirrors the way we humans assimilate terms. As documents are added or updated, each occurrence of a new term is recorded together with its surrounding contexts. When enough occurrences of a term have been recorded, the Retina Engine uses the associated contexts to numerically encode the meaning of the term. The encoding is called the term’s semantic fingerprint, and, since it is created at runtime rather than during one of the periodic model-training sessions, it is known as a provisional semantic fingerprint. Provisional fingerprints serve as good approximations of the meaning of new terms until the system is retrained.
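The post does not spell out the encoding itself, so the sketch below is only a hypothetical illustration of the bookkeeping described above, with invented names and thresholds: the contexts of an unknown term are collected as it is encountered, and once a minimum number of occurrences has been seen, a provisional fingerprint is approximated by aggregating the sparse binary fingerprints of the surrounding known terms.

```python
# Hypothetical sketch of provisional semantic fingerprints – invented names and
# thresholds, not Cortical.io's actual implementation.
from collections import Counter, defaultdict

MIN_OCCURRENCES = 5        # assumed threshold before a provisional fingerprint is built
TARGET_ACTIVE_BITS = 328   # assumed sparsity, roughly 2% of a 128 x 128 fingerprint grid

known_fingerprints = {}                 # term -> set of active bit positions (from training)
pending_contexts = defaultdict(list)    # unknown term -> list of observed context word lists

def observe(term, context_words):
    """Record one occurrence of an unknown term together with its surrounding words."""
    pending_contexts[term].append(context_words)
    if len(pending_contexts[term]) >= MIN_OCCURRENCES:
        return build_provisional_fingerprint(term)
    return None  # not enough evidence yet

def build_provisional_fingerprint(term):
    """Approximate the term's fingerprint from the fingerprints of its context terms."""
    bit_counts = Counter()
    for context in pending_contexts[term]:
        for word in context:
            bit_counts.update(known_fingerprints.get(word, set()))
    # Keep the bit positions that recur most often across contexts, preserving sparsity.
    provisional = {bit for bit, _ in bit_counts.most_common(TARGET_ACTIVE_BITS)}
    known_fingerprints[term] = provisional  # usable until the next full retraining
    return provisional
```

In this toy version, a term becomes usable as soon as it has been seen five times, and its provisional fingerprint simply keeps the context bit positions that recur most often; the real engine presumably applies its own thresholds and a more principled aggregation.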
There are many good reasons why this major multinational company decided to deploy Cortical.io’s Retina technology. The fact that the time call center agents need to solve a support request was reduced by nearly 70% is certainly not the least of them. But I’d say that what impressed them most was the system’s ability to adapt smoothly to a changing environment, because it comes so close to true intelligence.
Our world, like our language, is constantly evolving. To add real value to our own intelligence, intelligent machines will need two fundamental attributes: versatility and adaptiveness.