Enterprises typically have countless data buckets to wrangle (upwards of 93% say they’re storing data in more than one place), and some of those buckets invariably become underused or forgotten. A Forrester survey found that between 60% and 73% of all data within corporations is never analyzed for insights or larger trends, while a separate Veritas report found that 52% of all information stored by organizations is of unknown value. The opportunity cost of this unused data is substantial — the Veritas report pegs it as a cumulative $3.3 trillion by the year 2020, if the current trend holds.
That’s perhaps why this year saw renewed interest from the corporate sector in AI-powered software-as-a-service (SaaS) products that ingest, understand, organize, and query digital content from multiple sources. “Keyword-based enterprise search engines of the past are obsolete. Cognitive search is the new generation of enterprise search that uses [AI] to return results that are more relevant to the user or embedded in an application issuing the search query,” wrote Forrester analysts Mike Gualtieri, Srividya Sridharan, and Emily Miller in a comprehensive survey of the industry published in 2017.
Microsoft kicked the segment into overdrive in early November by launching Project Cortex, a service that taps AI to automatically classify and analyze an organization’s documents, conversations, meetings, and videos. It’s in some ways a direct response to Google Cloud Search, which launched in July 2018. Like Project Cortex, Cloud Search pulls in data from a range of third-party products and services running both on-premises and in the cloud, relying on machine learning to deliver query suggestions and surface the most relevant results. Not to be outdone, Amazon last week unveiled AWS Kendra, which taps a library of connectors to unify data sources, including file systems, websites, Box, Dropbox, Salesforce, SharePoint, relational databases, and more.
Of course, Google, Amazon, and Microsoft aren’t the only cognitive search vendors on the block. There’s IBM, which offers a data indexing and query processing service dubbed Watson Explorer, and Coveo, which uses AI to learn users’ behaviors and return the results most relevant to them. Hewlett Packard Enterprise’s IDOL platform supports analytics for speech, images, and video, in addition to unstructured text. And both Lucidworks and Squirro leverage open source projects like Apache Solr and Elasticsearch to make sense of disparate data sets.
The cognitive search market is exploding: it’s anticipated to be worth $15.28 billion by 2023, up from $2.59 billion in 2018, according to MarketsandMarkets. That growth coincides with an upswing in the adoption of AI and machine learning in the enterprise, but it’s perhaps more directly attributable to the wealth of telemetry afforded by modern corporate digital environments.
AI under the hood
AI models like those at the heart of AWS Kendra, Project Cortex, and Cloud Search learn from signals, or behavioral data derived from various inputs. These signals come from the web pages employees visit, the videos they watch online, their online chats with support agents, and public databases of support tickets. That’s not to mention detailed information about users, including job titles, locations, departments, coworkers, and potentially all of the documents, emails, and other correspondence they author.
Each signal informs an AI system’s decision-making such that it improves practically continuously, automatically learning how relevant various resources are to each person and ranking those resources accordingly. Plus, because enterprises have far fewer data sources to contend with than, say, a web search engine, the models are less expensive and less time-consuming to train.
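To make the idea of signal-driven ranking concrete, here is a minimal sketch in Python. It is not how AWS Kendra, Project Cortex, or Cloud Search actually work; the `SignalRanker` class, its `signal_weight` parameter, and the file names are all hypothetical, and the only signal it considers is which documents a given user has opened before.

```python
from collections import defaultdict


class SignalRanker:
    """Toy re-ranker (hypothetical): blends a base relevance score
    with a per-user boost derived from past clicks."""

    def __init__(self, signal_weight=0.5):
        # How strongly behavioral signals shift the base score (assumed value).
        self.signal_weight = signal_weight
        # clicks[user][doc_id] -> number of times the user opened the document
        self.clicks = defaultdict(lambda: defaultdict(int))

    def record_click(self, user, doc_id):
        """Store one behavioral signal; rankings change on the next query."""
        self.clicks[user][doc_id] += 1

    def rank(self, user, base_scores):
        """Return doc ids ordered by base score plus the user's click share."""
        user_clicks = self.clicks[user]
        total = sum(user_clicks.values())

        def score(doc_id):
            share = user_clicks.get(doc_id, 0) / total if total else 0.0
            return base_scores[doc_id] + self.signal_weight * share

        return sorted(base_scores, key=score, reverse=True)


ranker = SignalRanker()
base = {"benefits.pdf": 0.6, "401k-guide.docx": 0.5}
before = ranker.rank("ana", base)
ranker.record_click("ana", "401k-guide.docx")
after = ranker.rank("ana", base)
other_user = ranker.rank("ben", base)
```

In this sketch a single click is enough to move the 401k guide to the top for one user while leaving another user's results unchanged, which is the per-person behavior described above. A production system would presumably weigh many more signals and learn the weights rather than hard-code them.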
The other piece of the puzzle is natural language processing (NLP), which enables platforms like AWS Kendra to understand not only the minutiae of documents but also the search queries that employees across an organization might pose, such as “How do I invest in our company’s 401k?” versus “What are the best options for my 401k plan?”
Not every platform is equally capable in this regard, but most incorporate emerging techniques in NLP, as well as the adjacent field of natural language search (NLS). NLS is a specialized application of AI and statistical reasoning that creates a “word mesh” from free-flowing text, akin to a knowledge graph, to connect similar concepts that are related to larger ideas. NLS systems understand context in this way, meaning they’ll return the same answer regardless of how a query is phrased and will take users to the exact spot in a record where that answer is likely to be found.
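As a rough illustration of the "word mesh" idea, the Python sketch below maps differently worded queries onto shared concepts and then points to the passage in a record that covers them. The `WORD_MESH` table, the `concepts` and `best_passage` helpers, and the sample passages are assumptions made up for this example; real NLS systems build far richer graphs statistically rather than from a hand-written dictionary.

```python
import re

# Hypothetical "word mesh": surface words mapped to the concept they express.
WORD_MESH = {
    "401k": "retirement_plan",
    "retirement": "retirement_plan",
    "invest": "contribute",
    "contribute": "contribute",
    "enroll": "contribute",
    "options": "fund_choice",
    "funds": "fund_choice",
}


def concepts(text):
    """Reduce free text to the set of known concepts it mentions."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {WORD_MESH[w] for w in words if w in WORD_MESH}


def best_passage(query, passages):
    """Return the index of the passage sharing the most concepts with the
    query (earliest passage wins ties), or None if nothing overlaps."""
    wanted = concepts(query)
    scored = [(len(wanted & concepts(p)), i) for i, p in enumerate(passages)]
    overlap, index = max(scored, key=lambda pair: (pair[0], -pair[1]))
    return index if overlap > 0 else None


passages = [
    "Vacation requests go through the HR portal.",
    "Employees can enroll in the retirement plan and contribute each pay period.",
    "The cafeteria is open from 8 to 3.",
]
first = best_passage("How do I invest in our company's 401k?", passages)
second = best_passage("Where can I contribute to my retirement account?", passages)
unrelated = best_passage("Where is parking?", passages)
```

Both 401k questions resolve to the same passage even though they share almost no words with each other, while the unrelated question finds nothing. That is the behavior the paragraph above attributes to NLS, reduced here to a lookup table for clarity.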
Cognitive search: the new normal
In short order, cognitive search stands to become table stakes in the enterprise. It’s estimated that 54% of knowledge workers are already interrupted a few times or more per month when trying to get access to answers, insights, and information. And the volume of unstructured data organizations produce is projected to increase in the years to come, exacerbating the findability problem.
“Productivity isn’t just about being more efficient. It’s also about aggregating and applying the collective knowledge of your organization so that together you can achieve more,” wrote Microsoft 365 corporate vice president Jared Spataro in a recent blog post. “[Cognitive search systems enable] business process efficiency by turning your content into an interactive knowledge repository … to analyze documents and extract metadata to create sophisticated content models … [and to] make it easy for people to access the valuable knowledge that’s so often locked away in documents, conversations, meetings, and videos.”