Beware of insights! Real danger lurks behind the promise of big data to bring more data to more people faster, better, and cheaper: Insights are only as good as how people interpret the information presented to them. When looking at a stock chart, you can’t even answer the simplest question — “Is the latest stock price move good or bad for my portfolio?” — without understanding the context: where you are in your investment journey and whether you’re looking to buy or sell. While structured data can provide some context — like checkboxes indicating your income range, investment experience, investment objectives, and risk tolerance levels — unstructured data sources contain several orders of magnitude more context. An email exchange with a financial advisor indicating your experience with a particular investment vehicle, news articles about the market segment heavily represented in your portfolio, and social media posts about companies in which you’ve invested or plan to invest can all generate much broader and deeper context to better inform your decision to buy or sell.

But defining the context by finding structures, patterns, and meaning in unstructured data is not a simple process. As a result, firms face a gap between data and insights; while they are awash in an abundance of customer and marketing data, they struggle to convert this data into the insights needed to win, serve, and retain customers. In general, Forrester has found that:

  • The problem is not a lack of data. Most companies have access to plenty of customer feedback surveys, contact center records, mobile tracking data, loyalty program activities, and social media feeds — but, alas, it’s not easily available to business leaders to help them make decisions.
  • Insights derived from unstructured sources lag those based on structured data. While structured data management and BI processes like data integration, data warehousing, reporting, querying, analytics, and data visualization have matured over the past few decades, unstructured data management and analytics lag behind by as much as 30%. Our latest survey data shows that, on average, enterprises leverage about 35% of their structured data for insights and decision-making, but only 25% of their unstructured enterprise data.

Consider the analogy that if data is digital, then text is analog. While interpretations of structured data are mostly binary, text has a nearly infinite number of obvious and hidden meanings and possible interpretations. Structured data can get you very close to identifying root causes of sales performance problems, but adding text-based data like social media product reviews or emails from salespeople describing clients’ office politics can significantly broaden and deepen your understanding of what’s really going on. Organizations need to deploy text analytics to bridge the information gap and finally start getting closer to the nirvana of 360-degree views of customers, products, suppliers, financials, risks, and logistics.

So how does one deploy and start reaping benefits from text analytics technology? There’s more to the text analytics process than meets the eye. To most business users, text analytics is a black box where unstructured text goes in and keywords, sentiments, and other structured information magically come out. But when considering text analytics platforms, tech pros need to look into that black box and isolate the specific process steps and capabilities. Understanding the process workflow, the components in the workflow, and the specific functionality of each component is key to demystifying text analytics and to mapping vendor capabilities to each specific use case. In Forrester’s latest research, we demystify the text analytics process by describing the following key components and technology capabilities:

  • Extracting, ingesting, digitizing, and preparing the text for mining
    • Connectivity to a broad spectrum of data sources.
    • Text ingestion and conversion.
    • Text preprocessing and preparation.
  • Mapping your use cases to linguistic, statistical, trained, and unsupervised techniques
    • Text processing using linguistic rules.
    • Statistical text analysis.
    • Supervised and unsupervised techniques.
    • Advanced statistical analysis, aka cognitive computing, deep learning, or neural network-based artificial intelligence (AI).
  • Enriching the data and analyzing the findings
    • Post-processing and data enrichment with domain knowledge.
    • A UI for browsing, refining, and analysis.
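
To make the workflow above a little less abstract, here is a minimal Python sketch of a few of those steps: preprocessing, a linguistic rule, a simple statistical summary, and enrichment of the original records. It is only an illustration under stated assumptions, not how any vendor's platform works. The tiny word lists, the function names, and the sample reviews are all hypothetical, and a real deployment would use far richer dictionaries, models, and data connectors.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny word lists standing in for linguistic rules.
POSITIVE = {"great", "love", "fast"}
NEGATIVE = {"slow", "broken", "hate"}
STOPWORDS = {"the", "a", "is", "and", "but", "i", "it", "this", "on"}


def preprocess(text):
    """Preparation step: lowercase, tokenize, and drop stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def rule_based_sentiment(tokens):
    """Linguistic-rule step: count lexicon hits and label the text."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def top_terms(records, n=3):
    """Statistical step: most frequent terms across all documents."""
    counts = Counter(t for r in records for t in r["tokens"])
    return [term for term, _ in counts.most_common(n)]


def analyze(records):
    """Enrichment step: attach tokens and sentiment to each source record."""
    enriched = []
    for r in records:
        tokens = preprocess(r["text"])
        enriched.append({**r, "tokens": tokens,
                         "sentiment": rule_based_sentiment(tokens)})
    return enriched


# Made-up sample data, for illustration only.
reviews = [
    {"customer_id": 1, "text": "I love this app, it is fast and great"},
    {"customer_id": 2, "text": "The checkout is slow and the search is broken"},
    {"customer_id": 3, "text": "The app arrived on Tuesday"},
]

results = analyze(reviews)
for r in results:
    print(r["customer_id"], r["sentiment"])
print(top_terms(results, n=1))
```

Each enriched record keeps its original `customer_id`, which is what would let the structured output be joined back to customer data in a warehouse or marketing application.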

Whew! Finally. Objective achieved. After taking the original raw text through a complex journey of dozens of steps and hundreds of rules, the text analytics process or application has extracted structures, meaning, and sentiment from the text, which tech pros can help their business colleagues apply to hundreds of use cases. In addition to semantic search and knowledge management applications, the output of text analytics can feed a broad spectrum of use cases, including simple enrichment of customer data in an enterprise data warehouse (EDW), adding sentiment and brand analysis to marketing applications, converting text to queries in the latest generation of business intelligence (BI) user interfaces (UIs), and inferring logical consequences from text-based statements by using advanced applications such as semantic inference or reasoning engines. For more details, I invite you to read our latest detailed research, “Vendor Landscape: Big Data Text Analytics.”