NLP in the Real World

We have now gone through the main parts of the NLP workflow. Next, let’s take a look at some powerful applications of this technology.

Use Case: Improving Sales

Roy Raanani, who has spent his career working with tech startups, noticed that the countless conversations that occur every day in business are mostly ignored. Perhaps AI could turn them into an opportunity?

In 2015, he founded Chorus to use NLP to divine insights from the conversations of salespeople. Raanani called this the Conversation Cloud: it records, organizes, and transcribes calls, which are then entered into a CRM (Customer Relationship Management) system. Over time, the algorithms learn best practices and indicate how things can be improved.

But pulling this off has not been easy. According to a Chorus blog:

  • There are billions of ways to ask questions, raise objections, set action items, challenge hypotheses, etc. all of which need to be identified if sales patterns are to be codified. Second, signals and patterns evolve: new competitors, product names and features, and industry-related terminology change over time, and machine-learned models quickly become obsolete.7

For example, one of the difficulties, which can be easily overlooked, is identifying the parties who are talking (there are often more than three on a call). Known as “speaker separation,” this is considered even harder than speech recognition. Chorus created a deep learning model that essentially builds a “voice fingerprint,” based on clustering, for each speaker. After several years of R&D, the company had a system that could analyze large volumes of conversations.
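Chorus’s production models are proprietary, but the core clustering idea can be sketched: represent each speech segment as an embedding vector (the “voice fingerprint”) and cluster the vectors so that segments from the same speaker land together. The sketch below runs a plain k-means over synthetic embeddings; a real system would use learned speaker embeddings (such as d-vectors) rather than random toy data.

```python
import numpy as np

def cluster_speakers(X, k, iters=50):
    """Group voice-fingerprint vectors into k speaker clusters (plain k-means)."""
    # Deterministic farthest-point initialization: start with the first
    # segment, then repeatedly pick the embedding farthest from all centers.
    centers = [X[0]]
    for _ in range(k - 1):
        dists = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dists)])
    centers = np.array(centers)
    for _ in range(iters):
        # Assign each segment to its nearest center, then recompute centers.
        labels = np.argmin(((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2), axis=1)
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels

# Toy embeddings: five segments from each of two synthetic "speakers."
rng = np.random.default_rng(42)
emb = np.vstack([rng.normal(0.0, 0.1, (5, 8)), rng.normal(1.0, 0.1, (5, 8))])
speaker_of_segment = cluster_speakers(emb, k=2)
```

With well-separated embeddings, the first five segments end up in one cluster and the last five in the other; the hard part in practice is producing embeddings that separate this cleanly for real voices.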

As a testament to this, consider one of Chorus’s customers, Housecall Pro, a startup that sells mobile technologies for field service management. Before adopting the software, the company would create a personalized sales pitch for each lead. Unfortunately, this approach was not scalable and produced mixed results.

With Chorus, the company was able to standardize its pitch. The software made it possible to measure every word and its impact on sales conversions, and to track whether a sales rep stayed on script.

The outcome? The company was able to increase the win rate of the sales organization by 10%.8

Use Case: Fighting Depression

Across the world, about 300 million people suffer from depression, according to data from the World Health Organization.9 About 15% of adults will experience some type of depression during their life.

Depression often goes undiagnosed because of a lack of access to healthcare services, which means a person’s condition can get much worse and lead to other problems.

But NLP may be able to improve the situation. A recent study from Stanford used a machine learning model that processed 3D facial expressions and spoken language. The system was able to diagnose depression with an average error of 3.67 points on the Patient Health Questionnaire (PHQ) scale, and the accuracy was even higher for more severe forms of depression.

In the study, the researchers noted: “This technology could be deployed to cell phones worldwide and facilitate low-cost universal access to mental health care.”10

Use Case: Content Creation

In 2015, tech veterans including Elon Musk, Peter Thiel, Reid Hoffman, and Sam Altman launched OpenAI, with the support of a whopping $1 billion in funding. Structured as a nonprofit, the organization’s stated goal was “to advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return.”11

One of its areas of research has been NLP. To this end, the company released a model called GPT-2 in 2019, trained on a dataset of roughly eight million web pages. The goal was to create a system that could predict the next word based on the preceding text.

To illustrate this, OpenAI provided an experiment with the following text as the input: “In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English.”

From this, the algorithms created a convincing story that was 377 words in length!
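In a vastly simplified form, next-word prediction can be sketched with a word-level bigram counter: for each word in the training text, count which words follow it, and predict the most frequent continuation. (GPT-2 itself uses a Transformer over subword tokens with a far larger context; the toy corpus below is made up for illustration.)

```python
from collections import Counter, defaultdict

def train_bigram(text):
    # For each word, count which word follows it in the training text.
    words = text.lower().split()
    model = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        model[prev][nxt] += 1
    return model

def predict_next(model, word):
    # Return the most frequent continuation seen in training, if any.
    followers = model.get(word.lower())
    return followers.most_common(1)[0][0] if followers else None

corpus = "the unicorns spoke english and the unicorns lived in the valley"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # → "unicorns" ("the" is followed by it twice)
```

Chaining such predictions word by word generates text, which is exactly what GPT-2 does at scale; the quality gap comes from the model’s ability to condition on whole passages rather than a single preceding word.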

Granted, the researchers admitted that the storytelling was better for topics well represented in the underlying data, such as Lord of the Rings and even Brexit. Not surprisingly, GPT-2 performed poorly in technical domains.

But the model was able to score high on several well-known evaluations of reading comprehension. See Table 6-1.12

Table 6-1. Reading comprehension results

Dataset                                Prior record for accuracy    GPT-2’s accuracy
Winograd Schema Challenge              63.7%                        70.70%
LAMBADA                                59.23%                       63.24%
Children’s Book Test Common Nouns      85.7%                        93.30%
Children’s Book Test Named Entities    82.3%                        89.05%

Even though a typical human would score 90%+ on these tests, the performance of GPT-2 is still impressive. It’s important to note that the model used the Transformer, a neural network architecture developed at Google, along with unsupervised learning.

In keeping with OpenAI’s mission, the organization decided not to release the complete model. The fear was that it could lead to adverse consequences like fake news, spoofed Amazon.com reviews, spam, and phishing scams.

Use Case: Body Language

Focusing on language alone can be limiting. A sophisticated AI model should also take body language into account.

This is something Rana el Kaliouby has been thinking about for some time. Growing up in Egypt, she earned her master’s degree from the American University in Cairo and then went on to get her PhD in computer science at Newnham College of the University of Cambridge. But one question compelled her: How can computers detect human emotions?

However, there was little interest in her academic circles. The consensus in the computer science community was that the topic was simply not useful.

But Rana was undeterred and teamed up with the noted professor Rosalind Picard, who had written a pivotal book on emotions and machines called Affective Computing, to create innovative machine learning models.13 The work also drew on other domains like neuroscience and psychology. A big part of this was leveraging the pioneering research of Paul Ekman, who studied human emotions as expressed through a person’s facial muscles. He found that there were six universal human emotions (anger, disgust, fear, happiness, sadness, and surprise) that could be coded by 46 movements called action units, all of which became part of the Facial Action Coding System, or FACS.

While at the MIT Media Lab, Rana developed an “emotional hearing aid,” a wearable that allowed people with autism to better interact in social environments.14 The system would detect the emotions of those nearby and suggest appropriate ways to react.

It was groundbreaking: the New York Times named it one of the most consequential innovations of 2006. Rana’s system also caught the attention of Madison Avenue. Simply put, the technology could be an effective tool for gauging an audience’s mood about a television commercial.

A couple of years later, Rana launched Affectiva. The company quickly grew and attracted substantial venture capital ($54.2 million in all).

Rana, who was once ignored, had now become one of the leaders in a trend called “emotion-tracking AI.”

The flagship product for Affectiva is Affdex, a cloud-based platform for testing audience reactions to video. About a quarter of the Fortune Global 500 companies use it.

The company has also developed another product, Affectiva Automotive AI, an in-cabin sensing system for vehicles. Its capabilities include the following:

  • Monitoring for driver fatigue or distraction, which will trigger an alert (say a vibration of the seat belt).
  • Providing for a handoff to a semi-autonomous system if the driver does not wake up or is angry. There is even the ability to suggest alternative routes to lessen the potential for road rage!
  • Personalizing the content—say music—based on the passenger’s emotions.

All of these offerings rely on advanced deep learning systems that process enormous numbers of features from a database of more than 7.5 million faces. The models also account for cultural influences and demographic differences, all in real time.

