Natural Language Processing (NLP)

Definition

What is natural language processing (NLP)?

Natural language processing (NLP) is a branch of computer science, and of artificial intelligence (AI) in particular, that focuses on giving computers the ability to process and make sense of written and spoken language much as a human being would.

If you want a more in-depth understanding of this topic, check out the FAQ section below:

Question #1: What are the biggest challenges in natural language processing (NLP)?

The biggest challenge in natural language processing (NLP) is the ambiguity of human language. Not everything that we write or say means exactly what it seems to mean.

Context plays a big role in any conversation. In addition, there is almost never just one way to say or write something, thanks to things such as idioms, sarcasm, homophones, homonyms, metaphors, exceptions in grammar and usage, and variations in sentence structure.

For apps revolving around or driven by natural language to actually be useful, programmers need to figure out a way to teach computers to accurately recognise and make sense of all this.

For more challenges in NLP, check this helpful blog from Rosoka.

Question #2: How does it work?

Natural language processing (NLP) works using a combination of several techniques, including:

  1. Speech recognition – Also known as speech-to-text, this involves making sense of what users say despite things like slurred speech, incorrect grammar, different accents, varying intonation, and different speaking speeds.
  2. Grammatical tagging – This involves figuring out what part of speech a word is based on context and how it is used. For example, in the sentence ‘I replaced the broken pedal on my bike’, the word ‘pedal’ acts as a noun. In the sentence ‘it’s time to pedal as hard as you can’, however, it acts as a verb.
  3. Word sense disambiguation – This involves determining what a word or phrase means based on context and how it is used. For example, the word ‘right’ has a vastly different meaning in ‘turn right here’ than in ‘you’re right’.
  4. Named entity recognition – This involves identifying words and phrases as named entities and tagging them by type. For example, ‘Brisbane’ should be tagged as a location while ‘Thomas’ should be tagged as a person’s name.
  5. Co-reference resolution – This involves recognising when two or more words or phrases refer to a single entity. For example, in a single paragraph, the words ‘The Beatles’, ‘the band’, and ‘they’ can all refer to the exact same thing. This technique is also used to determine whether a word or phrase is being used as an idiom or metaphor rather than literally. For example, the word ‘beast’ can mean a dominant athlete instead of an actual animal.
  6. Sentiment analysis – This involves recognising subjective qualities such as emotions, sarcasm, attitudes, suspicion, and confusion in a given piece of written text.
  7. Natural language generation – This is considered to be the opposite of speech recognition because it takes structured information and transforms it into human language.
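
To make grammatical tagging a little more concrete, here is a minimal Python sketch that uses the ‘pedal’ sentences from the list above. It is only a toy: the cue-word lists and the function names (`tokenize`, `tag_word`) are hypothetical and were made up for this illustration. Real taggers are usually trained on large annotated collections of text and look at far more context than the single preceding word.

```python
import re

# Hypothetical cue words, chosen only for this illustration.
# A word that follows one of these is guessed to be a noun...
NOUN_CUES = {"the", "a", "an", "my", "your", "this", "that", "broken"}
# ...and a word that follows one of these is guessed to be a verb.
VERB_CUES = {"to", "can", "will", "must", "should", "i", "you", "we", "they"}


def tokenize(sentence):
    """Lowercase the sentence and split it into simple word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def tag_word(sentence, target):
    """Guess NOUN or VERB for `target` by looking at the word before it.

    Returns None if `target` does not occur in the sentence, and
    "UNKNOWN" if the preceding word is not in either cue list.
    """
    tokens = tokenize(sentence)
    if target not in tokens:
        return None
    index = tokens.index(target)
    previous = tokens[index - 1] if index > 0 else ""
    if previous in NOUN_CUES:
        return "NOUN"
    if previous in VERB_CUES:
        return "VERB"
    return "UNKNOWN"


print(tag_word("I replaced the broken pedal on my bike", "pedal"))
print(tag_word("It's time to pedal as hard as you can", "pedal"))
```

In the first sentence ‘pedal’ follows ‘broken’, so the sketch guesses a noun. In the second it follows ‘to’, so the sketch guesses a verb. Any sentence outside these simple patterns comes back as "UNKNOWN", which is exactly the gap that statistical and neural taggers are meant to close.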

Question #3: What is it used for?

Natural language processing (NLP) is used for purposes such as:

  1. Spam detection – By processing and analysing the content of your incoming emails and chat messages, natural language processing (NLP) tools can identify spam and other forms of malicious messages before they reach you.
  2. Machine translation – This involves automatically translating text or speech from one language into another. The best-known example is Google Translate. While it is far from perfect, it is one of the best free tools available today.
  3. Chatbots and virtual agents – While not all chatbots use natural language processing (NLP), some do so to give users more accurate and valuable responses based on the inputs they receive. Virtual agents such as Siri and Alexa, on the other hand, are designed to make sense of user requests and respond accordingly.
  4. Social media sentiment analysis – This involves figuring out the general attitude of people towards a product, service, event, or idea based on the language they use in social media posts, comments, and reviews.
  5. Text summarisation – This involves condensing large volumes of written text into shorter summaries. More advanced tools can also add context and conclusions based on the content they process.
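
As a rough illustration of text summarisation, the Python sketch below picks out the sentences whose words occur most often in the whole text, which is one simple, well-known way of doing extractive summarisation. Everything in it is an assumption made for this example: the stop-word list is tiny and hypothetical, the function names are made up, and the sample text is invented. The more advanced tools mentioned above, which add context and conclusions, work quite differently.

```python
import re
from collections import Counter

# Small, hypothetical stop-word list for this illustration only.
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
    "that", "this", "for", "on", "with", "as", "are", "was",
}


def split_sentences(text):
    """Split text into sentences wherever ., ! or ? is followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence):
    """Return the lowercase words of a sentence, minus the stop words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def summarise(text, max_sentences=1):
    """Keep the highest-scoring sentences, in their original order.

    A sentence's score is the average frequency of its content words,
    where frequencies are counted across the whole text.
    """
    sentences = split_sentences(text)
    frequencies = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        if not words:
            return 0.0
        return sum(frequencies[w] for w in words) / len(words)

    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score(sentences[i]),
        reverse=True,
    )
    chosen = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in chosen)


sample = (
    "NLP tools process text. "
    "NLP tools summarise long text quickly. "
    "The weather was nice today."
)
print(summarise(sample, max_sentences=1))
```

With this sample, the sentence about the weather shares no words with the rest of the text, so it scores lowest and is the first to be dropped. Averaging the word frequencies, rather than summing them, stops long sentences from winning simply because they contain more words.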