Natural Language Understanding (NLU) is a subfield of NLP focused on deriving the meaning of a sentence in a conversation with a chatbot, determining the context of the conversation, predicting the user's intent, and extracting word-level metadata.
NLU employs reading-comprehension techniques to extract meaning from text data, which helps solve NLP tasks such as question answering, text classification, and text summarisation.
NLU is an AI-hard problem and a precursor to true machine intelligence.
This article discusses the challenges of developing NLU in chatbots, with comments from industry and academic leaders on the topic.
NLU APIs & Open Source Tools
Most NLU APIs, open-source tools, and algorithms can derive POS tags, extract entities, and determine the intent of a user message. However, they may not be very accurate for domain-specific use cases. Most models are trained on publicly available data such as Wikipedia and Common Crawl, so they lack the ability to predict tags and entities for domain-specific words. Intent classification likewise depends on the training dataset created for the specific domain, and prediction quality depends on the quality of that training data. This has to be ongoing work, updated as new cases are encountered.
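One common workaround for domain-specific entities that generic models miss is to supplement the model with a hand-curated gazetteer. The sketch below is illustrative only; the entity list and labels are made up, not drawn from any real NLU API.

```python
# Minimal sketch: a gazetteer-based matcher that supplements a generic NER
# model with domain-specific entities it would otherwise miss.
# The phrases and labels below are hypothetical examples.

DOMAIN_ENTITIES = {
    "cloud computing": "COURSE_TOPIC",
    "machine learning": "COURSE_TOPIC",
    "aws": "PLATFORM",
}

def match_domain_entities(message: str) -> list[tuple[str, str]]:
    """Return (entity text, label) pairs found in the message."""
    text = message.lower()
    return [(phrase, label)
            for phrase, label in DOMAIN_ENTITIES.items()
            if phrase in text]

print(match_domain_entities("What is cloud computing on AWS?"))
# [('cloud computing', 'COURSE_TOPIC'), ('aws', 'PLATFORM')]
```

In practice the gazetteer's output would be merged with the generic model's predictions, with the domain list taking precedence for known terms.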
The current set of models also lacks the ability to generate an understanding of the meaning of a user's message, that is, an aggregate-level understanding of the message as a human would form it.
Let’s take an example.
Say a user asks: ‘What is cloud computing?’ For an open-domain chatbot such as Alexa, this would mean retrieving the definition of the topic. For a MOOC chatbot, it could mean the user is asking whether the platform offers such a course, along with the course details.
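The point that one utterance maps to different intents per deployment can be sketched as a toy rule table. The domain names and intent labels here are hypothetical, purely for illustration.

```python
# Toy illustration: the same utterance resolves to a different intent
# depending on the chatbot's domain. All labels below are made up.

INTENT_BY_DOMAIN = {
    "open_domain": {"what is": "define_term"},
    "mooc": {"what is": "lookup_course"},
}

def resolve_intent(message: str, domain: str) -> str:
    for trigger, intent in INTENT_BY_DOMAIN[domain].items():
        if message.lower().startswith(trigger):
            return intent
    return "fallback"

print(resolve_intent("What is cloud computing?", "open_domain"))  # define_term
print(resolve_intent("What is cloud computing?", "mooc"))         # lookup_course
```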
In chatbot conversations, most users are vague about what they are looking for and need to be guided, usually with a series of questions. Only after this exchange has taken place can we accurately determine intent and context and derive meaning.
Sometimes users converse as they would with a human and add filler phrases, e.g. ‘What is cloud computing? I wonder whether I should take it up.’
Here, ‘I wonder whether I should take it up’ is an extra sentence that can be ignored when determining intent and meaning, but it could be relevant as context if the chatbot is later asked to suggest a course to the user.
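One simple way to act on this is to split the message into sentences, answer the question sentence, and stash the rest as conversational context for later turns. This is a sketch under the crude assumption that the first question mark marks the actionable sentence.

```python
import re

# Sketch: route the question sentence to intent detection and keep the
# remaining chit-chat only as context. The heuristic (first sentence
# ending in '?') is an assumption, not a robust method.

def split_message(message: str) -> tuple[str, list[str]]:
    """Return (sentence used for intent, sentences kept only as context)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.?!])\s+", message) if s.strip()]
    questions = [s for s in sentences if s.endswith("?")]
    primary = questions[0] if questions else sentences[0]
    context = [s for s in sentences if s != primary]
    return primary, context

primary, context = split_message(
    "What is cloud computing? I wonder whether I should take it up.")
print(primary)  # What is cloud computing?
print(context)  # ['I wonder whether I should take it up.']
```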
Training the chatbot
How do we train the chatbot to ignore this sentence while trying to find the answer to the question?
For the same example we can end up with many variations, and it would be a tall order to keep training the bot incrementally to handle each such variation.
If, however, we focus on determining the ‘relevant meaning(s)’ and train on those, we will be able to mimic how a human would handle it.
Many experts suggest using deep learning models to handle such cases, because the model can learn which variations map to which meaning and predict which meaning applies in which context. The biggest challenge is that deep learning needs vast amounts of training data to reach a respectable level of accuracy, and creating such training data manually is very costly.
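The supervised setup described above can be sketched with a toy bag-of-words classifier. The tiny training set below is made up for illustration; it also shows why data volume matters, since any phrasing outside the handful of examples relies on accidental word overlap.

```python
from collections import Counter

# Toy bag-of-words intent classifier. The labelled examples are
# hypothetical; a real deep learning model would need orders of
# magnitude more data to handle phrasing variations reliably.

TRAINING_DATA = [
    ("what is cloud computing", "course_query"),
    ("tell me about machine learning", "course_query"),
    ("how do i reset my password", "account_help"),
    ("i cannot log in to my account", "account_help"),
]

def featurize(text: str) -> Counter:
    return Counter(text.lower().split())

def predict(message: str) -> str:
    """Pick the intent of the training example with the most word overlap."""
    query = featurize(message)
    scores: dict[str, int] = {}
    for text, intent in TRAINING_DATA:
        overlap = sum((query & featurize(text)).values())
        scores[intent] = max(scores.get(intent, 0), overlap)
    return max(scores, key=scores.get)

print(predict("what is deep learning about"))  # course_query
```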
This is especially true for domain-specific chatbots, where data is hard to come by.
Is deep learning the answer?
Deep learning could be a long-term solution: we generate data over time and learn how users interact with the chatbot. In the short term, with little data, it is not a feasible option.
Advances are being made in deep learning for NLP towards generalised models that would work well for most use cases. ELMo and BERT are two such promising models released in 2018; they help determine the contextual meaning of words, as opposed to the context-free meanings provided by Word2Vec, GloVe, and spaCy, which are mostly in use today.
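The difference between context-free and contextual word representations can be shown with a toy contrast. The vectors below are made-up values, not real embeddings; the point is only that a context-free lookup returns the same vector for ‘bank’ in every sentence, while a context-sensitive one does not.

```python
# Toy contrast: context-free lookup (one fixed vector per word, as in
# Word2Vec/GloVe) versus a context-sensitive sketch (vector depends on
# neighbours, loosely the idea behind ELMo/BERT). Vectors are invented.

STATIC = {"bank": [0.5, 0.5], "river": [0.0, 1.0], "money": [1.0, 0.0]}

def static_vector(word: str, sentence: list[str]) -> list[float]:
    # Context-free: the surrounding sentence is ignored entirely.
    return STATIC[word]

def contextual_vector(word: str, sentence: list[str]) -> list[float]:
    # Context-sensitive sketch: average the word's vector with its
    # neighbours', so 'bank' drifts toward 'river' or 'money'.
    words = [w for w in sentence if w in STATIC]
    dims = zip(*(STATIC[w] for w in words))
    return [sum(d) / len(words) for d in dims]

s1, s2 = ["river", "bank"], ["money", "bank"]
print(static_vector("bank", s1) == static_vector("bank", s2))          # True
print(contextual_vector("bank", s1) == contextual_vector("bank", s2))  # False
```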
As we test these models on various use cases, we will learn whether they solve the problem of determining the ‘true’ meaning of a user message. For now, meaning appears to be the missing piece in the NLU puzzle.