Hardly a day goes by without a new technology being introduced, and most of it revolves around data. The field is so rich that it is hard to imagine leaving it without automation for even a day. One such notable addition is machine learning.
So how is machine learning connected to a chatbot?
Before getting to the answer, let's look at what a chatbot is. We have all heard of Cortana, Alexa, and Siri, and some of us have even used them to set an alarm, make a call, or send a text message. They are part of a remarkable wave of technology.
People use them for routine daily tasks without searching for anything or performing the task manually: just give a command and your work is done. Despite this, it is very difficult to draw these chatty agents into casual or open-ended conversation.
Well, here is where machine learning comes in: with it, you can build a neural conversational chatbot.
Artificial intelligence requires interacting with the machine in a natural language that the system can easily understand and interpret. This field goes by several names: dialogue systems, chatbots, or spoken dialogue systems. The main aim is to provide an informative answer, maintain the context of the conversation, and be indistinguishable from a human (ideally; this is still very much a work in progress).
Well, why would a human say no to talking with a supportive, amusing, and fascinating speaker, even if that speaker is a robot? There are mainly two types of chatbot on the market today: general conversation bots (such as Microsoft's Tay) and goal-oriented bots (Cortana, Alexa, and Siri).
Goal-oriented bots are developed to solve people's problems using natural language, whereas general conversation bots talk with people about a wide range of topics. Modern dialogue systems rely on neural networks in a number of sophisticated ways. Through this post, you will get a sense of how machine learning is used to build a chatbot model.
1. Generative and Selective Models
Conversational models fall into two types: selective and generative. In some cases, a hybrid of the two is used for chatbot development. The main thing to understand is that these systems encode the questions people commonly ask and their answers; in essence, they store the dialogue context and predict the appropriate answer for the situation at hand.
Basically, words are first mapped to integer ids and then to embedding vectors, and it is these embeddings, rather than the raw ids, that are passed through the network: the context is represented as a sequence of word embeddings consumed by the RNN.
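To make the id-versus-embedding distinction concrete, here is a minimal sketch. The vocabulary, dimensions, and random initialisation are illustrative assumptions; in a real model the embedding table is learned during training.

```python
import random

random.seed(0)

# Illustrative vocabulary: token -> integer id.
vocab = {"<eos>": 0, "hi": 1, "how": 2, "are": 3, "you": 4}
embedding_dim = 4

# One embedding vector per vocabulary id (random stand-ins for
# learned parameters).
embeddings = {i: [random.random() for _ in range(embedding_dim)]
              for i in vocab.values()}

def embed(tokens):
    """Map a token sequence to the embedding vectors an RNN would consume."""
    ids = [vocab[t] for t in tokens]       # tokens -> ids
    return [embeddings[i] for i in ids]    # ids -> embedding vectors

vectors = embed(["hi", "how", "are", "you", "<eos>"])
print(len(vectors), len(vectors[0]))  # prints "5 4": 5 steps, 4-dim embeddings
```

The network never sees the raw ids; each time step receives a dense vector instead.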
2. Dialogue Data Representation
Now you might be wondering how the data is actually represented and what sort of dataset is used. It is actually simple: a dataset is formed of pairs of context and reply. The context is any situation or series of questions that requires an immediate reply from the system, and both context and reply are stored as sequences of tokens in the language being used.
It is basically a sequence of ordinary conversation between two people. A batch is formed of sentences and their replies, such as:
- Hi! How are you?
- Hi! I am good. And you?
- I am good too. What is your age?
- I am twenty years old. And you?
- Me too!
The token <eos>, meaning "end of sequence", is appended to each sentence. It makes sentence boundaries explicit for the neural network and helps it update its internal state sensibly. Metadata can also be fed into some models, such as emotion, gender, speaker id, etc.
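The conversation above can be turned into such a dataset with a few lines of code. This is a minimal sketch: the whitespace tokeniser and the pairing of each sentence with the next are illustrative simplifications, not a production pipeline.

```python
EOS = "<eos>"

# Each sentence is paired with the reply that follows it.
dialogue = [
    ("Hi! How are you?", "Hi! I am good. And you?"),
    ("Hi! I am good. And you?", "I am good too. What is your age?"),
    ("I am good too. What is your age?", "I am twenty years old. And you?"),
    ("I am twenty years old. And you?", "Me too!"),
]

def tokenize(sentence):
    """Very naive tokeniser: split on whitespace, detach punctuation,
    and append the end-of-sequence marker."""
    s = sentence.lower()
    for p in "?!.":
        s = s.replace(p, f" {p}")
    return s.split() + [EOS]

pairs = [(tokenize(ctx), tokenize(rep)) for ctx, rep in dialogue]
print(pairs[0][1][-1])  # prints "<eos>": every reply ends with the marker
```

Each (context, reply) pair is now a pair of token sequences, ready to be mapped to ids and embeddings.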
Now that you know the basics, you can move on to the generative model, also known as the Neural Conversational Model.
The dialogue is modelled in a sequence-to-sequence framework, borrowed from the machine translation field, which adapts naturally to this problem. The model consists of two different RNNs with completely separate sets of parameters: one is the encoder and the other is the decoder.
The encoder's job is to consume the context tokens one by one, updating its hidden state as it goes. Once the whole context has been read, the final hidden state serves as a compact representation of the context from which answers are generated.
The decoder's job, in turn, is to take the context representation passed by the encoder and generate an appropriate answer. On top of its RNN it has a softmax layer over the vocabulary: given a hidden state, it outputs a probability distribution over the next word. Generation proceeds in the following steps:
1. Initialise the decoder hidden state with the encoder's final hidden state (h_0).
2. Pass the <eos> token as the first input to the decoder, updating the state to h_1.
3. Sample the first word (w_1) from the softmax layer using h_1.
4. Pass w_1 as the next input, update the hidden state (h_1 -> h_2), and generate a new word (w_2).
5. Keep repeating step 4 until <eos> is produced or the maximum answer length is reached.
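The steps above can be sketched as a greedy generation loop. Everything here is illustrative: the tiny vocabulary and random weights stand in for a trained decoder, and argmax replaces proper sampling for simplicity.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = ["<eos>", "i", "am", "good", "you"]
hidden_dim, embed_dim, V = 8, 4, len(vocab)

# Random stand-ins for learned decoder parameters.
E = rng.normal(size=(V, embed_dim)) * 0.1           # word embeddings
W_xh = rng.normal(size=(hidden_dim, embed_dim)) * 0.1
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
W_hy = rng.normal(size=(V, hidden_dim)) * 0.1       # softmax layer

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def generate(h0, max_len=10):
    h, word_id, out = h0, 0, []          # steps 1-2: start from h_0 and <eos>
    for _ in range(max_len):
        h = np.tanh(W_xh @ E[word_id] + W_hh @ h)    # update hidden state
        word_id = int(np.argmax(softmax(W_hy @ h)))  # pick next word w_t
        if vocab[word_id] == "<eos>":                # step 5: stop at <eos>
            break
        out.append(vocab[word_id])       # step 4: feed w_t back in next loop
    return out

reply = generate(np.zeros(hidden_dim))  # h_0 would come from the encoder
print(reply)
```

With untrained weights the output is gibberish, of course; the point is the feedback loop, in which each generated word becomes the next input.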
However, this method describes the model's inference; the model's training works differently at each step. Instead of feeding back the sampled word, the decoder is given the correct word, y_t, at every step (a technique commonly known as teacher forcing). The main focus is on assigning high probability to the correct next word each time. At inference you can likewise predict a continuation by providing a prefix.
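The training-time difference can be sketched as a per-sequence cross-entropy loss under teacher forcing. Weights are random stand-ins, and no gradient step is shown; this only illustrates how the loss rewards putting probability on the correct word y_t.

```python
import numpy as np

rng = np.random.default_rng(2)
vocab = {"<eos>": 0, "i": 1, "am": 2, "good": 3}
V, hidden_dim, embed_dim = len(vocab), 8, 4

E = rng.normal(size=(V, embed_dim)) * 0.1
W_xh = rng.normal(size=(hidden_dim, embed_dim)) * 0.1
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
W_hy = rng.normal(size=(V, hidden_dim)) * 0.1

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def teacher_forced_loss(target_ids):
    """Average cross-entropy when the decoder is always fed the
    correct previous word rather than its own sample."""
    h, prev, loss = np.zeros(hidden_dim), vocab["<eos>"], 0.0
    for y_t in target_ids:
        h = np.tanh(W_xh @ E[prev] + W_hh @ h)
        p = softmax(W_hy @ h)
        loss += -np.log(p[y_t])  # penalise low probability of the true word
        prev = y_t               # feed the CORRECT word, not the sample
    return loss / len(target_ids)

loss = teacher_forced_loss([vocab["i"], vocab["am"], vocab["good"], vocab["<eos>"]])
print(loss > 0)  # prints "True": untrained weights give a positive loss
```

Training minimises this loss over many (context, reply) pairs, which is exactly "getting the correct next word each time".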
Nevertheless, nothing is perfect, even in the technology world. The software has limitations that keep humans upgrading and modifying it. The main problems with generative models are as follows:
They give generic responses. For instance, answers like "yes", "no", or "okay" fit almost any context, so the model learns to prefer them. People are working to fix this. One approach is to change the objective function of the model at inference time; another is to introduce an artificial metric that is used as a reward.
Another observable problem is inconsistent replies to rephrasings of the same context. Developers are working to upgrade the model with a more advanced encoding of state that helps generate consistent answers, for example via speaker embeddings.
Now that you are done with the generative model, you need to understand what a selective neural conversational model is and how it works.
The selective model works with a simple scoring function over the context, a candidate reply, and learned weights (w). There are basically two "towers", one for the context and one for the reply; each can have a different architecture as per your requirement. The working is simple: each tower maps its input into a vector space, the two vectors are scored against each other, and the candidate reply that fits the context best is chosen as the answer.
Training uses a triplet loss over (context, reply_correct, reply_wrong) triples. The reply_wrong is a negative sample, typically just some random reply from the candidate pool. What you obtain this way is a ranking function rather than an informative probability. At answer time, the candidate with the maximum score is chosen as the main answer.
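The two-tower scoring and the triplet loss can be sketched as follows. The "towers" here are deterministic pseudo-embeddings standing in for trained encoders, and the margin value is an illustrative assumption.

```python
import numpy as np

dim = 8

def tower(text, salt):
    """Stand-in encoder: a deterministic pseudo-embedding of the text.
    A real model would run the text through a trained network."""
    g = np.random.default_rng(hash(text) % (2**32) + salt)
    return g.normal(size=dim)

def score(context, reply):
    # Dot product between the context tower and the reply tower.
    return float(tower(context, 0) @ tower(reply, 1))

def triplet_loss(context, reply_correct, reply_wrong, margin=1.0):
    """Push the correct reply's score above the wrong one by a margin."""
    return max(0.0, margin - score(context, reply_correct)
                           + score(context, reply_wrong))

def best_reply(context, candidates):
    """At answer time: rank all candidates and return the top scorer."""
    scores = [score(context, r) for r in candidates]
    return candidates[int(np.argmax(scores))]

candidates = ["I am good, thanks!", "Pizza.", "The year is 1999."]
chosen = best_reply("Hi! How are you?", candidates)
print(chosen in candidates)  # prints "True"
```

Note that the model can only ever return one of the predefined candidates, which is exactly the restriction discussed in the comparison below.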
Generative Model vs Selective Model: Which Is Better?
Now that you have a general understanding of both models, you might be wondering which one to use in the future. The answer is simple: it depends on your requirements.
With the generative model, you can produce arbitrary answers with, ideally, correct grammar, whereas with the selective model you only have access to a restricted set of predefined answers. On the other hand, the selective model does not suffer from the generic-answer problem.