Description: A chatbot is an AI program that simulates human conversation. Built on a pre-trained language model such as GPT-2 (Generative Pre-trained Transformer 2), a chatbot can process a user's message and generate a human-like reply. GPT-2 is a language model developed by OpenAI that generates coherent, context-aware text from a given input. It works by repeatedly predicting the next token (roughly, the next word or word piece), taking all previous tokens in the input into account.
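The predict-append-repeat loop described above can be illustrated with a deliberately tiny stand-in. This is a sketch under loud assumptions: it uses a bigram word counter that looks only at the last word, whereas GPT-2 conditions on all previous tokens with a neural network. The names train_bigram and generate are hypothetical helpers, not part of any library.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev_word, next_word in zip(words, words[1:]):
        counts[prev_word][next_word] += 1
    return counts


def generate(counts, prompt, max_new_words=5):
    """Greedy generation: repeatedly append the most frequent follower
    of the last word, stopping early if the last word was never seen."""
    words = prompt.split()
    for _ in range(max_new_words):
        followers = counts.get(words[-1])
        if not followers:
            break
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


counts = train_bigram("the cat sat on the mat the cat ran")
print(generate(counts, "on", max_new_words=2))  # prints "on the cat"
```

GPT-2 follows the same outer loop, but each prediction is a probability distribution over its whole vocabulary, and the next token is usually sampled from it rather than picked greedily.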
Step-by-Step Process
Setup and Installation: Install the necessary libraries: transformers (to load and run GPT-2) and torch (PyTorch, which handles the model's computations).
Load GPT-2 Model: GPT-2 is pre-trained on a large text corpus and is available through the Hugging Face transformers library. You load the model together with its tokenizer, which converts text to model inputs and model outputs back to text.
Chat History: The chatbot stores previous conversation exchanges to provide context. This helps the bot generate more relevant responses instead of just replying to isolated inputs.
User Input and Model Response: When the user sends a message, the input is tokenized (converted into numeric token IDs) and passed to the GPT-2 model together with any previous conversation. The model then generates a response based on that combined input.
Generate Response: The model generates the response, which is then decoded into text and sent back to the user.
Interactive Loop: The chatbot runs continuously, accepting new user inputs, generating responses, and maintaining conversation context until the user types "exit".
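Before bringing in the model, the control flow of the steps above can be sketched with a stand-in reply function. Everything here is illustrative: run_chat, echo_reply, and the read_input/write_output parameters are hypothetical names, and the history is kept as plain (speaker, text) pairs rather than token IDs.

```python
def run_chat(reply_fn, read_input=input, write_output=print, exit_word="exit"):
    """Run a chat loop and return the history as (speaker, text) pairs.

    reply_fn receives the full history so it can use earlier turns as context.
    """
    history = []
    write_output("Chatbot: Hello! Type 'exit' to end the conversation.")
    while True:
        user_text = read_input("You: ")
        # Exit condition: stop before the exit word is added to the history
        if user_text.strip().lower() == exit_word:
            write_output("Chatbot: Goodbye!")
            break
        history.append(("user", user_text))
        bot_text = reply_fn(history)
        history.append(("bot", bot_text))
        write_output("Chatbot: " + bot_text)
    return history


def echo_reply(history):
    """Stand-in for the model: echoes the latest user turn and its index."""
    user_turns = [text for speaker, text in history if speaker == "user"]
    return f"You said '{user_turns[-1]}' (turn {len(user_turns)})."
```

Calling run_chat(echo_reply) starts an interactive session in the terminal. Passing the input and output functions as parameters is a design choice that lets the loop be driven by scripted input, which makes it easy to check without a live user.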
Sample Code
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer
# Load pre-trained GPT-2 model and tokenizer
model_name = "gpt2" # You can also use "gpt2-medium", "gpt2-large", or "gpt2-xl" for larger models
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
# Initialize chat history (empty at the start)
chat_history_ids = None
# Function to interact with the chatbot
def chat():
    global chat_history_ids
    print("Chatbot: Hello! Type 'exit' to end the conversation.")
    while True:
        # Take user input
        user_input = input("You: ")
        # Exit condition
        if user_input.lower() == "exit":
            print("Chatbot: Goodbye!")
            break
        # Encode the new user input, followed by the EOS token that marks the end of the turn
        new_user_input_ids = tokenizer.encode(user_input + tokenizer.eos_token,
                                              return_tensors='pt')
        # If there is prior conversation, append the new input to the chat history
        if chat_history_ids is not None:
            input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1)
        else:
            input_ids = new_user_input_ids
        # Attention mask of ones: every token is real input, none is padding
        attention_mask = torch.ones_like(input_ids)
        # Get the model's response given the input history.
        # max_new_tokens caps only the reply; max_length would also count the
        # growing history and leave no room to generate after a few turns.
        chat_history_ids = model.generate(
            input_ids, attention_mask=attention_mask,
            max_new_tokens=100, pad_token_id=tokenizer.eos_token_id,
            temperature=0.7, top_p=0.9, top_k=50, do_sample=True,
            no_repeat_ngram_size=3)
        # Decode only the newly generated tokens (everything after the input) and print them
        bot_response = tokenizer.decode(chat_history_ids[0, input_ids.shape[-1]:],
                                        skip_special_tokens=True)
        print("Chatbot:", bot_response)

# Start the chatbot
if __name__ == "__main__":
    chat()
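One limitation of the sample code is that chat_history_ids grows with every turn, while GPT-2 accepts at most 1024 token positions. A possible mitigation, sketched here on plain Python lists of token IDs, is to keep only the most recent tokens and leave room for the reply. The helper name trim_history and the default budget of 100 new tokens are assumptions made for illustration.

```python
GPT2_MAX_POSITIONS = 1024  # context window of the GPT-2 checkpoints


def trim_history(token_ids, max_new_tokens=100, max_positions=GPT2_MAX_POSITIONS):
    """Keep the most recent token IDs so that history plus reply fits the window."""
    budget = max_positions - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than max_positions")
    # Slicing from the end drops the oldest tokens first; shorter
    # histories are returned unchanged.
    return token_ids[-budget:]
```

With the tensor used in the chat loop, the equivalent slice would be input_ids[:, -budget:] applied before calling model.generate. Note that this simple cut can start in the middle of an old turn; trimming at EOS boundaries would be cleaner but is left out of this sketch.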