Question Answering Setup on Ubuntu
1. Install Python and Required Tools
First, ensure that Python is installed on your system. Use the following commands to install Python, pip, and venv for managing virtual environments:
sudo apt update
sudo apt install python3 python3-pip python3-venv
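You can confirm the installation by checking the reported versions:
python3 --version
pip3 --version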
2. Create a Virtual Environment
It is best practice to use a virtual environment to manage dependencies. Run the following commands to create and activate a virtual environment:
python3 -m venv qa-env
source qa-env/bin/activate
3. Install Necessary Libraries
For implementing Question Answering (QA), we will use Hugging Face’s transformers library along with PyTorch. Install the required libraries:
pip install transformers torch datasets
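To confirm that the libraries import correctly and to check whether PyTorch can see a GPU (inference simply falls back to CPU otherwise), you can run a short sanity check:
import torch
import transformers

# Print library versions and whether a CUDA-capable GPU is visible
print(f"transformers {transformers.__version__}, torch {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")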
4. Load a Pre-trained Question Answering Model
Hugging Face provides several pre-trained models for question answering, such as BERT, RoBERTa, or DistilBERT. Let's load a BERT model fine-tuned for QA tasks:
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
model = AutoModelForQuestionAnswering.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
5. Prepare Your Context and Question
You need a context (the passage of text where the model will search for the answer) and a question. Tokenize both together before passing them to the model:
# Define the context and the question
context = "Hugging Face Inc. is a company based in New York City. Its mission is to democratize artificial intelligence."
question = "Where is Hugging Face based?"
# Tokenize the input
inputs = tokenizer(question, context, return_tensors="pt")
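If you want to see what the tokenizer produced, you can inspect the encoded inputs; input_ids contains the token IDs of the question and context joined by special tokens such as [CLS] and [SEP]:
# Inspect the tokenized inputs
print(inputs["input_ids"].shape)  # torch.Size([1, sequence_length])
print(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]))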
6. Run Inference to Get the Answer
After tokenizing the input, you can feed it into the model and get the start and end positions of the answer in the context:
# Get model output
outputs = model(**inputs)
# Get the most likely beginning and end of the answer
answer_start = outputs.start_logits.argmax()
answer_end = outputs.end_logits.argmax()
# Decode the tokens to get the answer
answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0][answer_start:answer_end + 1]))
print(f"Answer: {answer}")
7. Fine-tune the Model (Optional)
If you have a custom dataset and want to fine-tune the model for your QA task, you can use Hugging Face’s Trainer API. Load your dataset using the datasets library, convert it into tokenized features with answer start/end positions (see the preprocessing sketch after the code below), and set up the training process:
from datasets import load_dataset
from transformers import TrainingArguments, Trainer
# Load your dataset
dataset = load_dataset("path_to_your_dataset")
# Define training arguments
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
)
# Initialize the Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,
)
# Fine-tune the model
trainer.train()
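Note that the Trainer cannot train on raw text: for extractive QA, each example must be tokenized and labeled with the token indices where the answer starts and ends. The following is a minimal preprocessing sketch, assuming your dataset follows the SQuAD format (question, context, and answers with answer_start and text fields); adjust the field names to match your data:
def preprocess(examples):
    # Tokenize question/context pairs; truncate only the context if too long
    tokenized = tokenizer(
        examples["question"],
        examples["context"],
        truncation="only_second",
        max_length=384,
        padding="max_length",
        return_offsets_mapping=True,
    )
    start_positions, end_positions = [], []
    for i, offsets in enumerate(tokenized["offset_mapping"]):
        answer = examples["answers"][i]
        start_char = answer["answer_start"][0]
        end_char = start_char + len(answer["text"][0])
        sequence_ids = tokenized.sequence_ids(i)

        # Locate the span of context tokens in the tokenized sequence
        context_start = sequence_ids.index(1)
        context_end = len(sequence_ids) - 1 - sequence_ids[::-1].index(1)

        if offsets[context_start][0] > start_char or offsets[context_end][1] < end_char:
            # Answer is not fully inside the (possibly truncated) context
            start_positions.append(0)
            end_positions.append(0)
        else:
            # Walk to the tokens that cover the answer characters
            idx = context_start
            while idx <= context_end and offsets[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)
            idx = context_end
            while idx >= context_start and offsets[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)

    tokenized["start_positions"] = start_positions
    tokenized["end_positions"] = end_positions
    tokenized.pop("offset_mapping")
    return tokenized

tokenized_dataset = dataset.map(preprocess, batched=True, remove_columns=dataset["train"].column_names)
With this in place, pass tokenized_dataset["train"] and tokenized_dataset["validation"] to the Trainer instead of the raw splits.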
8. Save the Fine-tuned Model
Once the fine-tuning process is complete, you can save the model for later use:
# Save the fine-tuned model
trainer.save_model("./qa_model")
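trainer.save_model writes a standard Hugging Face checkpoint directory, so you can reload it later with from_pretrained. Saving the tokenizer alongside it keeps the directory self-contained:
# Save the tokenizer next to the model weights
tokenizer.save_pretrained("./qa_model")

# Later, reload both from the saved directory
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("./qa_model")
model = AutoModelForQuestionAnswering.from_pretrained("./qa_model")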
9. Run Inference on New Data
After fine-tuning or loading a pre-trained model, you can run inference on new contexts and questions:
# Define a new context and question
new_context = "Albert Einstein was a German-born physicist who developed the theory of relativity."
new_question = "Who developed the theory of relativity?"
# Tokenize the new input
new_inputs = tokenizer(new_question, new_context, return_tensors="pt")
# Get model output
new_outputs = model(**new_inputs)
# Get the answer
new_answer_start = new_outputs.start_logits.argmax()
new_answer_end = new_outputs.end_logits.argmax()
new_answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(new_inputs["input_ids"][0][new_answer_start:new_answer_end + 1]))
print(f"Answer: {new_answer}")
10. Serve the Model Using FastAPI (Optional)
If you want to serve the QA model as an API, you can use FastAPI and Uvicorn. Install the required libraries:
pip install fastapi uvicorn
Here is a basic FastAPI app that serves the QA model:
from fastapi import FastAPI
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
app = FastAPI()
# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
model = AutoModelForQuestionAnswering.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
@app.post("/qa")
def qa(context: str, question: str):
    # Tokenize the inputs
    inputs = tokenizer(question, context, return_tensors="pt")
    # Get model output
    outputs = model(**inputs)
    # Get the answer
    answer_start = outputs.start_logits.argmax()
    answer_end = outputs.end_logits.argmax()
    answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0][answer_start:answer_end + 1]))
    return {"answer": answer}
# Run the server
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
You can run this API with the following command (assuming the code above is saved as app.py):
uvicorn app:app --reload
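Because the endpoint declares context and question as plain string parameters, FastAPI reads them from the query string by default. A quick test with curl might look like this (the values must be URL-encoded):
curl -X POST "http://localhost:8000/qa?question=Where%20is%20Hugging%20Face%20based%3F&context=Hugging%20Face%20Inc.%20is%20a%20company%20based%20in%20New%20York%20City."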
Summary
You have successfully set up a Question Answering system on Ubuntu using Python, Hugging Face’s transformers library, and PyTorch. You can use a pre-trained model or fine-tune your own, run inference on new data, and optionally serve the model through an API using FastAPI.