Question Answering Setup on Ubuntu

1. Install Python and Required Tools

First, make sure Python and its packaging tools are available on your system. The following commands install Python 3, pip, and the venv module for managing virtual environments:

sudo apt update
sudo apt install python3 python3-pip python3-venv


2. Create a Virtual Environment

It is best practice to use a virtual environment to manage dependencies. Run the following commands to create and activate a virtual environment:

python3 -m venv qa-env
source qa-env/bin/activate


3. Install Necessary Libraries

For extractive Question Answering (QA) we will use Hugging Face's transformers library together with PyTorch; the datasets library is used in the optional fine-tuning step. Install the required libraries inside the activated environment:

pip install transformers torch datasets
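
To confirm the installation, a quick check like the one below can be run from a Python shell inside the activated qa-env. The exact version numbers will differ on your machine, and CUDA availability depends on your hardware and PyTorch build:

import torch
import transformers

# Print library versions and whether a CUDA-capable GPU is visible
print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())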


4. Load a Pre-trained Question Answering Model

Hugging Face provides several pre-trained models for question answering, such as BERT, RoBERTa, or DistilBERT. Let's load a BERT model that has been fine-tuned on SQuAD for extractive QA (the weights are over a gigabyte, so the first download takes a while):

from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
model = AutoModelForQuestionAnswering.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
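
If you only need inference, Hugging Face's pipeline API wraps tokenization, the forward pass, and answer decoding (steps 5 and 6 below) into a single call. A minimal sketch using the same model:

from transformers import pipeline

# The pipeline handles tokenization, inference, and answer extraction internally
qa_pipeline = pipeline("question-answering", model="bert-large-uncased-whole-word-masking-finetuned-squad")

result = qa_pipeline(question="Where is Hugging Face based?", context="Hugging Face Inc. is a company based in New York City.")
print(result)  # dict with "answer", "score", "start", and "end"

The rest of this guide keeps the explicit tokenizer/model flow so that each step stays visible.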


5. Prepare Your Context and Question

You need a context (the passage of text in which the model will search for the answer) and a question. Tokenize the question and the context together before passing them to the model:

# Define the context and the question
context = "Hugging Face Inc. is a company based in New York City. Its mission is to democratize artificial intelligence."
question = "Where is Hugging Face based?"

# Tokenize the input
inputs = tokenizer(question, context, return_tensors="pt")
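
It can help to see how the tokenizer packs the two pieces of text into a single sequence. For a BERT-style tokenizer, the question and the context are concatenated with special tokens around them:

# Inspect the combined sequence: [CLS] question tokens [SEP] context tokens [SEP]
print(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]))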


6. Run Inference to Get the Answer

After tokenizing the input, you can feed it into the model. For every token the model produces a start logit and an end logit; the positions with the highest scores mark the beginning and end of the answer span in the context:

import torch

# Run the model without tracking gradients (inference only)
with torch.no_grad():
    outputs = model(**inputs)

# Get the most likely beginning and end of the answer
answer_start = outputs.start_logits.argmax()
answer_end = outputs.end_logits.argmax()

# Decode the answer tokens back into a string
answer = tokenizer.decode(inputs["input_ids"][0][answer_start:answer_end + 1], skip_special_tokens=True)
print(f"Answer: {answer}")
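
The raw logits can also be turned into a rough confidence value by applying a softmax over the token positions and multiplying the probabilities of the chosen start and end. This is a simple heuristic, not a calibrated probability, and the variable names below are just for illustration:

# Convert logits to probabilities over token positions
start_probs = torch.softmax(outputs.start_logits, dim=-1)
end_probs = torch.softmax(outputs.end_logits, dim=-1)

# Heuristic confidence: probability of the chosen start times that of the chosen end
score = (start_probs[0, answer_start] * end_probs[0, answer_end]).item()
print(f"Confidence: {score:.3f}")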


7. Fine-tune the Model (Optional)

If you have a custom dataset and want to fine-tune the model for your QA task, you can use Hugging Face's Trainer API. Load your dataset with the datasets library and set up the training process. Note that a raw SQuAD-style dataset cannot be passed to the Trainer directly: each example first has to be tokenized and its character-level answer span converted into token-level start and end positions (a preprocessing sketch follows the code block below):

from datasets import load_dataset
from transformers import TrainingArguments, Trainer

# Load your dataset
dataset = load_dataset("path_to_your_dataset")

# Define training arguments
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",  # newer transformers releases rename this to eval_strategy
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
)

# Initialize the Trainer (train_dataset/eval_dataset should be the tokenized
# splits; see the preprocessing sketch after this block)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer
)

# Fine-tune the model
trainer.train()
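
The Trainer above expects model-ready features, not raw text. For a SQuAD-style dataset (with question, context, and answers columns, which this sketch assumes), each example has to be tokenized and its character-level answer span converted into token-level start_positions and end_positions. A minimal version of that preprocessing, with prepare_train_features as a hypothetical helper name, might look like this; the mapped splits (tokenized["train"] and tokenized["validation"]) are what should be passed to the Trainer as train_dataset and eval_dataset:

def prepare_train_features(examples):
    # Tokenize question/context pairs, truncating only the context if the pair is too long
    tokenized = tokenizer(
        examples["question"],
        examples["context"],
        truncation="only_second",
        max_length=384,
        padding="max_length",
        return_offsets_mapping=True,
    )

    start_positions, end_positions = [], []
    for i, offsets in enumerate(tokenized["offset_mapping"]):
        answer = examples["answers"][i]
        start_char = answer["answer_start"][0]
        end_char = start_char + len(answer["text"][0])
        sequence_ids = tokenized.sequence_ids(i)

        # Find the token indices whose character offsets contain the answer span
        start_token = end_token = 0
        for idx, (start, end) in enumerate(offsets):
            if sequence_ids[idx] != 1:  # skip question and special tokens
                continue
            if start <= start_char < end:
                start_token = idx
            if start < end_char <= end:
                end_token = idx
        start_positions.append(start_token)
        end_positions.append(end_token)

    tokenized["start_positions"] = start_positions
    tokenized["end_positions"] = end_positions
    tokenized.pop("offset_mapping")
    return tokenized

# Map over each split and drop the original text columns
tokenized = dataset.map(prepare_train_features, batched=True, remove_columns=dataset["train"].column_names)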


8. Save the Fine-tuned Model

Once the fine-tuning process is complete, you can save the model (and its tokenizer) for later use:

# Save the fine-tuned model and the tokenizer
trainer.save_model("./qa_model")
tokenizer.save_pretrained("./qa_model")
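
The saved directory can later be loaded back exactly like a model from the Hub:

from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Reload the fine-tuned model and tokenizer from the local directory
tokenizer = AutoTokenizer.from_pretrained("./qa_model")
model = AutoModelForQuestionAnswering.from_pretrained("./qa_model")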


9. Run Inference on New Data

After fine-tuning or loading a pre-trained model, you can run inference on new contexts and questions:

# Define a new context and question
new_context = "Albert Einstein was a German-born physicist who developed the theory of relativity."
new_question = "Who developed the theory of relativity?"

# Tokenize the new input
new_inputs = tokenizer(new_question, new_context, return_tensors="pt")

# Get model output (no gradients needed for inference)
with torch.no_grad():
    new_outputs = model(**new_inputs)

# Get the most likely answer span and decode it
new_answer_start = new_outputs.start_logits.argmax()
new_answer_end = new_outputs.end_logits.argmax()
new_answer = tokenizer.decode(new_inputs["input_ids"][0][new_answer_start:new_answer_end + 1], skip_special_tokens=True)

print(f"Answer: {new_answer}")
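
One caveat: BERT-style models accept a limited number of tokens (512 for this checkpoint), so a very long context will raise an error unless it is truncated. If your contexts may be long, the tokenization call in this step can be adjusted to cut only the context side, keeping the question intact (at the risk of cutting off an answer that appears late in the passage). The max_length of 384 here is an arbitrary choice:

# Truncate the context (the second sequence) if the pair exceeds max_length
new_inputs = tokenizer(
    new_question,
    new_context,
    truncation="only_second",
    max_length=384,
    return_tensors="pt",
)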


10. Serve the Model Using FastAPI (Optional)

If you want to serve the QA model as an API, you can use FastAPI and Uvicorn. Install the required libraries:

pip install fastapi uvicorn


Here is a basic FastAPI app that serves the QA model. Save it as app.py so the uvicorn command further down can find it. The endpoint accepts a JSON body with context and question fields:

from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
import torch

app = FastAPI()

# Load the tokenizer and model once, at startup
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
model = AutoModelForQuestionAnswering.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")

class QARequest(BaseModel):
    context: str
    question: str

@app.post("/qa")
def qa(request: QARequest):
    # Tokenize the inputs
    inputs = tokenizer(request.question, request.context, return_tensors="pt")

    # Get model output (inference only, no gradients)
    with torch.no_grad():
        outputs = model(**inputs)

    # Extract and decode the answer span
    answer_start = outputs.start_logits.argmax()
    answer_end = outputs.end_logits.argmax()
    answer = tokenizer.decode(inputs["input_ids"][0][answer_start:answer_end + 1], skip_special_tokens=True)

    return {"answer": answer}

# Run the server directly with `python app.py` (or use the uvicorn command below)
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)


Alternatively, you can run the API with uvicorn directly; the app:app part assumes the file is named app.py:

uvicorn app:app --reload
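
Once the server is up, the endpoint can be exercised with curl (or through FastAPI's interactive docs at http://localhost:8000/docs), using the JSON body defined by the QARequest model above:

curl -X POST http://localhost:8000/qa \
     -H "Content-Type: application/json" \
     -d '{"context": "Hugging Face Inc. is a company based in New York City.", "question": "Where is Hugging Face based?"}'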


Summary

You have successfully set up a Question Answering system on Ubuntu using Python, Hugging Face's transformers library, and PyTorch. You can use a pre-trained model or fine-tune your own, run inference on new data, and optionally serve the model through an API using FastAPI.