Table Question Answering

What is Table Question Answering?

Table Question Answering (Table QA, abbreviated TQA below) is a subfield of natural language processing (NLP) that focuses on answering questions from information held in tables, such as those found in spreadsheets, databases, or web pages. Given a natural language query, a Table QA system locates the relevant cells and returns a precise answer, sometimes after aggregating several cells.

Figure 1 - Table Question Answering

Where can you find Table Question Answering models?

Use this link to filter Hugging Face models for Table Question Answering:

https://huggingface.co/models?pipeline_tag=table-question-answering&sort=trending
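The same filter can also be applied from code. The sketch below is an illustration rather than an official recipe: model_search_url rebuilds the link above from its two query parameters, and list_table_qa_models assumes the huggingface_hub package is installed and that the machine has network access. Both function names are hypothetical.

```python
from urllib.parse import urlencode

HUB_MODELS_URL = "https://huggingface.co/models"
TASK = "table-question-answering"


def model_search_url(task=TASK, sort="trending"):
    """Build the Hub search URL for a given pipeline tag and sort order."""
    return f"{HUB_MODELS_URL}?{urlencode({'pipeline_tag': task, 'sort': sort})}"


def list_table_qa_models(limit=5):
    """Return the ids of a few Hub models tagged for the task (needs network access)."""
    # Imported lazily so the URL helper works without huggingface_hub installed.
    from huggingface_hub import HfApi

    models = HfApi().list_models(filter=TASK, limit=limit)
    # Assumption: a recent huggingface_hub version, where the repo name is `id`.
    return [model.id for model in models]
```

Calling model_search_url() with its defaults reproduces the link given above; list_table_qa_models() is only defined here, not called, because its result depends on the live contents of the Hub.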


The most interesting Table Question Answering project

One of the most interesting Table Question Answering projects is called TAPAS.

TAPAS large model fine-tuned on WikiTable Questions (WTQ)

TAPAS is a BERT-like transformer model pretrained on a large corpus of English data from Wikipedia in a self-supervised fashion. It was pretrained on the raw tables and associated texts only, with no human labelling, which is why it can use large amounts of publicly available data. Inputs and labels were generated from those texts by an automatic process. More precisely, it was pretrained with two objectives:

  • Masked language modeling (MLM): taking a (flattened) table and associated context, the model randomly masks 15% of the words in the input, then runs the entire (partially masked) sequence through the model. The model then has to predict the masked words. This is different from traditional recurrent neural networks (RNNs) that usually see the words one after the other, or from autoregressive models like GPT which internally mask the future tokens. It allows the model to learn a bidirectional representation of a table and associated text.
  • Intermediate pre-training: to encourage numerical reasoning on tables, the authors additionally pre-trained the model on a balanced dataset of millions of synthetically created training examples. Here, the model must predict (classify) whether a sentence is supported or refuted by the contents of a table. The training examples are based on synthetic as well as counterfactual statements.

This way, the model learns an inner representation of the English language used in tables and associated texts, which can then be used to extract features useful for downstream tasks such as answering questions about a table, or determining whether a sentence is entailed or refuted by the contents of a table.
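To make the masked language modeling objective more concrete, here is a minimal sketch of the masking step only. It is a simplification and not the TAPAS implementation: it masks whole cells and words instead of word pieces, it always substitutes a [MASK] placeholder, and the helper names flatten_table and mask_tokens are hypothetical.

```python
import random

MASK = "[MASK]"


def flatten_table(header, rows):
    """Flatten a table row by row into one token list, header first."""
    tokens = list(header)
    for row in rows:
        tokens.extend(str(cell) for cell in row)
    return tokens


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Replace roughly mask_rate of the tokens with [MASK].

    Returns the masked sequence and a dict {position: original token};
    the dict holds the labels a model would be trained to predict.
    """
    rng = random.Random(seed)
    n_to_mask = max(1, round(len(tokens) * mask_rate))
    positions = sorted(rng.sample(range(len(tokens)), n_to_mask))
    masked = list(tokens)
    labels = {}
    for pos in positions:
        labels[pos] = masked[pos]
        masked[pos] = MASK
    return masked, labels


# Associated context followed by the flattened table (illustrative data).
context = "Residents by city".split()
table_tokens = flatten_table(
    ["Name", "Age", "City"],
    [["Alice", 34, "New York"], ["Bob", 28, "Chicago"], ["Carol", 41, "New York"]],
)
sequence = context + table_tokens
masked_sequence, labels = mask_tokens(sequence)
```

Because the whole partially masked sequence is visible at once, a model trained this way can use cells both before and after a masked position, which is the bidirectional property described above.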

https://huggingface.co/navteca/tapas-large-finetuned-wtq
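A hedged sketch of how this checkpoint might be queried through the transformers table-question-answering pipeline follows. It assumes the transformers, torch, and pandas packages are installed and that the model can be downloaded. The helper names to_tapas_table and ask_table are hypothetical. Cells are converted to strings because the TAPAS tokenizer expects text-only tables.

```python
def to_tapas_table(columns):
    """Convert every cell to str; the TAPAS tokenizer expects text-only cells."""
    return {name: [str(value) for value in values] for name, values in columns.items()}


def ask_table(table, question, model_name="navteca/tapas-large-finetuned-wtq"):
    """Run one question against one table (downloads the model on first use)."""
    # Imported lazily so the table helper works without transformers installed.
    from transformers import pipeline

    tqa = pipeline("table-question-answering", model=model_name)
    # The pipeline accepts a dict of columns or a pandas DataFrame.
    return tqa(table=table, query=question)


# Illustrative data, not taken from any dataset.
table = to_tapas_table(
    {
        "Name": ["Alice", "Bob", "Carol"],
        "Age": [34, 28, 41],
        "City": ["New York", "Chicago", "New York"],
    }
)
```

A call such as ask_table(table, "Who lives in New York?") would typically return a dictionary that includes an "answer" string along with the selected cells. The exact output depends on the model and library version, so none is shown here.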

Examples

  • Given a table with names, ages, and cities, a user asks "Who lives in New York?" The system should be able to extract the relevant information from the table and provide an answer.
  • Given a table with financial data, a user asks "What was the total revenue for Q2 last year?" The system should be able to perform calculations and provide an accurate answer.
  • Given two tables with customer information and order history, a user asks "Which customers have placed orders in the past month?" The system should be able to join the two tables and provide an answer.
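The three examples above correspond to three different table operations: a row lookup, an aggregation, and a join. The sketch below spells out each operation in plain Python on made-up data, to show what a TQA system has to infer from the question. It is not how a neural model computes its answer, all names and values are illustrative, and "the past month" is approximated as the last 30 days.

```python
from datetime import date, timedelta

people = [
    {"name": "Alice", "age": 34, "city": "New York"},
    {"name": "Bob", "age": 28, "city": "Chicago"},
    {"name": "Carol", "age": 41, "city": "New York"},
]

revenue = [
    {"year": 2023, "quarter": "Q2", "month": "April", "revenue": 120},
    {"year": 2023, "quarter": "Q2", "month": "May", "revenue": 135},
    {"year": 2023, "quarter": "Q2", "month": "June", "revenue": 150},
    {"year": 2023, "quarter": "Q3", "month": "July", "revenue": 160},
]

customers = [
    {"customer_id": 1, "name": "Alice"},
    {"customer_id": 2, "name": "Bob"},
    {"customer_id": 3, "name": "Carol"},
]

orders = [
    {"order_id": 10, "customer_id": 1, "order_date": date(2024, 5, 20)},
    {"order_id": 11, "customer_id": 2, "order_date": date(2024, 3, 15)},
    {"order_id": 12, "customer_id": 3, "order_date": date(2024, 5, 30)},
    {"order_id": 13, "customer_id": 1, "order_date": date(2024, 1, 5)},
]


def who_lives_in(table, city):
    """Lookup: select the rows that match a cell value."""
    return [row["name"] for row in table if row["city"] == city]


def total_revenue(table, year, quarter):
    """Aggregation: sum a numeric column over the matching rows."""
    return sum(
        row["revenue"]
        for row in table
        if row["year"] == year and row["quarter"] == quarter
    )


def customers_with_recent_orders(customers, orders, today, days=30):
    """Join: match the two tables on customer_id, filtered by order date."""
    cutoff = today - timedelta(days=days)
    recent_ids = {
        order["customer_id"]
        for order in orders
        if cutoff <= order["order_date"] <= today
    }
    return [c["name"] for c in customers if c["customer_id"] in recent_ids]


print(who_lives_in(people, "New York"))        # ['Alice', 'Carol']
print(total_revenue(revenue, 2023, "Q2"))      # 405
print(customers_with_recent_orders(customers, orders, date(2024, 6, 1)))  # ['Alice', 'Carol']
```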

Applications of Table Question Answering

Business Intelligence

  • Data analysis: TQA can be used to analyze large datasets and provide insights to business stakeholders.
  • Decision-making: By providing accurate and timely answers to complex questions, TQA can support informed decision-making.

Data Analysis

  • Information retrieval: TQA can be used to retrieve specific information from large datasets.
  • Knowledge discovery: By analyzing tables and providing answers to user queries, TQA can facilitate knowledge discovery and exploration.

Education

  • Learning assistance: TQA can be used to assist students in understanding complex concepts by providing interactive and engaging learning experiences.
  • Assessment tools: TQA can be used to create assessment tools that evaluate student understanding of complex topics.

Challenges in Table Question Answering

Despite its potential applications, TQA faces several challenges, including:

  • Ambiguity: Tables can contain ambiguous information, making it difficult for TQA systems to provide accurate answers.
  • Noise: Tables can contain noisy data, which can affect the accuracy of TQA systems.
  • Scalability: As the size of tables increases, TQA systems must be able to scale to handle the increased complexity.

Future Directions in Table Question Answering

The future of TQA holds much promise, with several areas of research and development that show great potential, including:

  • Multimodal interaction: TQA systems that can interact with users through multiple modalities, such as voice, text, and visual interfaces.
  • Explainable AI: TQA systems that can provide explanations for their answers, enabling users to understand the reasoning behind the responses.
  • Edge computing: TQA systems that can operate on edge devices, reducing latency and improving performance.

Conclusion

Table question answering lets users query tabular data in natural language, with potential applications in business intelligence, data analysis, and education. Understanding its definition, typical examples, and applications makes it easier to see why the technology matters and where it fits within AI.

If you are ready to set up your first Table Question Answering system, follow the instructions on our next page:

How to set up a Table Question Answering system

Image sources

Figure 1: https://production-media.paperswithcode.com/datasets/WikiTableQuestions-0000002557-a27d8922_j4dwyIS.jpg

More information