X

active-qa

Information

# ActiveQA: Active Question Answering This repo contains code for our paper [Ask the Right Questions: Active Question Reformulation with Reinforcement Learning](https://openreview.net/forum?id=S1CChZ-CZ). Small forewarning, this is still much more of a research codebase than a library. No support is provided. *If you use this code for your research, please [cite the paper](#bibtex).* ## Introduction ActiveQA is an agent that transforms questions online in order to find the best answers. The agent consists of a Tensorflow model that reformulates questions and an Answer Selection model. It interacts with an environment that contains a question-answering system. The agent queries the environment with variants of a question and calculates a score for the answer against the original question. The model is trained end-to-end using reinforcement learning. This version addresses the [SearchQA](https://arxiv.org/abs/1704.05179) question-answering task, and the environment consists of the Bi-directional Attention Flow ([BiDAF](https://github.com/allenai/bi-att-flow)) model of [Seo et al. (2017)](https://openreview.net/forum?id=HJ0UKP9ge¬eId=HJ0UKP9ge). ## Setup ### Dependencies We require tensorflow and many other supporting libraries. Tensorflow should be installed separately following the docs. To install the other dependencies use \`\`\` pip install -r requirements.txt \`\`\` Note: We only ran this code with Python 2, so Python 3 is not officially supported. ### Data Download the source dataset from [SearchQA](https://github.com/nyu-dl/SearchQA), [GloVe](https://nlp.stanford.edu/projects/glove/), and NLTK corpus and save them in $HOME/data. \`\`\` export DATA_DIR=$HOME/data mkdir $DATA_DIR \`\`\` #### Download Download the SearchQA dataset (~600 MB) for training, testing, and validation here: https://drive.google.com/open?id=1OxRhw81g7amW3aBd_iu2By5THysgr2uv \`\`\` unzip $DATA_DIR/SearchQA.zip -d $DATA_DIR \`\`\` Download GloVe (~850 MB): \`\`\` export GLOVE_DIR=$DATA_DIR/glove mkdir $GLOVE_DIR wget -c http://nlp.stanford.edu/data/glove.6B.zip -O $GLOVE_DIR/glove.6B.zip unzip $GLOVE_DIR/glove.6B.zip -d $GLOVE_DIR \`\`\` Download NLTK (for tokenizer). Make sure that nltk is installed! \`\`\` python -m nltk.downloader -d $HOME/nltk_data punkt \`\`\` Download the reformulator model pretrained on UN+Paralex datasets (~140 MB): \`\`\` export PRETRAINED_DIR=$DATA_DIR/pretrained mkdir $PRETRAINED_DIR wget -c https://storage.googleapis.com/pretrained_models/translate.ckpt-1460356.zip -O $PRETRAINED_DIR/translate.ckpt-1460356.zip unzip $PRETRAINED_DIR/translate.ckpt-1460356.zip -d $PRETRAINED_DIR \`\`\` #### Preprocess The SearchQA dataset requires a 2-step preprocessing: 1. Convert into SQuAD data format as the model was written to only work with that format. \`\`\` export SQUAD_DIR=$DATA_DIR/squad mkdir $SQUAD_DIR python -m searchqa.prepro \ --searchqa_dir=$DATA_DIR/SearchQA \ --squad_dir=$SQUAD_DIR \`\`\` 2. Preprocess the SearchQA dataset in SQuAD format (along with GloVe vectors) and save them in $PWD/data/squad (~60 minutes): \`\`\` python -m third_party.bi_att_flow.squad.prepro \ --glove_dir=$GLOVE_DIR \ --source_dir=$SQUAD_DIR \`\`\` Note that Python2 and Python3 handle Unicode differently and hence the preprocessing output differs. For converting the SearchQA format to SQuAD format either version can be used; use Python3 for other datasets. ### gRPC We need to compile the gRPC interface for the Environment Server. \`\`\` chmod +x compile_protos.sh; ./compile_protos.sh \`\`\` ### Run Environment Server The training requires running the environment [gRPC](https://grpc.io/) server, which receives queries from the ActiveQA agent and sends back one response per query. \`\`\` python -m px.environments.bidaf_server \ --port=10000 \ --squad_data_dir=data/squad \ --bidaf_shared_file=data/bidaf/shared.json \ --bidaf_model_dir=data/bidaf/ \`\`\` The checkpoint of a BiDAF model trained on SearchQA is already provided in data/bidaf, so you don't have to train one yourself. However, if you want to reproduce our training, clone the [BiDAF repository](https://github.com/allenai/bi-att-flow) and run \`\`\` python basic/cli.py \ --mode=trains \ --data_dir=data/squad \ --shared_path=data/bidaf/shared.json \ --init_lr=0.001 \ --num_steps=14000 \`\`\` ### Reformulator Training We first train reformulator from a model pretrained on UN and Paralex datasets. It should take a week on a single P100 GPU to reach ~42 F1 score on SearchQA's dev set. \`\`\` export OUT_DIR=/tmp/active-qa mkdir $OUT_DIR export REFORMULATOR_DIR=$OUT_DIR/reformulator mkdir $REFORMULATOR_DIR echo "model_checkpoint_path: \"$PRETRAINED_DIR/translate.ckpt-1460356\"" > checkpoint cp -f checkpoint $REFORMULATOR_DIR cp -f checkpoint $REFORMULATOR_DIR/initial_checkpoint.txt python -m px.nmt.reformulator_and_selector_training \ --environment_server_address=localhost:10000 \ --hparams_path=px/nmt/example_configs/reformulator.json \ --enable_reformulator_training=true \ --enable_selector_training=false \ --train_questions=$SQUAD_DIR/train-questions.txt \ --train_annotations=$SQUAD_DIR/train-annotation.txt \ --train_data=data/squad/data_train.json \ --dev_questions=$SQUAD_DIR/dev-questions.txt \ --dev_annotations=$SQUAD_DIR/dev-annotation.txt \ --dev_data=data/squad/data_dev.json \ --glove_path=$GLOVE_DIR/glove.6B.100d.txt \ --out_dir=$REFORMULATOR_DIR \ --tensorboard_dir=$OUT_DIR/tensorboard \`\`\` Note: if you don't want to wait a week of training, you can download this [checkpoint of the reformulator](https://storage.cloud.google.com/pretrained_models/translate.ckpt-6156696.zip) trained on SearchQA, with dev set F1 score of 42.5. Note that this is not the exact model analyzed in the paper, but one with equivalent performance. ### Selector Training After training the reformulator, we can now train the selector. It should take 2-3 days on a single P100 GPU to reach ~47.5 F1 score on SearchQA's dev set. \`\`\` python -m px.nmt.reformulator_and_selector_training \ --environment_server_address=localhost:10000 \ --hparams_path=px/nmt/example_configs/reformulator.json \ --enable_reformulator_training=false \ --enable_selector_training=true \ --train_questions=$SQUAD_DIR/train-questions.txt \ --train_annotations=$SQUAD_DIR/train-annotation.txt \ --train_data=data/squad/data_train.json \ --dev_questions=$SQUAD_DIR/dev-questions.txt \ --dev_annotations=$SQUAD_DIR/dev-annotation.txt \ --dev_data=data/squad/data_dev.json \ --glove_path=$GLOVE_DIR/glove.6B.100d.txt \ --batch_size_train=16 \ --batch_size_eval=64 \ --save_path=$OUT_DIR/selector \ --out_dir=$REFORMULATOR_DIR \ --tensorboard_dir=$OUT_DIR/tensorboard \`\`\` Note: If you don't want to wait 2-3 days for the training to finish, you can download a [checkpoint of the selector](https://storage.cloud.google.com/pretrained_models/selector.zip). The checkpoint is trained on SearchQA, achieving an F1 score of ~47.5 on the dev set. ## References This repository relies on the work of the following repositories: * [Neural Machine Translation (seq2seq)](https://github.com/tensorflow/nmt) * [Bi-directional Attention Flow (BiDAF)](https://github.com/allenai/bi-att-flow) * [SentencePiece](https://github.com/google/sentencepiece) and uses data from the following sources: * [GloVe](https://nlp.stanford.edu/projects/glove/) * [SearchQA](https://github.com/nyu-dl/SearchQA) # BibTex \`\`\` @inproceedings\{buck18, author = \{Christian Buck and Jannis Bulian and Massimiliano Ciaramita and Andrea Gesmundo and Neil Houlsby and Wojciech Gajewski and Wei Wang\}, title = \{Ask the Right Questions: Active Question Reformulation with Reinforcement Learning\}, booktitle = \{Sixth International Conference on Learning Representations (ICLR)\}, year = \{2018\}, month = \{May\}, address = \{Vancouver, Canada\}, url = \{https://openreview.net/forum?id=S1CChZ-CZ\}, \} \`\`\`

Prompts

Reviews

Tags

Write Your Review

Detailed Ratings

ALL
Correctness
Helpfulness
Interesting
Upload Pictures and Videos

Name
Size
Type
Download
Last Modified
  • Community

Add Discussion

Upload Pictures and Videos