6 Question & Answer

Question-and-answer (Q&A) queries are another valuable feature of the HistText library. Researchers formulate a question or prompt in natural language, and the Q&A model extracts the matching content from each document. Q&A functions in HistText are particularly effective for retrieving biographical information.

Two models are currently available in HistText: one for Chinese and one for English. Use list_qa_models() to list them:

histtext::list_qa_models()

6.1 Basic usage

The most basic use is to ask a single question:

imh_en_df <- histtext::search_documents('"member of party"', "imh-en")

histtext::qa_on_corpus(imh_en_df, "What is his full name?", "imh-en")


Alternatively, you can ask multiple variants of a question:

histtext::qa_on_corpus(imh_en_df, c("What is his full name?", "What name?"), "imh-en")

6.2 More complex usage

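In more advanced usage, a question can depend on the answer to a previous question. Each set of question variants is given a key (here "name:full"), and a later question refers to that key in braces: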

questions <- list("name:full" = c("What is his full name?", "What name?"),
                  "education:location" = c("Where {name:full} study at?", "Where study at?"))
histtext::qa_on_corpus(imh_en_df, questions, "imh-en")
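
The {name:full} placeholder in the second question refers to the key of the first. As a rough mental model only, and not HistText's actual implementation, the dependency can be pictured as a template substitution: the answer found for "name:full" replaces the placeholder before the follow-up question is asked. The helper fill_placeholder() and the sample answer below are hypothetical and purely illustrative:

```r
# Hypothetical helper (not part of histtext): substitute a previous answer
# into a question template that contains a {key} placeholder.
fill_placeholder <- function(template, key, answer) {
  # fixed = TRUE treats "{key}" as a literal string, not a regular expression
  gsub(paste0("{", key, "}"), answer, template, fixed = TRUE)
}

# Made-up answer, used only to show the substitution
filled <- fill_placeholder("Where {name:full} study at?", "name:full", "John Smith")
print(filled)
```

A variant without a placeholder, such as "Where study at?", passes through this helper unchanged.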


You can also specify the number of answers that a question should be allowed to produce:

histtext::qa_on_corpus(imh_en_df, questions, "imh-en", max_answers = list("education:location" = 2))
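
In the call above, the name used in max_answers is the same key as in the question list. Assuming the two are meant to match (this is inferred from the example, not from the library documentation), a small base-R check can catch a misspelled key before the query is sent. The variable names max_answers and unknown_keys are illustrative:

```r
# Question list and answer limits as in the examples above
questions <- list("name:full" = c("What is his full name?", "What name?"),
                  "education:location" = c("Where {name:full} study at?", "Where study at?"))
max_answers <- list("education:location" = 2)

# Keys present in max_answers but absent from the question list
unknown_keys <- setdiff(names(max_answers), names(questions))
if (length(unknown_keys) > 0) {
  stop("max_answers refers to unknown question keys: ",
       paste(unknown_keys, collapse = ", "))
}
```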


Examples of the questions on which the models were trained can be accessed using the following functions:

histtext::biography_questions("en")
histtext::biography_questions("zh")