Question Answering with Transformers

1 Introduction

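Setting up the Transformers environment was troublesome: after several hours of trying different approaches, the problems were still not fully resolved. This article tests only the question answering feature of Transformers; the other features have not been tested yet. Before the test, the installation status of each module was checked, as shown in the figure below.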

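Question answering is a task in information retrieval and natural language processing (NLP), and among the most difficult in NLP: the system must correctly answer questions posed in natural human language. In extractive question answering, the model is given a passage of text and uses that context to predict where the answer lies within the passage.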
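
To make "predicting the position of the answer" concrete, here is a minimal sketch of how an extractive model's output can be turned into a span. It assumes the model assigns each token a start score and an end score and that the answer is the span with the highest combined score. The function name best_span, the max_answer_len limit, and the toy scores are illustrative assumptions, not output from the model tested in this article.

```python
def best_span(start_scores, end_scores, max_answer_len=15):
    """Return (start, end, score) for the highest-scoring span with start <= end."""
    best = (0, 0, float("-inf"))
    for i, start_score in enumerate(start_scores):
        # Only consider spans that end at or after the start token.
        for j in range(i, min(i + max_answer_len, len(end_scores))):
            score = start_score + end_scores[j]
            if score > best[2]:
                best = (i, j, score)
    return best


# Toy tokens and made-up scores, for illustration only.
tokens = ["controlled", "by", "the", "orientation", "and",
          "spatial", "characteristics", "of"]
start_scores = [0.1, 0.0, 0.3, 4.2, 0.2, 0.5, 0.4, 0.1]
end_scores = [0.0, 0.1, 0.1, 0.6, 0.2, 0.9, 4.5, 0.3]

start, end, score = best_span(start_scores, end_scores)
answer = " ".join(tokens[start:end + 1])
print(answer)
```

Real implementations work on subword tokens and map the chosen span back to character offsets in the original passage, which this sketch leaves out.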

2 Model Overview

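Although 307 question answering models were available at the time of writing, this test still uses mrm8488/bert-multi-cased-finetuned-xquadv1. The model is Google's BERT (base-multilingual-cased), fine-tuned on XQuAD-like data for multilingual question answering in 11 languages. Because the dataset is based on SQuAD v1.1, it contains no unanswerable questions, so the model can focus on cross-lingual transfer.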

3 Usage

The model is called as follows:

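from transformers import pipeline

# pipeline is an abstraction layer that provides a simple API for a variety of tasks.
# Build the question answering pipeline.
question_answering = pipeline(
    "question-answering",
    model="mrm8488/bert-multi-cased-finetuned-xquadv1",
    tokenizer="mrm8488/bert-multi-cased-finetuned-xquadv1",
)

context = '''Context passage'''
question = '''Question to ask'''

# result is a dict holding the answer text, its score, and its start/end positions in the context.
result = question_answering(question=question, context=context)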
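
The question answering pipeline returns a dict with the keys answer, score, start, and end, where start and end are character offsets into the context. The sketch below shows one way to read such a result. To stay runnable without downloading the model, it builds the dict by hand from the first answer reported in the English test of this article rather than calling the pipeline; the helper name summarize is an assumption made for illustration.

```python
def summarize(result, context):
    """Format a QA result and check that its offsets point at the answer text."""
    span = context[result["start"]:result["end"]]
    text = f"{result['answer']} ({result['score']:.2f})"
    return text, span == result["answer"]


context = ("The development of a step-path failure surface is mainly controlled "
           "by the orientation and spatial characteristics of the present major "
           "rock structure including major joints sets, shear planes and fault planes. ")

# Hand-built stand-in for the pipeline output (not produced by running the model).
answer = "orientation and spatial characteristics"
start = context.find(answer)
result = {"score": 0.96, "start": start, "end": start + len(answer), "answer": answer}

text, offsets_match = summarize(result, context)
print(text)
```

With a real pipeline call, the same helper can be applied directly to the returned result.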

4 English Test

The model was first tested in English:

Context: '''The development of a step-path failure surface is mainly controlled by the orientation and spatial characteristics of the present major rock structure including major joints sets, shear planes and fault planes. '''

Four questions were asked based on the passage above:

(1) Question: '''What kinds of factors controlled the development of a step-path failure surface?'''

Answer: orientation and spatial characteristics (0.96)

This answer is very accurate, with a score of 0.96.

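(2) Question: '''Please describe the major rock structure.'''

Answer: step-path failure surface is mainly controlled by the orientation and spatial characteristics (0.39)

This prompt was phrased as a request for a description, and the model did not give the correct answer. Two possible reasons are that it was not phrased as a question and that it contained too few keywords. The score was 0.39.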

(3) Question: What is rock structure?

Answer: step-path failure surface (0.42)

This question was not answered correctly either, probably mainly because it contained too few keywords. The score was 0.42.

(4) Question: How many kinds of present major rock structure?

Answer: major joints sets, shear planes and fault planes (0.003)
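
Since the scores above range from 0.96 down to 0.003, one simple way to use them is to flag low-scoring answers for manual review. The sketch below applies a cutoff to the four results reported in this section. The threshold of 0.5 and the helper name is_reliable are assumptions chosen for illustration; the article does not prescribe a cutoff.

```python
# (question, answer, score) triples copied from the four tests above.
results = [
    ("What kinds of factors controlled the development of a step-path failure surface?",
     "orientation and spatial characteristics", 0.96),
    ("Please describe the major rock structure.",
     "step-path failure surface is mainly controlled by the orientation and spatial characteristics",
     0.39),
    ("What is rock structure?", "step-path failure surface", 0.42),
    ("How many kinds of present major rock structure?",
     "major joints sets, shear planes and fault planes", 0.003),
]


def is_reliable(score, threshold=0.5):
    """Treat an answer as reliable only if its score reaches the threshold (assumed value)."""
    return score >= threshold


for question, answer, score in results:
    label = "keep" if is_reliable(score) else "review"
    print(f"[{label}] {score:.3f}  {question}")

reliable = [question for question, _, score in results if is_reliable(score)]
```

A score cutoff is only a rough filter: the fourth answer does name the structures listed in the context, yet it would be flagged for review because of its very low score.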