KG-Rank: Enhancing Large Language Models for Medical QA with Knowledge Graphs and Ranking Techniques

R Yang, H Liu, E Marrese-Taylor, Q Zeng… - arXiv preprint arXiv …, 2024 - arxiv.org
arXiv preprint arXiv:2403.05881, 2024
Large language models (LLMs) have demonstrated impressive generative capabilities with the potential to innovate in medicine. However, the application of LLMs in real clinical settings remains challenging due to the lack of factual consistency in the generated content. In this work, we develop an augmented LLM framework, KG-Rank, which leverages a medical knowledge graph (KG) along with ranking and re-ranking techniques to improve the factuality of long-form question answering (QA) in the medical domain. Specifically, when receiving a question, KG-Rank automatically identifies medical entities within the question and retrieves the related triples from the medical KG to gather factual information. Subsequently, KG-Rank applies multiple ranking techniques to refine the ordering of these triples, providing more relevant and precise information for LLM inference. To the best of our knowledge, KG-Rank is the first application of a KG combined with ranking models in medical QA specifically for generating long answers. Evaluation on four selected medical QA datasets demonstrates that KG-Rank achieves an improvement of over 18% in ROUGE-L score. Additionally, we extend KG-Rank to open domains, including law, business, music, and history, where it achieves a 14% improvement in ROUGE-L score, indicating the effectiveness and potential of KG-Rank.
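The pipeline described in the abstract (identify entities, retrieve triples, rank them, pass the top ones to the LLM) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the entity recognizer, the knowledge graph, or the ranking methods, so the toy triples, the substring-based entity matcher, the Jaccard token-overlap score, and every function name here are hypothetical stand-ins rather than the authors' implementation.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# Toy knowledge graph. These triples are illustrative only and are NOT
# taken from the medical KG used in the paper.
TOY_KG: List[Triple] = [
    ("metformin", "treats", "type 2 diabetes"),
    ("metformin", "may cause", "nausea"),
    ("insulin", "treats", "type 1 diabetes"),
    ("aspirin", "treats", "headache"),
]


def tokenize(text: str) -> Set[str]:
    """Lowercase, strip simple punctuation, and split on whitespace."""
    return set(text.lower().replace("?", " ").replace(",", " ").split())


def find_entities(question: str, kg: List[Triple]) -> List[str]:
    """Assumed stand-in for entity recognition: substring match on KG heads."""
    q = question.lower()
    heads = {head for head, _, _ in kg}
    return sorted(head for head in heads if head in q)


def retrieve_triples(entities: List[str], kg: List[Triple]) -> List[Triple]:
    """Collect every triple whose head is one of the matched entities."""
    return [t for t in kg if t[0] in entities]


def score(question: str, triple: Triple) -> float:
    """Assumed stand-in for a ranking model: Jaccard token overlap."""
    q_tokens = tokenize(question)
    t_tokens = tokenize(" ".join(triple))
    return len(q_tokens & t_tokens) / len(q_tokens | t_tokens)


def rank_triples(question: str, triples: List[Triple], top_k: int = 2) -> List[Triple]:
    """Order triples by descending relevance and keep the top_k."""
    return sorted(triples, key=lambda t: score(question, t), reverse=True)[:top_k]


def build_prompt(question: str, triples: List[Triple]) -> str:
    """Prepend the ranked facts to the question for LLM inference."""
    facts = "\n".join(f"- {h} {r} {t}" for h, r, t in triples)
    return f"Known facts:\n{facts}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    question = "What are the side effects of metformin, and may it cause nausea?"
    entities = find_entities(question, TOY_KG)
    ranked = rank_triples(question, retrieve_triples(entities, TOY_KG))
    print(build_prompt(question, ranked))
```

In a real system the overlap score would presumably be replaced by learned ranking and re-ranking models, and the resulting prompt would be sent to an LLM; both steps are left out here because the abstract gives no details about them.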