Presentation abstract / keywords |
Presentation title |
2019-05-31 10:45
Proposal for Automatic Extraction Framework of Superconductors Related Information from Scientific Literature ○Luca Foppiano・Thaer M. Dieb・Akira Suzuki・Masashi Ishii(NIMS) SC2019-1 |
Abstract |
(English) |
Automatic collection of materials information from research papers using Natural Language Processing (NLP) is in high demand for rapid, big-data-driven materials development, namely materials informatics (MI). The main difficulty of this automatic collection is the variety of expressions used in papers, so a robust system that tolerates such variety needs to be developed. In this paper, we report ongoing interdisciplinary work to construct a system for the automatic collection of superconductor-related information from scientific literature using text mining techniques. We focus on identifying superconducting material names and their key property, the critical temperature (Tc). We discuss the construction of a prototype that uses machine learning (ML) techniques for extraction and linking to collect this physical information. From an evaluation on 500 sample documents, we define a baseline and a direction for future improvements. |
Keywords |
(English) |
materials informatics / superconductors / machine learning / NLP / TDM |
Bibliographic information |
IEICE Tech. Rep., vol. 119, no. 66, SC2019-1, pp. 1-5, May 2019. |
Report number |
SC2019-1 |
Date of issue |
2019-05-24 (SC) |
ISSN |
Online edition: ISSN 2432-6380 |
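Copyright |
The copyright of papers published in the Technical Report belongs to the Institute of Electronics, Information and Communication Engineers (IEICE). (Permission numbers: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |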