Medical University of Graz, Austria - Research Portal

Selected publication:

Oleynik, M; Kugic, A; Kasáč, Z; Kreuzthaler, M.
Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification.
J Am Med Inform Assoc. 2019; 26(11):1247-1254. doi: 10.1093/jamia/ocz149 [Open Access]

Lead authors from Med Uni Graz
Oleynik Michel
Co-authors from Med Uni Graz
Kasac Zdenko
Kreuzthaler Markus Eduard
Kugic Amila

Automated clinical phenotyping is challenging because word-based features quickly turn it into a high-dimensional problem, in which small, privacy-restricted training datasets might lead to overfitting. Pretrained embeddings might solve this issue by reusing input representation schemes trained on a larger dataset. We sought to evaluate shallow and deep learning text classifiers and the impact of pretrained embeddings on a small clinical dataset. We participated in the 2018 National NLP Clinical Challenges (n2c2) Shared Task on cohort selection and received an annotated dataset with medical narratives of 202 patients for multilabel binary text classification. We set our baseline to a majority classifier, to which we compared a rule-based classifier and orthogonal machine learning strategies: support vector machines, logistic regression, and long short-term memory neural networks. We evaluated logistic regression and long short-term memory using both self-trained and pretrained BioWordVec word embeddings as input representation schemes. The rule-based classifier showed the highest overall micro F1 score (0.9100), with which we finished first in the challenge. Shallow machine learning strategies showed lower overall micro F1 scores, but still higher than those of the deep learning strategies and the baseline. We could not show a difference in classification efficiency between self-trained and pretrained embeddings. Clinical context, negation, and value-based criteria hindered shallow machine learning approaches, while deep learning strategies could not capture the term diversity due to the small training dataset. Shallow methods for clinical phenotyping can still outperform deep learning methods on small, imbalanced data, even when the latter are supported by pretrained embeddings. © The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association.
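
To make the two reference points in the abstract concrete, the following is a minimal sketch of a majority baseline and a micro-averaged F1 score for multilabel binary classification. It is not the authors' code or the official n2c2 evaluation script: the criterion names, the toy labels, and the tie-breaking rule are hypothetical, and the score here pools only the positive class across criteria, which may differ from how the challenge computed its overall micro F1.

```python
from typing import Dict, List

# Hypothetical layout: criterion name -> one 0/1 label per patient.
Labels = Dict[str, List[int]]


def majority_baseline(train: Labels, n_test: int) -> Labels:
    """Predict, for every criterion, the label most frequent in training."""
    predictions: Labels = {}
    for criterion, labels in train.items():
        # Assumption: ties are resolved in favour of the positive label.
        majority = 1 if 2 * sum(labels) >= len(labels) else 0
        predictions[criterion] = [majority] * n_test
    return predictions


def micro_f1(gold: Labels, predicted: Labels) -> float:
    """Pool TP, FP and FN over all criteria, then compute a single F1."""
    tp = fp = fn = 0
    for criterion, gold_labels in gold.items():
        for g, p in zip(gold_labels, predicted[criterion]):
            tp += int(g == 1 and p == 1)
            fp += int(g == 0 and p == 1)
            fn += int(g == 1 and p == 0)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Toy data with made-up criteria; the real task used 202 annotated patients.
train = {"criterion_a": [1, 1, 1, 0], "criterion_b": [0, 0, 1, 0]}
gold = {"criterion_a": [1, 0, 1], "criterion_b": [0, 1, 0]}

baseline = majority_baseline(train, n_test=3)
score = micro_f1(gold, baseline)
print(baseline, round(score, 4))
```

Because the counts are pooled before the F1 is computed, frequent criteria weigh more than rare ones, which is one reason a majority baseline can look deceptively strong on imbalanced labels.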

Keywords:
natural language processing
data mining
machine learning
deep learning