It takes a night and a day to fly from Canberra to Turku, the former capital of Finland, founded in the 13th century at the mouth of the Aura River. If we turn to Wikipedia for further travel information, we face over 20 pages of text to read. This document summarisation problem calls for natural language processing, but what else can this timely computational modelling method offer?
Consider an acute care clinician who uses the PubMed® search engine to learn about the topic of "pulmonary arterial hypertension". This search returns an immense 10,694 abstracts from the MEDLINE® database. Even when the search is limited to reviews, the 4,449 returned abstracts remain overwhelming. The document summarisation problem is not resolved even after further restricting the search to reviews published within the last five years (933 returned abstracts).
During this talk, we will introduce machine learning-based natural language processing by surveying applications of keyword extraction and other document summarisation methods to medical data analytics. We will first present a timeline of clinical language processing from the 1970s to the 2010s, leading to its current paradigms. We will then learn how to teach a computer to summarise such documents as their top keywords and how to evaluate whether the keywords are any good. After this, we will discuss how healthcare providers could benefit from these methods. For example, we will consider the task of automatically filling out a shift-change handover form in hospitals, subject to clinical proofing, as a way to make documentation more efficient, improve the availability of existing documents, and thereby contribute to health and healthcare. Finally, we will learn more about Turku, Associate Professor Suominen’s academic origin, and about her path from there to Canberra to bridge the gap between computer and health sciences.
Associate Professor Hanna Suominen, with over 15 years’ experience in longitudinal, multimodal data analytics for saving, structuring, and summarising data, is bridging the gap between Computer Science (CS) and health/social sciences. Her MSc in applied mathematics, PhD in CS, and Adj. Prof. title in CS were awarded by the University of Turku, Finland, in 2005, 2009, and 2013, respectively. She joined The ANU and Data61 as the Team Leader of TAMPA (Theory and Applications of Multimodal Pattern Analysis) within the Machine Learning (ML) Group after working at Data61/NICTA as a Team Leader of Natural Language Processing (NLP) and Senior Researcher in ML. Hanna has over 100 publications with 60 co-authors from 10 countries and from institutions including Harvard, Karolinska Institutet, and Max Planck. Her work has been published in the most prestigious journals, cited over 1,200 times, and recognised with awards for best papers, ML/NLP methods, business plans, and teaching units. She has secured competitive grants with a total value in the range of $10-20 million in the past 3 years alone. Currently she acts as the Big Data program leader of the inaugural ANU Grand Challenge Program, Our Health in Our Hands, and is a co-inventor of PostAc®, a smart search engine that helps PhD graduates find great jobs outside academia.