Mining Text Data: Robust Feature Extraction from Clinical Notes and A Neural Architecture for Summarization

I will present two machine learning methods for mining text data and discuss their applications in healthcare informatics. Clinical notes are a valuable source of information, containing regular assessments of patients' conditions in hospitals, but they contain inconsistent abbreviations and lack the structure of formal documents. I will describe a feature extraction technique that is robust to such inconsistencies and has been found effective in classification models used to identify patients at risk of developing unforeseen complications. In the second part of the talk, I will present a new neural sequence-to-sequence model for extractive summarization that can potentially be used to summarize clinical notes. Extractive summaries, which comprise a salient subset of the input sentences, often also contain important key words. Guided by this principle, we design SWAP-NET, which models the interaction of key words and salient sentences using a new architecture based on two-level pointer networks.


Dr. Vaibhav Rajan is an Assistant Professor in the Department of Information Systems and Analytics (DISA) at the School of Computing, National University of Singapore (NUS). Earlier, he was a Senior Research Scientist at Xerox Research, where he led a project on Clinical Decision Support Systems for over four years. He received his PhD and MS degrees from EPFL, Switzerland, and his BE degree from BITS, Pilani, India, all in Computer Science. He is a recipient of the ERS IASC Young Researchers Award 2014, given by the European Regional Section of the International Association for Statistical Computing.

Date & time

11am–12pm 13 Jul 2018


Room: N101 Seminar Room


Dr Vaibhav Rajan


Updated: 1 June 2019 / Responsible Officer: Dean, CECS / Page Contact: CECS Marketing