I am a master's student at the School of Computer Science of McGill University, supervised by Prof. Yue Li. I received my bachelor's degree in electronic information engineering from Wuhan University, China. My research interests include Natural Language Processing (NLP) with external knowledge, and machine learning and NLP in healthcare.
My current research focuses on machine learning, specifically NLP, in healthcare. I led the construction of MeDAL, a large (14 million samples) medical abbreviation disambiguation dataset designed for pretraining NLP models for medical applications, which was published at EMNLP 2020 Clinical NLP. The code and data can be found here. I contributed to the development of a new type of topic model that addresses the multi-modality of Electronic Health Records, which was published in Nature Communications. In addition, I participated in the NeurIPS 2019 Reproducibility Challenge. I am currently a leading member of a project funded by the Canadian government in response to COVID-19, which aims to develop better news media surveillance tools for detecting and analyzing public health crises. This project is ongoing, and some preliminary results were published at ACM-BCB.
Before joining McGill, I was an R&D intern in the algorithm research & development department of Horizon Robotics, where I developed internal systems for large-scale evaluation of the quality of audio recordings based on audio and textual features. Prior to that, I was a research intern at the University of Toronto, working on quantitative analysis of fMRI data for clinical applications, supervised by Dr. Andrea Kassner of the University of Toronto and The Hospital for Sick Children.
I am a fan of Pink Floyd. I listen to jazz in the morning with coffee, post-rock or Pink Floyd when I am thinking (e.g. reading papers or coding), and shoegazing whenever I feel the need. In my free time, which I have not had much of lately, I (used to) play soccer and (more recently) try to learn to play the guitar.
MSc in Computer Science, 2021 (Expected)
McGill University
BEng in Electronic Information Engineering, 2019
Wuhan University, China
Empirical study of scenarios where deep NLP models are not preferable to classical models
Using a Kalman filter and a finite-state machine (FSM) to analyze accelerometer data for elevator movement monitoring
Speaker separation with an RNN in a monaural setting
Introducing a Gaussian Field Estimator to achieve robust registration of retinal images from different views and devices
Speech enhancement with a Kalman filter and Linear Predictive Coding in a noisy setting
Ablation study of a NeurIPS 2019 paper, submitted to the official challenge
A fourteen-million-article medical text dataset for medical NLP pretraining