Publications

Recognising Biomedical Names: Challenges and Solutions

Xiang Dai
PhD thesis, 2021, University of Sydney. arXiv.

C: Conference; J: Journal; W: Workshop; P: Preprint [8 + 1 + 6 + 1 = 16]

[W6] Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods

Xiang Dai and Sarvnaz Karimi
AACL-IJCNLP Workshop on Information Extraction from Scientific Publications (WIESP 2022). Talk, Poster.

[P1] An Exploration of Hierarchical Attention Transformers for Efficient Long Document Classification

Ilias Chalkidis, Xiang Dai, Manos Fergadiotis, Prodromos Malakasiotis and Desmond Elliott
arXiv, 2210.05529.

[C8] Revisiting Transformer-based Models for Long Document Classification

Xiang Dai, Ilias Chalkidis, Sune Darkner and Desmond Elliott
Findings of the Association for Computational Linguistics: EMNLP 2022.

[C7] mDAPT: Multilingual Domain Adaptive Pretraining in a Single Model

Rasmus Kær Jørgensen, Mareike Hartmann, Xiang Dai and Desmond Elliott
Findings of the Association for Computational Linguistics: EMNLP 2021. Bibtex.

[C6] SearchEHR: A Family History Search System for Clinical Decision Support

Xiang Dai, Maciej Rybinski and Sarvnaz Karimi
International Conference on Information and Knowledge Management (CIKM 2021).

[J1] Extracting Family History Information From Electronic Health Records: Natural Language Processing Analysis

Maciej Rybinski, Xiang Dai, Sonit Singh, Sarvnaz Karimi and Anthony Nguyen
JMIR Medical Informatics.

[C5] An Analysis of Simple Data Augmentation for Named Entity Recognition

Xiang Dai and Heike Adel
International Conference on Computational Linguistics (COLING 2020). Code, Bibtex, Poster.

[C4] Cost-effective Selection of Pretraining Data: A Case Study of Pretraining BERT on Social Media

Xiang Dai, Sarvnaz Karimi, Ben Hachey and Cecile Paris
Findings of the Association for Computational Linguistics: EMNLP 2020. Resources, Bibtex.

[W5] NLNDE at CANTEMIST: Neural Sequence Labeling and Parsing Approaches for Clinical Concept Extraction

Lukas Lange, Xiang Dai, Heike Adel and Jannik Strötgen
Iberian Languages Evaluation Forum (IberLEF 2020).

[C3] An Effective Transition-based Model for Discontinuous NER

Xiang Dai, Sarvnaz Karimi, Ben Hachey and Cecile Paris
The Annual Meeting of the Association for Computational Linguistics (ACL 2020). Code, Talk, Bibtex.

[C2] NNE: A Dataset for Nested Named Entity Recognition in English Newswire

Nicky Ringland, Xiang Dai, Ben Hachey, Sarvnaz Karimi, Cecile Paris and James R. Curran
The Annual Meeting of the Association for Computational Linguistics (ACL 2019). Resources, Poster, Bibtex.

[C1] Using Similarity Measures to Select Pretraining Data for NER

Xiang Dai, Sarvnaz Karimi, Ben Hachey and Cecile Paris
Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019). Slides, Poster, Resources, Code, Bibtex.

[W4] Shot Or Not: Comparison of NLP Approaches for Vaccination Behaviour Detection

Aditya Joshi, Xiang Dai, Sarvnaz Karimi, Ross Sparks, Cecile Paris and C Raina MacIntyre
EMNLP Workshop on Social Media Mining for Health Applications (SMM4H 2018). Bibtex.

[W3] Recognizing Complex Entity Mentions: A Review and Future Directions

Xiang Dai
ACL Student Research Workshop (ACL-SRW 2018). Poster, Bibtex.

[W2] Medication and Adverse Event Extraction from Noisy Text

Xiang Dai, Sarvnaz Karimi and Cecile Paris
Australasian Language Technology Association Workshop (ALTA 2017). Slides, Bibtex.

[W1] Automatic Diagnosis Coding of Radiology Reports: A Comparison of Deep Learning and Conventional Classification Methods

Sarvnaz Karimi, Xiang Dai, Hamed Hassanzadeh and Anthony Nguyen
ACL Workshop on Biomedical Natural Language Processing (BioNLP 2017). Poster, Bibtex.