Transformer-based Biomedical Pretrained Language Models List

This information is from the survey paper “AMMU - A Survey of Transformer-based Biomedical Pretrained Language Models”. The survey, written by Kalyan et al., introduces a new taxonomy for transformer-based biomedical pretrained language models (T-BPLMs). Below is the list of transformer-based BPLMs, with links to the paper and the pretrained model for each. For detailed information, please refer to the survey paper. For a list of transformer-based pretrained language models across all domains, please refer to the models list.

EHR text-based T-BPLMs

These models are pretrained on EHR text data.

| EHR text-based T-BPLM | Paper and Model Link |
| --- | --- |
| BioClinicalBERT | Paper Model |
| BioClinicalBERT (discharge) | Paper Model |
| MIMIC-BERT | Paper Model |
| ClinicalXLNet (nursing) | Paper Model |
| ClinicalXLNet (discharge) | Paper Model |
| ClinicalBERT | Paper Model |
| BERT-MIMIC | Paper Model |
| XLNet-MIMIC | Paper Model |
| RoBERTa-MIMIC | Paper Model |
| ELECTRA-MIMIC | Paper Model |
| ALBERT-MIMIC | Paper Model |
| DeBERTa-MIMIC | Paper Model |
| Longformer-MIMIC | Paper Model |
| MedBERT | Paper |
| BEHRT | Paper |
| BERT-EHR | Paper |
| AlphaBERT | Paper |

Radiology reports-based T-BPLMs

These models are pretrained on radiology reports text data.

| Radiology reports-based T-BPLM | Paper and Model Link |
| --- | --- |
| RadBERT | Paper |
| FS-BERT | Paper |
| RAD-BERT | Paper |

Social Media text-based T-BPLMs

These models are pretrained on health-related social media text.

| Social Media text-based T-BPLM | Paper and Model Link |
| --- | --- |
| CT-BERT | Paper Model |
| BioRedditBERT | Paper Model |
| RuDR-BERT | Paper Model |
| EnRuDR-BERT | Paper Model |
| EnDR-BERT | Paper Model |

Scientific Literature-based T-BPLMs

These models are pretrained on biomedical literature.

| Scientific Literature-based T-BPLM | Paper and Model Link |
| --- | --- |
| BioBERT | Paper Model |
| RoBERTa-base-PM | Paper Model |
| RoBERTa-base-PM-Voc | Paper Model |
| PubMedBERT (Abstract) | Paper Model |
| PubMedBERT (Abstract + Fulltext) | Paper Model |
| BioELECTRA | Paper Model |
| BioELECTRA++ | Paper Model |
| OuBioBERT | Paper Model |
| BlueBERT-PM | Paper Model |
| BioMedBERT | Paper |
| ELECTRAMed | Paper Model |
| BioELECTRA-P | Paper Model |
| BioELECTRA-PM | Paper Model |
| BioALBERT-P | Paper Model |
| BioALBERT-PM | Paper Model |

Hybrid corpora-based T-BPLMs

These models are pretrained on hybrid corpora that combine text from different sources.

| Hybrid corpora-based T-BPLM | Paper and Model Link |
| --- | --- |
| BlueBERT-PM-M3 | Paper Model |
| RoBERTa-base-PM-M3 | Paper Model |
| RoBERTa-base-PM-M3-Voc | Paper Model |
| BioBERTpt-all | Paper Model |
| BioCharBERT | Paper Model |
| BERT (sP+W+BC) | Paper Model |
| AraBioBERT | Paper |
| SciBERT | Paper Model |
| SciFive-P | Paper Model |
| SciFive-PM | Paper Model |
| BioALBERT-P-M3 | Paper Model |
| BioALBERT-PM-M3 | Paper Model |

Ontology Knowledge Injected T-BPLMs

These models are obtained by further pretraining existing pretrained models on ontology data.

| Ontology Knowledge Injected T-BPLM | Paper and Model Link |
| --- | --- |
| Clinical Kb-BERT | Paper Model |
| Clinical Kb-ALBERT | Paper Model |
| UmlsBERT | Paper Model |
| CoderBERT | Paper Model |
| CoderBERT-ALL | Paper Model |
| SapBERT | Paper Model |
| SapBERT-XLMR | Paper Model |
| KeBioLM | Paper Model |

Language-specific T-BPLMs

| Language-specific T-BPLM | Language | Paper and Model Link |
| --- | --- | --- |
| BERT (jpCR+jpW) | Japanese | Paper |
| BioBERTpt-bio | Portuguese | Paper Model |
| BioBERTpt-clin | Portuguese | Paper Model |
| BioBERTpt-all | Portuguese | Paper Model |
| RuDR-BERT | Russian | Paper Model |
| EnRuDR-BERT | Russian | Paper Model |
| FS-BERT | German | Paper |
| RAD-BERT | German | Paper |
| CHMBERT | Chinese | Paper |
| SpanishBERT | Spanish | Paper |
| AraBioBERT | Arabic | Paper |
| CamemBioBERT | French | Paper |
| MC-BERT | Chinese | Paper Model |
| UTH-BERT | Japanese | Paper Model |
| SINA-BERT | Persian | Paper |
| mBERT-Galen | Spanish | Paper Model |
| BETO-Galen | Spanish | Paper Model |
| XLM-R-Galen | Spanish | Paper Model |

Green T-BPLMs

| Green T-BPLM | Paper and Model Link |
| --- | --- |
| GreenBioBERT | Paper |
| exBERT | Paper |
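For readers who want to work with this taxonomy programmatically, the categories above can be collected into a small Python mapping. This is a minimal sketch: the category keys follow the section headings, and only a few representative models per category are included here.

```python
# Sketch of the survey's taxonomy as a lookup table.
# Category names follow the section headings above; only a few
# representative models per category are listed for brevity.
T_BPLM_TAXONOMY = {
    "EHR text": ["BioClinicalBERT", "MIMIC-BERT", "ClinicalBERT", "BEHRT"],
    "Radiology reports": ["RadBERT", "FS-BERT", "RAD-BERT"],
    "Social media text": ["CT-BERT", "BioRedditBERT", "RuDR-BERT"],
    "Scientific literature": ["BioBERT", "PubMedBERT (Abstract)", "BioELECTRA"],
    "Hybrid corpora": ["BlueBERT-PM-M3", "SciBERT", "SciFive-PM"],
    "Ontology knowledge injected": ["UmlsBERT", "SapBERT", "KeBioLM"],
    "Language-specific": ["BioBERTpt-all", "UTH-BERT", "MC-BERT"],
    "Green": ["GreenBioBERT", "exBERT"],
}


def categories_of(model_name: str) -> list[str]:
    """Return all taxonomy categories that list the given model."""
    return [
        category
        for category, models in T_BPLM_TAXONOMY.items()
        if model_name in models
    ]
```

For example, `categories_of("SciBERT")` returns `["Hybrid corpora"]`.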