Workshop: Machine Learning on HPC Systems (MLHPCS)


Impact of large-scale pre-training on intra- and inter-domain transfer learning in full and few-shot regimes


Transfer learning aims to exploit models pre-trained on large amounts of source data for reuse on a wide range of downstream target tasks and datasets, and it has been successfully employed to enable training with small amounts of target data. A recent line of work posits strong benefits for model generalization and transfer when model size, data size, and compute budget are increased during pre-training. It remains largely unclear, however, how the observed transfer improvement due to increased scale depends on the degree to which the source and target datasets are related to each other. We will review recent evidence on the impact of large-scale pre-training on full and few-shot transfer learning in intra- and inter-domain scenarios, motivating the need for systematic experiments that may deliver scaling laws for transfer performance as a function of model size, data size, compute budget, the composition of the large source dataset used for pre-training, and the degree of alignment between source and target datasets. Such experiments require vast compute resources and proper utilization of supercomputing facilities. As an outlook, we will introduce the COVIDNetX initiative, which aims to study large-scale intra- and inter-domain transfer learning in a specific use case where relevant patterns are detected in target medical imaging datasets that are much smaller than the large source data used for pre-training.

Speaker: Jenia Jitsev, Juelich Supercomputing Centre, Helmholtz AI, Research Center Juelich