An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models - 2021

Research Area: Machine Learning

Abstract:

The performance of fine-tuning pre-trained language models largely depends on the hyperparameter configuration. In this paper, we investigate the performance of modern hyperparameter optimization (HPO) methods on fine-tuning pre-trained language models. First, we study and report the performance of three HPO algorithms on fine-tuning two state-of-the-art language models on the GLUE dataset. We find that, using the same time budget, HPO often fails to outperform grid search for two reasons: insufficient time budget and overfitting. We propose two general strategies and an experimental procedure to systematically troubleshoot HPO's failure cases. By applying the procedure, we observe that HPO can succeed with more appropriate settings in the search space and time budget; however, in certain cases overfitting remains.

Author(s) Name: Xueqing Liu, Chi Wang

Journal name: Computer Science

Publisher name: arXiv:2106.09204

DOI: 10.48550/arXiv.2106.09204

Paper Link: https://arxiv.org/abs/2106.09204
