Active machine learning builds highly accurate models by labeling data dynamically and incrementally, which reduces both the amount of training data required and the cost of labeling. Active learning aims to maximize the performance gain of the model while annotating as few samples as possible. Most classic active learning approaches assume centralized unlabeled data, in which all unlabeled samples are stored in one place, and they have difficulty handling distributed data. With advances in technology, however, massive amounts of data are now spread across machines and data centers all over the world.
Distributed systems are required to handle such large volumes of data through distributed storage and distributed processing. Distributed Active Learning (DAL) has emerged to process distributed data and iteratively select samples for labeling so as to obtain a highly accurate model. DAL comprises a distributed sample selection strategy and a distributed classification algorithm. Examples of distributed algorithms include distributed uncertainty sampling, proximity matrix construction, and distributed density weighting. Application areas of DAL include the Internet of Things (IoT), computer vision, healthcare, and many more. A recent development in DAL is AutoDAL, which combines distributed active learning with automatic hyperparameter selection. A future direction for DAL is the integration of reinforcement learning into active learning for distributed contexts.
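To make the idea of a distributed sample selection strategy more concrete, the following is a minimal sketch of one possible form of distributed uncertainty sampling. It is an illustration under stated assumptions, not the specific algorithm referred to above: uncertainty is assumed to be the Shannon entropy of the predicted class probabilities, the workers are simulated in a single process, and the names `local_candidates`, `select_global`, and `partitions` are hypothetical. Each worker scores only its own unlabeled samples and sends back its k most uncertain candidates, and a coordinator merges these short lists to choose the global queries.

```python
import heapq
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def local_candidates(worker_id, local_probs, k):
    """Run on one worker: score local unlabeled samples, keep the k most uncertain.

    Returns (entropy, worker_id, local_index) tuples, so only small
    summaries (not raw data) need to be sent to the coordinator.
    """
    scored = [(entropy(p), worker_id, i) for i, p in enumerate(local_probs)]
    return heapq.nlargest(k, scored)


def select_global(partitions, k):
    """Coordinator: merge per-worker candidates and pick the global top k.

    Every globally top-k sample is also in its own worker's local top k,
    so merging the local lists loses nothing.
    """
    merged = []
    for worker_id, local_probs in partitions.items():
        merged.extend(local_candidates(worker_id, local_probs, k))
    top = heapq.nlargest(k, merged)
    return [(worker_id, idx) for _, worker_id, idx in top]


# Hypothetical predicted class probabilities held by two workers.
partitions = {
    "worker_a": [[0.9, 0.1], [0.5, 0.5], [0.8, 0.2]],
    "worker_b": [[0.6, 0.4], [0.99, 0.01]],
}
queries = select_global(partitions, k=2)
print(queries)
```

With these example probabilities, the two selected queries are sample 1 on `worker_a` and sample 0 on `worker_b`, the two predictions closest to a uniform distribution. In a real deployment the per-worker step would run on separate machines, and the probabilities would come from the distributed classifier rather than from fixed lists.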