Computer vision is the field of artificial intelligence that trains machines to interpret, analyze, and understand meaningful information from visual inputs such as digital images and videos. Computer vision models are often trained in a distributed fashion, using deep learning to handle a diverse range of data.
Conventional distributed training of deep learning models runs into problems with data heterogeneity, label deficiency at the edge, robustness, and data privacy. Federated learning is a distributed framework that learns a shared global model from multiple decentralized datasets on edge devices without exchanging the raw data.
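To make the idea of learning a shared model without exchanging data concrete, here is a minimal sketch of federated averaging (FedAvg), one common aggregation scheme; the text above does not name a specific algorithm, so this choice is an assumption. The toy 1-D linear model, the synthetic client data, and the helper names (`local_sgd`, `fed_avg`, `make_client`, `run_federated`) are all hypothetical and exist only for illustration.

```python
import random

def local_sgd(weights, data, lr=0.05, epochs=5):
    """Train a 1-D linear model y = w*x + b on one client's private data."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)

def fed_avg(client_weights, client_sizes):
    """Average client models, weighting each by its local sample count."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)

def make_client(rng, n, lo, hi):
    """Hypothetical client dataset: noisy samples of y = 2x + 1 on [lo, hi]."""
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.01)) for x in xs]

def run_federated(rounds=50, seed=0):
    rng = random.Random(seed)
    # Each client sees a different input range, so the data is heterogeneous.
    clients = [make_client(rng, 20, -1.0, 0.0),
               make_client(rng, 40, 0.0, 1.0),
               make_client(rng, 30, -0.5, 0.5)]
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        # Only model weights leave a client; the raw (x, y) pairs never do.
        updates = [local_sgd(global_weights, data) for data in clients]
        global_weights = fed_avg(updates, [len(d) for d in clients])
    return global_weights

if __name__ == "__main__":
    w, b = run_federated()
    print(f"global model: w={w:.3f}, b={b:.3f}")
```

In a real vision setting the `(w, b)` tuple would be replaced by the parameters of a CNN or Transformer, but the round structure (broadcast, local training, weighted averaging) stays the same.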
Federated learning makes it possible to carry out computer vision tasks effectively while keeping the data secure. Computer vision tasks to which federated learning has been applied include image classification, image segmentation, and object detection. Federated learning can perform these tasks while preserving data privacy, even when the data on the devices is not independent and identically distributed (non-IID).
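As an illustration of what non-IID data can look like in an image classification setting, the sketch below simulates label skew: sample indices are sorted by class, cut into shards, and each client receives only a few shards. This is one common way to emulate heterogeneous clients in experiments, not a method taken from the text above; the function name `shard_partition` and the toy label list are hypothetical.

```python
import random
from collections import Counter

def shard_partition(labels, num_clients, shards_per_client, seed=0):
    """Split sample indices into label-skewed (non-IID) client subsets.

    Indices are sorted by label and cut into equal-sized shards; each
    client receives a few shards, so it sees only a handful of classes.
    Samples left over after the equal split are dropped.
    """
    rng = random.Random(seed)
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    num_shards = num_clients * shards_per_client
    shard_size = len(labels) // num_shards
    shards = [order[s * shard_size:(s + 1) * shard_size]
              for s in range(num_shards)]
    rng.shuffle(shards)
    clients = []
    for c in range(num_clients):
        picked = shards[c * shards_per_client:(c + 1) * shards_per_client]
        clients.append([i for shard in picked for i in shard])
    return clients

if __name__ == "__main__":
    # Hypothetical toy dataset: 1000 samples spread evenly over 10 classes.
    labels = [i % 10 for i in range(1000)]
    clients = shard_partition(labels, num_clients=10, shards_per_client=2)
    for cid, idx in enumerate(clients):
        print(cid, dict(Counter(labels[i] for i in idx)))
```

With these toy settings each client ends up holding images from at most two classes, which is the kind of skew that makes federated training harder than training on a shuffled, centralized dataset.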
• Federated Learning (FL) is an art of trade-offs among many optimization objectives, such as model accuracy, personalization, efficiency, and privacy.
• It has great potential to enable many interesting computer vision (CV) applications and to provide a reliable benchmark framework and end-to-end solutions for real-world vision tasks.
• FL has rarely been demonstrated effectively in advanced computer vision tasks such as image classification, instance segmentation, and object detection.
• FL research has generally focused on distributed optimization methods evaluated on small-scale datasets.
• In contrast, the research trend in computer vision is toward large-scale supervised or self-supervised pre-training with efficient CNN or Transformer models, which substantially improves performance on various downstream tasks.
• Because diverse tasks remain largely unexplored, FL model performance in computer vision still lags far behind centralized training.