Frozen Feature Augmentation
for Few-Shot Image Classification

1Google DeepMind     2Technische Universität Braunschweig
*Andreas did the work during an internship at Google DeepMind.    Manoj led the project.

IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2024

Abstract

Training a linear classifier or lightweight model on top of pretrained vision model outputs, so-called 'frozen features', leads to impressive performance on a number of downstream few-shot tasks. Currently, frozen features are not modified during training. On the other hand, when networks are trained directly on images, data augmentation is a standard recipe that improves performance with no substantial overhead. In this paper, we conduct an extensive pilot study on few-shot image classification that explores applying data augmentations in the frozen feature space, dubbed 'frozen feature augmentation (FroFA)', covering twenty augmentations in total. Our study demonstrates that adopting deceptively simple point-wise FroFAs, such as brightness, can improve few-shot performance consistently across three network architectures, three large pretraining datasets, and eight transfer datasets.
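To make the frozen-feature setup concrete, below is a minimal JAX sketch of training a lightweight head on cached backbone outputs. It is not the authors' code: the head parameterization, the choice of optax as optimizer library, and all hyperparameters are illustrative placeholders.

```python
import jax
import jax.numpy as jnp
import optax  # assumed optimizer library; not specified by the paper

def init_head(key, dim, num_classes):
    # Lightweight classification head on top of cached frozen features.
    return {"w": 0.01 * jax.random.normal(key, (dim, num_classes)),
            "b": jnp.zeros((num_classes,))}

def loss_fn(params, feats, labels):
    logits = feats @ params["w"] + params["b"]
    return optax.softmax_cross_entropy_with_integer_labels(logits, labels).mean()

tx = optax.sgd(learning_rate=0.01)  # placeholder optimizer and learning rate

@jax.jit
def train_step(params, opt_state, feats, labels):
    # The backbone never appears here: features are cached once up front,
    # and only the head receives gradients.
    grads = jax.grad(loss_fn)(params, feats, labels)
    updates, opt_state = tx.update(grads, opt_state)
    return optax.apply_updates(params, updates), opt_state

# Usage: params = init_head(key, dim, num_classes); opt_state = tx.init(params)
```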

Method Overview

Investigating 18+2 Frozen Feature Augmentations (FroFAs)

We first investigate eighteen standard image augmentations in a frozen-feature setup using an L/16 ViT pretrained on JFT-3B. In these experiments, we cache features on few-shotted ILSVRC-2012 and train a lightweight model on top of augmented frozen features. We identify three strong augmentations: brightness, contrast, and posterize. Results for the two additional augmentations are provided in the supplementary material.
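As an illustration of what a point-wise FroFA could look like, the sketch below applies brightness- and contrast-style transforms directly to a cached feature tensor at each training step. The additive and multiplicative forms and the sampling ranges are our assumptions, not the paper's exact definitions.

```python
import jax
import jax.numpy as jnp

def brightness_frofa(key, feats, max_delta=0.3):
    # Point-wise brightness: one additive offset shared across the whole
    # feature tensor (max_delta is an assumed hyperparameter).
    delta = jax.random.uniform(key, (), minval=-max_delta, maxval=max_delta)
    return feats + delta

def contrast_frofa(key, feats, max_factor=0.3):
    # Point-wise contrast analogue: rescale features around their mean
    # by one randomly drawn factor.
    factor = 1.0 + jax.random.uniform(key, (), minval=-max_factor, maxval=max_factor)
    mean = feats.mean()
    return mean + factor * (feats - mean)
```

Since the features are cached once, the per-step cost of such an augmentation is just an element-wise operation on the feature tensor, with no extra backbone forward pass.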

A Closer Look at Brightness, Contrast, and Posterize FroFAs

We apply brightness, contrast, and posterize in a feature-wise manner (c or c2). Again, we use an L/16 ViT pretrained on JFT-3B, cache features on few-shotted ILSVRC-2012, and train a lightweight model on top of augmented frozen features. We identify the brightness c2FroFA as the best-performing FroFA.
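A rough sketch of how the feature-wise variants might be realized for brightness, assuming features of shape (..., channels). The exact parameter sharing behind the c and c2 variants is our reading of the naming, so treat both functions as hypothetical and see the paper for the precise scheme.

```python
import jax
import jax.numpy as jnp

def brightness_c_frofa(key, feats, max_delta=0.3):
    # 'c' variant (assumed): an independent brightness offset per feature
    # channel, shared across all tokens; feats has shape (..., channels).
    delta = jax.random.uniform(key, (feats.shape[-1],),
                               minval=-max_delta, maxval=max_delta)
    return feats + delta  # broadcasts over the leading token axes

def brightness_c2_frofa(key, feats, max_delta=0.3):
    # 'c2' variant (assumed): an independent offset per token *and* channel,
    # i.e. the finest-grained sampling we can read into the name.
    delta = jax.random.uniform(key, feats.shape,
                               minval=-max_delta, maxval=max_delta)
    return feats + delta
```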

Our Best-Performing FroFA Yields Strong Results Across Architectures, Pretraining Datasets, and Few-Shot Datasets

BibTeX


@InProceedings{Baer2024,
  author    = {Andreas B\"ar and Neil Houlsby and Mostafa Dehghani and Manoj Kumar},
  booktitle = {Proc.\ of CVPR},
  title     = {{Frozen Feature Augmentation for Few-Shot Image Classification}},
  month     = jun,
  year      = {2024},
  address   = {Seattle, WA, USA},
}