Following a series of successful works including ResNet and Faster R-CNN, Kaiming He and his team recently published a new paper demonstrating that masked autoencoders (MAE) are scalable self-supervised learners for computer vision.

In self-supervised learning, no label is provided and the learner obtains supervisory signals from the data itself, often leveraging the underlying structure in the data. The general technique of self-supervised learning is to predict any unobserved or hidden part (or property) of the input from any observed or unhidden part of the input.

In this paper, the MAE works in a similar way: During pre-training, a large random subset of image patches (e.g., 75%) is masked out. The encoder is applied to the small subset of visible patches. Mask tokens are introduced after the encoder, and the full set of encoded patches and mask tokens is processed by a small decoder that reconstructs the original image in pixels.
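
The masking step described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the function names, the `"[MASK]"` placeholder, the string stand-ins for encoder outputs, and the 224x224 image with 16x16 patches are all assumptions chosen for the example.

```python
import random


def random_masking(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into visible and masked sets by uniform random sampling."""
    rng = random.Random(seed)
    num_keep = int(num_patches * (1 - mask_ratio))
    shuffled = list(range(num_patches))
    rng.shuffle(shuffled)
    visible = sorted(shuffled[:num_keep])
    masked = sorted(shuffled[num_keep:])
    return visible, masked


def build_decoder_input(encoded_visible, visible, num_patches, mask_token="[MASK]"):
    """Put encoded visible patches back at their original positions and
    fill every other position with a shared mask token."""
    full = [mask_token] * num_patches
    for idx, enc in zip(visible, encoded_visible):
        full[idx] = enc
    return full


# Assumed setup: a 224x224 image cut into 16x16 patches gives 14 * 14 = 196 patches.
num_patches = (224 // 16) ** 2
visible, masked = random_masking(num_patches, mask_ratio=0.75, seed=0)

# Hypothetical stand-in for the encoder: it only ever sees the visible patches.
encoded = [f"enc({i})" for i in visible]

# The decoder receives the full-length sequence: encoded patches plus mask tokens.
decoder_input = build_decoder_input(encoded, visible, num_patches)
print(len(visible), len(masked), len(decoder_input))  # 49 147 196
```

With a 75% mask ratio the encoder processes only 49 of the 196 patches in this example, while the decoder works on the full sequence of 196 positions.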

After pre-training, the decoder is discarded, and the encoder is fine-tuned on a labelled dataset and evaluated on uncorrupted images (full sets of patches) for recognition tasks. The authors used a Vision Transformer (ViT) as the encoder. Compared with ViT-L trained from scratch, MAE’s training is significantly accelerated (by 3x or more) while accuracy improves.

The proposed MAE scales to large networks and datasets, which allows for learning high-capacity models that generalize well: e.g., a vanilla ViT-Huge model achieves the best accuracy (87.8%) among methods that use only ImageNet-1K data.



Scalable learners often require a large amount of computing power. Business Systems International (BSI) is the largest Nvidia GPU supplier in Europe, and we provide custom, complete AI and machine learning environments that enable the training of complex machine learning models such as the Transformer.

This article was provided by our AI researcher Bill Shao.



To learn more...

Our AI technology solutions can be viewed here and our AI inception programme here.

Get in touch to discover how we could optimise your business with AI.