BiT Medium - Big Transfer, General Visual Representation Learning (Medium)

pip install vectorhub[encoders-image-tfhub]

Details

Release date: 2019-12-24

Vector length: 2048 (default)

Repo: https://github.com/google-research/big_transfer

Paper: https://arxiv.org/abs/1912.11370

Example

#pip install vectorhub[encoders-image-tfhub]
from vectorhub.encoders.image.tfhub import BitMedium2Vec
model = BitMedium2Vec()
sample = model.read('https://getvectorai.com/assets/hub-logo-with-text.png')
model.encode(sample)
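
As a rough illustration of what the encoded vectors are for, the sketch below compares two vectors with cosine similarity. It assumes `model.encode` returns a flat sequence of floats (2048 values by default, per the details above). `cosine_similarity` is a hypothetical helper, not part of vectorhub, and the short toy vectors stand in for real encodings.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy stand-ins for encoded images (assumption: real vectors would come from model.encode).
vec_a = [1.0, 0.0, 2.0]
vec_b = [2.0, 0.0, 4.0]   # same direction as vec_a
vec_c = [0.0, 3.0, 0.0]   # orthogonal to vec_a

print(cosine_similarity(vec_a, vec_b))
print(cosine_similarity(vec_a, vec_c))
```

With real encodings, you would pass the outputs of two `model.encode(...)` calls in place of the toy vectors.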

Index and search vectors

Index and search your vectors easily on the cloud using 1 line of code!

username = '<your username>'
email = '<your email>'
# Request an api_key by providing your username and email.
api_key = model.request_api_key(username, email)

# Index in 1 line of code
items = [
    'https://getvectorai.com/_nuxt/img/rabbit.4a65d99.png',
    'https://getvectorai.com/_nuxt/img/dog-2.b8b4cef.png',
    'https://getvectorai.com/_nuxt/img/dog-1.3cc5fe1.png',
]
model.add_documents(username, api_key, items)

# Search in 1 line of code and get the most similar results.
model.search('https://getvectorai.com/_nuxt/img/dog-1.3cc5fe1.png')

# Add metadata to your search
metadata = [{'animal': 'rabbit', 'hat': 'no'}, {'animal': 'dog', 'hat': 'yes'}, {'animal': 'dog', 'hat': 'yes'}]
model.add_documents(username, api_key, items, metadata=metadata)
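
The example above supplies one metadata dictionary per item, in the same order. A minimal sketch of a check you could run before indexing is shown below. It assumes metadata is matched to items by position, which the example suggests but this page does not state; `pair_items_with_metadata` is a hypothetical helper, not part of vectorhub.

```python
def pair_items_with_metadata(items, metadata):
    """Pair each item with its metadata dict, failing early on a mismatch."""
    if len(items) != len(metadata):
        raise ValueError(
            f"got {len(items)} items but {len(metadata)} metadata entries"
        )
    for position, entry in enumerate(metadata):
        if not isinstance(entry, dict):
            raise TypeError(f"metadata entry {position} is not a dict")
    return list(zip(items, metadata))

# Same shape as the example above: three items, three metadata dicts.
items = ['rabbit.png', 'dog-2.png', 'dog-1.png']  # placeholder names
metadata = [
    {'animal': 'rabbit', 'hat': 'no'},
    {'animal': 'dog', 'hat': 'yes'},
    {'animal': 'dog', 'hat': 'yes'},
]
pairs = pair_items_with_metadata(items, metadata)
print(len(pairs))
```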

Description

Transfer of pre-trained representations improves sample efficiency and simplifies hyperparameter tuning when training deep neural networks for vision. We revisit the paradigm of pre-training on large supervised datasets and fine-tuning the model on a target task. We scale up pre-training, and propose a simple recipe that we call Big Transfer (BiT). By combining a few carefully selected components, and transferring using a simple heuristic, we achieve strong performance on over 20 datasets. BiT performs well across a surprisingly wide range of data regimes -- from 1 example per class to 1M total examples. BiT achieves 87.5% top-1 accuracy on ILSVRC-2012, 99.4% on CIFAR-10, and 76.3% on the 19 task Visual Task Adaptation Benchmark (VTAB). On small datasets, BiT attains 76.8% on ILSVRC-2012 with 10 examples per class, and 97.0% on CIFAR-10 with 10 examples per class. We conduct detailed analysis of the main components that lead to high transfer performance.

Working in Colab

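If you are using this model in Colab and want to cache the downloaded model so it does not have to be downloaded again each session, set the following before creating the model:

import os
# Cache downloaded TF Hub models on Google Drive (assumes Drive is mounted and reachable at drive/MyDrive/).
os.environ['TFHUB_CACHE_DIR'] = "drive/MyDrive/"
os.environ["TFHUB_MODEL_LOAD_FORMAT"] = "COMPRESSED"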
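
The same setup can be wrapped in a small function that also makes sure the cache directory exists. This is only a sketch: `configure_tfhub_cache` is a hypothetical helper, and the demo uses a temporary directory so it runs anywhere. In Colab you would pass a path on your mounted Drive instead.

```python
import os
import tempfile

def configure_tfhub_cache(cache_dir):
    """Create cache_dir if needed and point TF Hub's cache at it."""
    os.makedirs(cache_dir, exist_ok=True)
    os.environ["TFHUB_CACHE_DIR"] = cache_dir
    os.environ["TFHUB_MODEL_LOAD_FORMAT"] = "COMPRESSED"
    return os.path.abspath(cache_dir)

# Demo path in the system temp directory (assumption: replace with a Drive path in Colab).
demo_dir = os.path.join(tempfile.gettempdir(), "tfhub_cache_demo")
print(configure_tfhub_cache(demo_dir))
```

Call the helper before instantiating the model so the environment variables are already set when the model is loaded.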