A standardized benchmark for Out-of-Distribution Detection

OpenOOD aims to provide accurate, standardized, and unified evaluation of OOD detection. More than 100 works on OOD detection have appeared in the past 6 years, but it is still unclear which approaches really work, since the evaluation setup is highly inconsistent from paper to paper. OpenOOD currently provides 6 benchmarks for OOD detection in the context of image classification (4 for the standard setting and 2 for the full-spectrum setting) and evaluates 40 advanced methods within our framework. We expect OpenOOD to foster collective efforts in the community towards advancing the state of the art in OOD detection.

News:

  • June 2023: This leaderboard is officially online, along with the v1.5 release of OpenOOD which emphasizes large-scale and full-spectrum OOD detection. Check out our technical report and detailed changelog.

Up-to-date leaderboard based on
35+ methods and their combinations

Carefully designed benchmarks
of various sizes and settings

Light-weight Evaluator


Check out our Colab tutorial.
# !pip install git+https://github.com/Jingkang50/OpenOOD.git
from openood.evaluation_api import Evaluator
from openood.networks import ResNet50
from torchvision.models import ResNet50_Weights
from torch.hub import load_state_dict_from_url

# Load an ImageNet-pretrained model from torchvision
net = ResNet50()
weights = ResNet50_Weights.IMAGENET1K_V1
net.load_state_dict(load_state_dict_from_url(weights.url))
preprocessor = weights.transforms()
net.eval()
net.cuda()

# Initialize an evaluator and evaluate
evaluator = Evaluator(net, id_name='imagenet', 
    preprocessor=preprocessor, postprocessor_name='msp')
metrics = evaluator.eval_ood()
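
The `msp` postprocessor name above conventionally refers to the maximum softmax probability baseline: each input is scored by the largest softmax probability the classifier assigns, and a higher score suggests an in-distribution (ID) sample. The sketch below illustrates that idea in plain Python, together with AUROC, a metric commonly reported for OOD detection. It is an illustration only, not OpenOOD's implementation; the function names `msp_score` and `auroc` and the toy logits are hypothetical.

```python
import math


def msp_score(logits):
    """Return the maximum softmax probability for one sample's logits."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    return max(exps) / sum(exps)


def auroc(id_scores, ood_scores):
    """Probability that a random ID sample outscores a random OOD sample.

    Ties count as half a win. This pairwise form is O(n * m) and is meant
    for illustration, not for large test sets.
    """
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


# Toy logits, made up for illustration: confident predictions for ID
# samples, nearly flat predictions for OOD samples.
id_logits = [[6.0, 0.5, 0.2], [0.1, 4.0, 0.3]]
ood_logits = [[1.0, 1.1, 0.9], [0.4, 0.2, 0.5]]

id_scores = [msp_score(z) for z in id_logits]
ood_scores = [msp_score(z) for z in ood_logits]
print(auroc(id_scores, ood_scores))  # prints 1.0
```

In this toy case every ID sample receives a higher score than every OOD sample, so the AUROC is 1.0; real detectors overlap, and the leaderboard numbers reflect how much.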

Analysis


Check out our paper with detailed analyses.
[Figure: full-spectrum results]
Available Leaderboards

  • CIFAR-10
  • CIFAR-100
  • ImageNet-200
  • ImageNet-200 (full-spectrum)
  • ImageNet-1K
  • ImageNet-1K (full-spectrum)

Leaderboard: CIFAR-10

Leaderboard: CIFAR-100

Leaderboard: ImageNet-200

Leaderboard: ImageNet-200 (full-spectrum)

Leaderboard: ImageNet-1K

Leaderboard: ImageNet-1K (full-spectrum)

FAQ

➤ What are the differences between OpenOOD v1.5 and v1.0? 🤔
OpenOOD v1.5 extends v1.0 by 1) adding large-scale experimental results on ImageNet, 2) studying full-spectrum detection, and 3) introducing new features such as this leaderboard and the new evaluator. The leaderboard is therefore specific to the v1.5 release. Please also see the detailed changelog here.

Citation

Consider citing our papers if you reference our leaderboard or use our implementations in your research:
@article{zhang2023openood,
    title={OpenOOD v1.5: Enhanced Benchmark for Out-of-Distribution Detection},
    author={Zhang, Jingyang and Yang, Jingkang and Wang, Pengyun and Wang, Haoqi and Lin, Yueqian and Zhang, Haoran and Sun, Yiyou and Du, Xuefeng and Zhou, Kaiyang and Zhang, Wayne and Li, Yixuan and Liu, Ziwei and Chen, Yiran and Li, Hai},
    journal={arXiv preprint arXiv:2306.09301},
    year={2023},
}
@inproceedings{yang2022openood,
    title={Open{OOD}: Benchmarking Generalized Out-of-Distribution Detection},
    author={Jingkang Yang and Pengyun Wang and Dejian Zou and Zitang Zhou and Kunyuan Ding and WenXuan Peng and Haoqi Wang and Guangyao Chen and Bo Li and Yiyou Sun and Xuefeng Du and Kaiyang Zhou and Wayne Zhang and Dan Hendrycks and Yixuan Li and Ziwei Liu},
    booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
    year={2022},
    url={https://openreview.net/forum?id=gT6j4_tskUt}
}

Contribute to OpenOOD!


We welcome contributions of both new methods and new application scenarios. Please check here for more details. Feel free to leave a message by opening either an issue or a discussion in our GitHub repo.

Maintainers


  • Jingyang Zhang
  • Jingkang Yang
  • Pengyun Wang