*Memos:
- My post explains MNIST, EMNIST, QMNIST, ETLCDB, Kuzushiji and Moving MNIST.
- My post explains Fashion-MNIST, Caltech 101, Caltech 256, CelebA, CIFAR-10 and CIFAR-100.
- My post explains Oxford-IIIT Pet, Oxford 102 Flower, Stanford Cars, Places365, Flickr8k and Flickr30k.
- My post explains ImageNet, LSUN and MS COCO.
- My post explains Image Classification(Recognition), Object Localization, Object Detection and Image Segmentation.
- My post explains Keypoint Detection(Landmark Detection), Image Matching, Object Tracking, Stereo Matching, Video Prediction, Optical Flow and Image Captioning.
(1) PASCAL VOC(Pattern Analysis, Statistical Modelling, and Computational Learning Visual Object Classes)(2005):
- has object images and annotations with 4, 10 or 20 classes depending on the year. There are 8 datasets, VOC2005, VOC2006, VOC2007, VOC2008, VOC2009, VOC2010, VOC2011 and VOC2012:
*Memos:
- VOC2005 has 2,232 images and annotations(some for train, some for validation and some for test) with 4 classes.
- VOC2006 has 5,304 images and annotations(1,277 for train, 1,341 for validation and 2,686 for test) with 10 classes.
- VOC2007 has 9,963 images and annotations(2,501 for train, 2,510 for validation and 4,952 for test) with 20 classes.
- VOC2008 has 5,096 images and annotations(2,111 for train, 2,221 for validation and 764 as extra) with 20 classes. *It also contains 4,133 test images, which are not counted in the 5,096.
- VOC2009 has 7,818 images and annotations(3,473 for train, 3,581 for validation and 764 as extra) with 20 classes.
- VOC2010 has 11,321 images and annotations(4,998 for train, 5,105 for validation and 1,218 as extra) with 20 classes.
- VOC2011 has 14,961 images and annotations(5,717 for train, 5,823 for validation and 3,421 as extra) with 20 classes.
- VOC2012 has 17,125 images and annotations(5,717 for train, 5,823 for validation and 5,585 as extra) with 20 classes.
- is VOCSegmentation() and VOCDetection() in PyTorch(torchvision).
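The split counts listed above can be sanity-checked with a few lines of Python, and the same block sketches how the two torchvision classes might be loaded. This is only a minimal sketch: `VOC_SPLITS`, `total_images` and `load_voc` are illustrative names that are not part of torchvision, VOC2005 is left out because its split sizes are not given above, and the `year` and `image_set` defaults assume torchvision's documented options (years "2007" to "2012", and "train", "trainval" or "val", plus "test" for 2007).

```python
# Image counts per split, copied from the memos above (year -> split -> count).
VOC_SPLITS = {
    "2006": {"train": 1277, "val": 1341, "test": 2686},
    "2007": {"train": 2501, "val": 2510, "test": 4952},
    "2008": {"train": 2111, "val": 2221, "extra": 764},
    "2009": {"train": 3473, "val": 3581, "extra": 764},
    "2010": {"train": 4998, "val": 5105, "extra": 1218},
    "2011": {"train": 5717, "val": 5823, "extra": 3421},
    "2012": {"train": 5717, "val": 5823, "extra": 5585},
}


def total_images(year):
    """Sum the split counts listed for one VOC year."""
    return sum(VOC_SPLITS[year].values())


def load_voc(root, task="detection", year="2012", image_set="train", download=False):
    """Hypothetical helper: build a VOCDetection or VOCSegmentation dataset."""
    # Imported lazily so the count helpers above work without torchvision.
    from torchvision.datasets import VOCDetection, VOCSegmentation

    dataset_class = VOCDetection if task == "detection" else VOCSegmentation
    return dataset_class(root=root, year=year, image_set=image_set, download=download)
```

For example, `total_images("2012")` adds up to 17,125, which matches the total stated above, and `load_voc("data", task="segmentation")` should give (image, mask) pairs once the VOC2012 files are present under the assumed `data` folder.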
(2) SUN Database(Scene UNderstanding database)(2010):
- has 108,754 scene images with 397 classes.
- is also called SUN397.
- is SUN397() in PyTorch(torchvision).
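SUN397() in torchvision does not appear to take a split argument, so one common workaround is to split the dataset yourself. The sketch below shows one way to do that with `torch.utils.data.random_split`. The helper names `split_lengths` and `load_sun397_splits`, the 80/20 ratio and the seed are all assumptions made for illustration.

```python
def split_lengths(total, fractions):
    """Turn fractions into integer split sizes that sum exactly to total."""
    lengths = [int(total * fraction) for fraction in fractions]
    # Give any rounding remainder to the last split.
    lengths[-1] = total - sum(lengths[:-1])
    return lengths


def load_sun397_splits(root, fractions=(0.8, 0.2), seed=0, download=False):
    """Hypothetical helper: load SUN397 and split it randomly but reproducibly."""
    # Imported lazily so split_lengths() works without PyTorch installed.
    import torch
    from torch.utils.data import random_split
    from torchvision.datasets import SUN397

    dataset = SUN397(root=root, download=download)
    lengths = split_lengths(len(dataset), fractions)
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, lengths, generator=generator)
```

With the 108,754 images mentioned above, `split_lengths(108754, (0.8, 0.2))` gives 87,003 images for train and 21,751 for validation.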
(3) Kinetics Dataset(2017):
- has short video clips of human actions. There are 3 datasets, Kinetics-400, Kinetics-600 and Kinetics-700:
*Memos:
- Each video clip lasts around 10 seconds.
- Kinetics-400(2017) has 306,245 video clips, each labeled with one of 400 categories(classes).
- Kinetics-600(2018) has 495,547 video clips, each labeled with one of 600 categories.
- Kinetics-700(2019) has 545,317 video clips, each labeled with one of 700 categories.
- is used for Video Classification.
- is Kinetics() in PyTorch(torchvision).
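Kinetics() in torchvision cuts each video into fixed-length clips of `frames_per_clip` frames, taken every `step_between_clips` frames. The sketch below estimates how many such clips one video of around 10 seconds yields and shows how the dataset might be constructed. The helper names are hypothetical, the sliding-window formula is my reading of how torchvision samples clips, and the 25 fps frame rate used in the example is an assumption that does not come from the dataset description.

```python
def clips_per_video(num_frames, frames_per_clip, step_between_clips=1):
    """Count the sliding windows of frames_per_clip frames in one video."""
    if num_frames < frames_per_clip:
        return 0
    return (num_frames - frames_per_clip) // step_between_clips + 1


def load_kinetics(root, num_classes="400", split="train",
                  frames_per_clip=16, step_between_clips=16):
    """Hypothetical helper: build a torchvision Kinetics dataset."""
    # Imported lazily so clips_per_video() works without torchvision installed.
    from torchvision.datasets import Kinetics

    return Kinetics(
        root=root,
        frames_per_clip=frames_per_clip,
        num_classes=num_classes,  # "400", "600" or "700"
        split=split,
        step_between_clips=step_between_clips,
    )
```

For a 10-second video at an assumed 25 fps (250 frames), `clips_per_video(250, 16, 16)` gives 15 non-overlapping clips, while a step of 1 gives 235 overlapping ones. Each item of the dataset should be a (video, audio, label) tuple.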
(4) Cityscapes(2016):
- has 25,000 annotated urban street scene images for semantic understanding, with 30 classes grouped into 8 categories. *5,000 images are finely annotated and 20,000 images are coarsely annotated.
- is used for Image Segmentation.
- is Cityscapes() in PyTorch(torchvision). *How to set up the dataset isn't explained in the documentation.
Fine-annotated images:
Coarse-annotated images:
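Since setting up Cityscapes() is not documented, the sketch below shows the folder layout that torchvision seems to expect, based on my reading of its source: the images under `leftImg8bit/<split>` and the annotations under `gtFine/<split>` or `gtCoarse/<split>`, all inside `root`. The class has no download option, so the archives have to be fetched manually from the Cityscapes website first. `VALID_SPLITS`, `expected_dirs` and `load_cityscapes` are illustrative names, not torchvision API.

```python
import os

# Splits that torchvision appears to accept for each annotation mode.
VALID_SPLITS = {
    "fine": ("train", "test", "val"),
    "coarse": ("train", "train_extra", "val"),
}


def expected_dirs(root, split="train", mode="fine"):
    """Return the (images, annotations) folders for a given split and mode."""
    if mode not in VALID_SPLITS:
        raise ValueError(f"mode must be one of {tuple(VALID_SPLITS)}")
    if split not in VALID_SPLITS[mode]:
        raise ValueError(f"split must be one of {VALID_SPLITS[mode]} for mode={mode!r}")
    annotation_dir = "gtFine" if mode == "fine" else "gtCoarse"
    return (
        os.path.join(root, "leftImg8bit", split),
        os.path.join(root, annotation_dir, split),
    )


def load_cityscapes(root, split="train", mode="fine", target_type="semantic"):
    """Hypothetical helper: check the layout, then build the dataset."""
    missing = [d for d in expected_dirs(root, split, mode) if not os.path.isdir(d)]
    if missing:
        raise FileNotFoundError(f"Extract the Cityscapes archives first: {missing}")
    # Imported lazily so the layout check works without torchvision installed.
    from torchvision.datasets import Cityscapes

    return Cityscapes(root=root, split=split, mode=mode, target_type=target_type)
```

For example, `expected_dirs("data", "val", "coarse")` points at `data/leftImg8bit/val` and `data/gtCoarse/val`, so those are the folders to create when extracting the coarse-annotated archives into an assumed `data` root.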