Find Data
1. TensorFlow Datasets
# Install: pip install tensorflow tensorflow-datasets
import tensorflow as tf  # needed for the tf.data.Dataset check below
import tensorflow_datasets as tfds
mnist_data = tfds.load("mnist")
mnist_train, mnist_test = mnist_data["train"], mnist_data["test"]
assert isinstance(mnist_train, tf.data.Dataset)
2. Data For Everyone
3. Autonomous Driving Dataset
A2D2 (the Audi Autonomous Driving Dataset) is around 2.3 TB in total. To break the download into smaller packages, it is split by annotation type (semantic segmentation, 3D bounding box). Each annotated split is packaged as a single tar file, while the remaining unlabelled sequence data is spread across multiple tar files.
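Because each split arrives as a large tar file, it can help to inspect an archive and pull out only the files you need instead of unpacking everything. Below is a minimal sketch using Python's standard `tarfile` module. The archive name `demo_split.tar` and the `frame_*.png` / `frame_*.json` file names are made-up stand-ins, not the real A2D2 layout, and the helper names `list_members`, `extract_matching`, and `make_demo_archive` are hypothetical.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def list_members(archive_path, limit=10):
    """Return up to `limit` member names without extracting anything."""
    names = []
    with tarfile.open(archive_path, "r") as tar:
        for member in tar:  # reads headers one at a time
            names.append(member.name)
            if len(names) >= limit:
                break
    return names


def extract_matching(archive_path, dest_dir, suffix):
    """Extract only regular files whose name ends with `suffix`."""
    dest = Path(dest_dir).resolve()
    extracted = []
    with tarfile.open(archive_path, "r") as tar:
        for member in tar:
            if not (member.isfile() and member.name.endswith(suffix)):
                continue
            target = (dest / member.name).resolve()
            # Skip any member whose path would land outside dest_dir.
            if dest not in target.parents:
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            with tar.extractfile(member) as src, open(target, "wb") as out:
                shutil.copyfileobj(src, out)
            extracted.append(member.name)
    return extracted


def make_demo_archive(work_dir):
    """Build a tiny stand-in archive; the file names here are invented."""
    work = Path(work_dir)
    src = work / "src"
    src.mkdir()
    names = ["frame_0001.png", "frame_0001.json", "frame_0002.json"]
    for name in names:
        (src / name).write_text("placeholder")
    archive = work / "demo_split.tar"
    with tarfile.open(archive, "w") as tar:
        for name in names:
            tar.add(src / name, arcname=f"demo_split/{name}")
    return archive


if __name__ == "__main__":
    work = tempfile.mkdtemp()
    archive = make_demo_archive(work)
    print(list_members(archive, limit=2))
    print(extract_matching(archive, Path(work) / "out", ".json"))
```

For a real download, point `archive_path` at the tar file you fetched and choose a suffix that matches the files you want. Peeking at a handful of member names first is a cheap way to learn the directory layout before committing disk space.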
4. Sound
Domestic Environment Sound Event Detection (DESED): a mix of recorded and synthetic data.
5. NLP dataset
- Hugging Face Datasets
# Install: pip install datasets
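Once the package is installed, loading a corpus is typically a single call to `load_dataset`. The sketch below wraps that call in a small helper. The helper name `load_text_split` is hypothetical, and the default dataset `"imdb"` is only an illustrative choice, not one named in these notes. The first call downloads the data, so it needs network access.

```python
def load_text_split(name="imdb", split="train"):
    """Load one split of a Hugging Face dataset by name."""
    # Imported inside the function so this file can still be imported
    # when the `datasets` package is not installed.
    from datasets import load_dataset

    # "imdb" is an assumed example; pass any dataset name from the Hub.
    return load_dataset(name, split=split)


if __name__ == "__main__":
    train = load_text_split()
    print(train)     # summary: column names and number of rows
    print(train[0])  # first example as a plain dict
```

Passing `split` returns a single dataset object; omitting it in `load_dataset` returns a dict-like collection of all available splits, similar to what `tfds.load` returns above.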