WordNet and synsets
Each class corresponds to a synset ID (e.g. n01440764 “tench”). Hierarchical relations in WordNet are not always reflected in flat 1-of-K training—hierarchical loss is optional research direction.
ILSVRC tasks
Beyond single-label classification, historical challenges included localization (bbox) and detection. Today, COCO is more common for detection benchmarks, while ImageNet-pretrained backbones remain the default initialization.
Using label names in code
from torchvision.models import resnet50, ResNet50_Weights
w = ResNet50_Weights.IMAGENET1K_V2
names = w.meta["categories"] # list of 1000 strings
# idx = logits.argmax(dim=1).item()
# print(names[idx])
Different weight versions may share the same 1k label order—confirm in the weights metadata you load.
Transfer learning
Replace the classifier head, freeze early layers optionally, train on your domain. Features are biased toward ImageNet objects; medical or industrial imagery may need more adaptation or different pretraining (SimCLR, CLIP, domain-specific data).
Takeaways
- ImageNet scale enabled modern CNNs; data governance and consent practices have evolved since early collection.
- Top-1 / top-5 error reported on fixed val split—do not compare to your test set blindly.
- Consider newer pretraining (web-scale contrastive) when label semantics differ.