Machine Learning with Cats — Oliver Shackley

Built with AI. The code in this project was written by Claude Code.

I used Claude Code to build a deep learning pipeline that classifies cat breeds using EfficientNetV2 with test-time augmentation. Separately, I trained the same dataset on Google Cloud Vertex AI AutoML to compare the two approaches. Dataset: Oxford Pet IIT, filtered to 12 cat breeds with ~200 images each.

Training pipeline

I used a two-phase training approach: freeze the backbone for initial epochs, then unfreeze for fine-tuning with discriminative learning rates. Early runs hit ~100% training accuracy but only 60-65% validation, a classic overfitting problem. I also ran into Lambda's missing Nvidia drivers, which kept the training falling back to CPU.

After fixing the drivers and adding regularization, the model improved significantly:

Label smoothing. Softened targets to prevent overconfidence.
Mixup / CutMix. Blended images and labels for better generalization.
Cosine LR schedule. Warmup plus cosine decay for stable training.
EMA weights. Averaged parameters over time for smoother convergence.
Discriminative LR. Lower rates for the pretrained backbone, higher for the classifier head.

The biggest improvement came from switching to EfficientNetV2-S at 320px. Cat breed classification is a fine-grained task, so the higher resolution helped distinguish subtle differences between breeds.

Results

Final model: 95% macro F1 with strong balance across all 12 breeds. Test-time augmentation added an extra 1-3% accuracy boost. You can test the model yourself here.

Confusion Matrix showing model performance across cat breeds — Confusion matrix across the 12 cat breeds, after test-time augmentation.

The Vertex AI AutoML model trained from scratch over 24 hours and also performed well:

PR AUC: 0.984
Precision: 97.3%
Recall: 90.7%

Tech stack

Python, PyTorch, timm — model training
EfficientNetV2-S at 320px — backbone architecture
Google Cloud Vertex AI — AutoML benchmark
Lambda Labs — GPU compute