ML / Data Science · Completed · 2026

Deep Learning from Scratch

FC · RNN/LSTM · U-Net · GPT, from scratch


Graduate DL coursework with trained weights bundled; source private.

Graduate deep-learning coursework rebuilt as one portfolio piece spanning four architectures, each written from scratch rather than imported. HW1 implements fully-connected layers as a custom torch.autograd.Function with manually derived forward and backward passes (analytic gradients verified), and uses them to solve XOR and Iris. HW4 hand-codes a recurrent cell and an LSTM cell (manual input, forget, output and cell gates) for character-level generation: a 66K-parameter RNN and an 873K-parameter LSTM at hidden size 256 with 2 layers. HW3 is a 36-class semantic segmenter: a 17-layer fully-convolutional baseline versus an 89-layer ResNet-18 encoder + U-Net decoder with skip connections, trained with a class-weighted, label-smoothed loss, full D8 augmentation (the eight rotations and flips of a square), test-time augmentation (TTA) and a 600-epoch cosine schedule. HW5 builds a decoder-only GPT transformer (multi-head self-attention, positional embeddings) trained on WikiText-2 with the GPT-2 BPE tokenizer (50,257-token vocabulary); the improved 30.5M-parameter model (d_model 256, 8 heads, 6 layers) cuts test perplexity from 269.4 for the 13.7M-parameter base model to 178.5.
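To make the HW1 approach concrete, here is a minimal sketch of a fully-connected layer written as a custom torch.autograd.Function with a hand-derived backward pass. The project source is private, so this is not the coursework code: the class name LinearFn, the tensor shapes, and the use of torch.autograd.gradcheck as the gradient check are all assumptions made for illustration.

```python
import torch


class LinearFn(torch.autograd.Function):
    """Hypothetical example: y = x @ W^T + b with manually derived gradients."""

    @staticmethod
    def forward(ctx, x, weight, bias):
        # Keep the tensors the backward pass needs.
        ctx.save_for_backward(x, weight)
        return x @ weight.t() + bias

    @staticmethod
    def backward(ctx, grad_out):
        x, weight = ctx.saved_tensors
        grad_x = grad_out @ weight       # dL/dx = dL/dy . W
        grad_w = grad_out.t() @ x        # dL/dW = (dL/dy)^T . x
        grad_b = grad_out.sum(dim=0)     # dL/db = sum over the batch dimension
        return grad_x, grad_w, grad_b


# Compare the analytic gradients against finite differences.
# gradcheck needs double precision to be numerically reliable.
torch.manual_seed(0)
x = torch.randn(4, 3, dtype=torch.double, requires_grad=True)
w = torch.randn(2, 3, dtype=torch.double, requires_grad=True)
b = torch.randn(2, dtype=torch.double, requires_grad=True)

grads_ok = torch.autograd.gradcheck(LinearFn.apply, (x, w, b))
out_shape = tuple(LinearFn.apply(x, w, b).shape)
```

A layer built this way can be called inside an nn.Module via LinearFn.apply, so the rest of a training loop stays ordinary PyTorch while the gradient math remains explicit.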

Python
PyTorch
Transformers
GPT-2 BPE
WikiText-2
ResNet
U-Net
LSTM
Slurm / HPC

GPT test perplexity: 269.4 → 178.5
Improved GPT params: 30.5M
From-scratch LSTM: 873K params
Segmentation: 36 classes · 89-layer U-Net
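For reading the perplexity figures above: perplexity is conventionally the exponential of the mean per-token cross-entropy. The sketch below assumes that standard definition (the page does not say how the reported numbers were computed); the helper name perplexity and the dummy tensors are hypothetical. Under that assumption, 269.4 and 178.5 correspond to roughly 5.60 and 5.18 nats per token.

```python
import math

import torch
import torch.nn.functional as F


def perplexity(logits: torch.Tensor, targets: torch.Tensor) -> float:
    """exp(mean token-level cross-entropy); logits are (N, vocab), targets are (N,)."""
    nll = F.cross_entropy(logits, targets, reduction="mean")
    return math.exp(nll.item())


vocab_size = 50257  # GPT-2 BPE vocabulary size, as stated on this page

# Sanity check: a model that predicts a uniform distribution over the
# vocabulary has perplexity equal to the vocabulary size.
torch.manual_seed(0)
uniform_logits = torch.zeros(8, vocab_size)
targets = torch.randint(0, vocab_size, (8,))
ppl_uniform = perplexity(uniform_logits, targets)

# Convert the reported test perplexities back to mean loss in nats per token.
loss_base = math.log(269.4)
loss_improved = math.log(178.5)
```

The uniform case gives a useful upper reference point: any trained model should land far below a perplexity of 50,257 on this tokenizer.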

Request access
