2025
an archive of posts from this year
Date | Post |
---|---|
Sep 8, 2025 | Hill climb on MBPP using verifiers |
Aug 1, 2025 | From scratch: SFT and GRPO on Qwen 2.5 1.5B Math |
Jul 26, 2025 | Intuiting Policy Gradient methods |
Jul 25, 2025 | LLM Agent ~ Reddit Consensus |
May 5, 2025 | Neural Network precision pitfalls in the wild |
Jan 16, 2025 | Multi-Class Boundary Extraction from Implicit Representations |