At "Hamdeli", AI developers and researchers from around the world come together to share coding challenges, breakthroughs, and creative models.
Today I was struggling with exploding gradients while training a language model. Adding gradient clipping at 1.0 mostly solved the issue. I uploaded the code to my GitHub...
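For anyone wondering what the clipping step does, here is a minimal sketch in plain Python. It assumes "at 1.0" means a maximum global L2 norm of 1.0 (the post does not say whether the clipping was by norm or by value), and the function name `clip_grad_norm` and the toy gradients are made up for illustration. This is not the code from the poster's repository.

```python
import math


def clip_grad_norm(grads, max_norm=1.0, eps=1e-6):
    """Rescale gradients in place so their global L2 norm is at most max_norm.

    grads is a list of lists of floats (one inner list per parameter tensor).
    Returns the norm measured before clipping.
    """
    total_norm = math.sqrt(sum(g * g for layer in grads for g in layer))
    scale = max_norm / (total_norm + eps)
    if scale < 1.0:  # only shrink; small gradients are left untouched
        for layer in grads:
            for i in range(len(layer)):
                layer[i] *= scale
    return total_norm


# Hypothetical "exploded" gradients for two parameter tensors.
grads = [[3.0, 4.0], [12.0]]
norm_before = clip_grad_norm(grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for layer in grads for g in layer))

# Gradients already below the threshold are not modified.
small = [[0.1, 0.2]]
clip_grad_norm(small, max_norm=1.0)
```

In PyTorch the same idea is available as `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)`, called after `loss.backward()` and before `optimizer.step()`. Note that the direction of the update is preserved; only its length is capped.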
For those with limited hardware, I highly recommend using LoRA and 8-bit quantization. My VRAM usage went from 24 GB down to 8 GB!
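A rough back-of-the-envelope sketch of why this combination saves memory, in plain Python. The layer size, rank, and helper names below are hypothetical and chosen only for illustration. It does not try to reproduce the 24 GB to 8 GB figure, which also depends on activations, optimizer state, and batch size.

```python
def lora_trainable_params(d_out, d_in, rank):
    """LoRA freezes the d_out x d_in weight and trains two low-rank factors:
    B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def weight_memory_gib(num_params, bits_per_param):
    """Memory needed just to store the weights, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


# Hypothetical 4096 x 4096 projection layer with a rank-8 adapter.
d = 4096
full_params = d * d
lora_params = lora_trainable_params(d, d, rank=8)
print(f"full: {full_params}, LoRA: {lora_params}")

# Storing the frozen base weights in 8 bits instead of 16 halves their footprint.
n = 2**30  # hypothetical parameter count, picked so the arithmetic is easy to check
print(weight_memory_gib(n, 16), weight_memory_gib(n, 8))
```

The takeaway: LoRA shrinks the number of trainable parameters (and with it the gradient and optimizer memory), while 8-bit quantization shrinks the storage for the frozen base weights.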