High-Density Pre-Training Corpora for LLMs
Cut training costs by 40% while improving performance. Palladium Data treats information processing as a physics problem. We provide high-density, entropy-filtered datasets of signal-rich content, so models reach lower loss with significantly less compute.