Codú
Prefill Is Compute-Bound. Decode Is Memory-Bound. Why Your GPU Shouldn’t Do Both. | shared by Towards Data Science | Codú