Notes

Speculative Decoding & Friends

Speculative decoding reduces the number of sequential forward passes through a target model: a cheaper draft model guesses several future tokens, and the target model then verifies those guesses in a single parallel pass.
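
The draft-then-verify loop can be sketched roughly as follows. This is a minimal illustration of the greedy variant only, not a definitive implementation: `speculative_step`, `generate`, `toy_target`, and `toy_draft` are hypothetical names, each "model" is assumed to be a plain function from a token prefix to the next token, and the parallel verification pass is simulated with a loop. Sampling-based variants use a rejection-sampling acceptance rule instead of the exact-match check shown here.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a model: maps a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def speculative_step(target: Model, draft: Model, prefix: List[int], k: int) -> List[int]:
    """Run one draft-then-verify round and return the newly accepted tokens."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposed: List[int] = []
    ctx = list(prefix)
    for _ in range(k):
        token = draft(ctx)
        proposed.append(token)
        ctx.append(token)

    # 2. The target predicts the next token at each of the k + 1 positions.
    #    A real system would get these from one batched forward pass;
    #    this sketch simulates that with a loop.
    target_preds = [target(prefix + proposed[:i]) for i in range(k + 1)]

    # 3. Keep draft tokens while they match the target's own choice.
    accepted: List[int] = []
    for i in range(k):
        if proposed[i] != target_preds[i]:
            break
        accepted.append(proposed[i])

    # The target's prediction at the first mismatch (or the extra position
    # after a fully accepted draft) is always valid, so each round yields
    # at least one token.
    accepted.append(target_preds[len(accepted)])
    return accepted


def generate(target: Model, draft: Model, prefix: List[int],
             n_tokens: int, k: int = 4) -> Tuple[List[int], int]:
    """Generate n_tokens after prefix; also return the number of verify rounds."""
    out = list(prefix)
    rounds = 0
    while len(out) - len(prefix) < n_tokens:
        out.extend(speculative_step(target, draft, out, k))
        rounds += 1
    return out[: len(prefix) + n_tokens], rounds


# Toy models (assumptions for illustration): the target counts upward mod 10,
# and the draft agrees except right after token 4.
def toy_target(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    tokens, rounds = generate(toy_target, toy_draft, [0], n_tokens=8, k=4)
    print(tokens, rounds)
```

In this greedy setting the output matches what the target would produce on its own; the saving comes from needing fewer target rounds than tokens whenever the draft guesses correctly.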

note · Mar 2, 2026 · 1 min