Abstract: Speculative decoding has proven to be a powerful method for speeding up autoregressive inference: a draft-then-verify approach lets several tokens be proposed cheaply and then verified in parallel. However, ...