Token Canvas and Denoising Step

To understand a dLLM, start with the token canvas.

The canvas is the fixed-length token space that a dLLM works on at one time.

canvas_length = 8

[_, _, _, _, _, _, _, _]
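
As a minimal sketch, the empty canvas above can be modeled as a fixed-length list of placeholders. The names MASK and make_canvas are hypothetical and chosen only for illustration; real dLLM implementations typically hold token ids in a tensor rather than strings in a list.

```python
MASK = "_"  # hypothetical placeholder for a position that has no token yet


def make_canvas(canvas_length: int) -> list[str]:
    """Return a fixed-length canvas in which every position is still undecided."""
    return [MASK] * canvas_length


canvas = make_canvas(8)
print(canvas)
```

The length is fixed when the canvas is created; later steps change what the positions hold, not how many positions there are.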

Denoising step

A denoising step re-evaluates and updates the token candidates inside the canvas.

step 0: [?, ?, ?, ?, ?, ?, ?, ?]
step 1: [?, the, ?, cat, ?, ?, mat, ?]
step 2: [on, the, ?, cat, sat, ?, mat, ?]

During this process, the model produces a token distribution for each canvas position.
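
To make "a distribution per position" concrete, here is a toy sketch that turns one row of scores per canvas position into probabilities with a softmax. The vocabulary and the score values are made up for illustration; a real model would emit logits over its full vocabulary.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


VOCAB = ["the", "cat", "sat", "mat"]  # toy vocabulary (assumption)

# One row of hypothetical scores per canvas position.
position_scores = [
    [2.0, 0.1, 0.1, 0.1],
    [0.1, 3.0, 0.2, 0.1],
    [0.5, 0.5, 0.5, 0.5],  # the model has no preference here
]

distributions = [softmax(row) for row in position_scores]

# Most likely token and its probability at each position.
best = []
for dist in distributions:
    idx = max(range(len(dist)), key=dist.__getitem__)
    best.append((VOCAB[idx], dist[idx]))
```

A position with a flat distribution, like the third row, is a natural candidate to stay undecided for another step.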

However, not every token is finalized and emitted to the user at every denoising step.

A commit step emits the tokens in the canvas that are sufficiently settled as output.

denoise:
revises the canvas.

commit:
emits the finalized tokens.
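
The split between the two operations can be sketched as follows. This is an illustration under stated assumptions, not a description of any particular dLLM: the Slot type, the proposals format, the 0.9 threshold, and the rule that commit emits only the leading run of confident positions are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    token: str = "?"
    confidence: float = 0.0


def denoise(canvas: list[Slot], proposals: dict[int, tuple[str, float]]) -> None:
    """Revise the canvas: accept a proposal when it beats the slot's confidence."""
    for i, (token, conf) in proposals.items():
        if conf > canvas[i].confidence:
            canvas[i] = Slot(token, conf)


def commit(canvas: list[Slot], threshold: float = 0.9) -> list[str]:
    """Emit the leading run of slots whose confidence reaches the threshold."""
    out = []
    for slot in canvas:
        if slot.confidence < threshold:
            break
        out.append(slot.token)
    return out


canvas = [Slot() for _ in range(4)]

denoise(canvas, {1: ("cat", 0.95), 3: ("mat", 0.6)})
first = commit(canvas)   # position 0 is still undecided, so nothing is emitted

denoise(canvas, {0: ("the", 0.97), 2: ("sat", 0.92)})
second = commit(canvas)  # positions 0..2 are confident; position 3 is not
```

Here a denoising step changes the canvas without producing output, and a commit produces output without changing the canvas.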

As a result, dLLM serving metrics need more than plain output token throughput: the number of denoising steps, the number of canvas positions, and the number of committed tokens also become important.
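
One way to track these quantities is a small set of counters, sketched below. The class name, its fields, and the derived ratio are hypothetical and not taken from any serving framework.

```python
from dataclasses import dataclass


@dataclass
class ServingCounters:
    denoising_steps: int = 0
    canvas_positions: int = 0   # positions evaluated, summed over all steps
    committed_tokens: int = 0

    def record_step(self, canvas_length: int) -> None:
        """Count one denoising step over a canvas of the given length."""
        self.denoising_steps += 1
        self.canvas_positions += canvas_length

    def record_commit(self, n_tokens: int) -> None:
        """Count tokens that were actually emitted as output."""
        self.committed_tokens += n_tokens

    def positions_per_committed_token(self) -> float:
        """Position evaluations spent per emitted token."""
        if self.committed_tokens == 0:
            return float("inf")
        return self.canvas_positions / self.committed_tokens


counters = ServingCounters()
for _ in range(4):                 # four denoising steps on an 8-position canvas
    counters.record_step(canvas_length=8)
counters.record_commit(8)          # then all 8 tokens are committed
```

In this example 32 position evaluations produce 8 output tokens, a cost that output token throughput alone would not show.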

canvas_length determines how many token positions are denoised at once.

If it is too short, commits have to happen often; if it is too long, the computation and state of a single step can grow large.

Small canvas:
processes short blocks frequently

Large canvas:
denoises more positions at once

This value is therefore a trade-off among quality, latency, and throughput.
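
The arithmetic behind the trade-off can be sketched with a rough cost model. It assumes, purely for illustration, that every block takes the same number of denoising steps regardless of canvas_length; in practice the step count needed for acceptable quality may itself depend on the canvas size. The function name and its parameters are hypothetical.

```python
import math


def canvas_cost(output_tokens: int, canvas_length: int, steps_per_block: int) -> dict[str, int]:
    """Rough cost model: blocks needed, total steps, and positions touched per step."""
    blocks = math.ceil(output_tokens / canvas_length)
    return {
        "blocks": blocks,
        "total_steps": blocks * steps_per_block,
        "positions_per_step": canvas_length,
    }


small = canvas_cost(output_tokens=64, canvas_length=8, steps_per_block=4)
large = canvas_cost(output_tokens=64, canvas_length=32, steps_per_block=4)
```

Under this assumption the small canvas needs 8 blocks and 32 steps of 8 positions each, while the large canvas needs 2 blocks and 8 steps of 32 positions each: fewer, heavier steps against more, lighter ones.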