Locuza
Locuza mentioned this in connection with the Pascal speculation currently circulating, namely that the cost of a context switch may have been reduced.
By the way, this is not speculation; it is stated in the Pascal white paper. See the dev blog, which has all the necessary links:
Inside Pascal: NVIDIA's Newest Computing Platform | Parallel Forall
White Paper wrote: Compute Preemption is another important new hardware and software feature added to GP100 that allows compute tasks to be preempted at instruction-level granularity, rather than thread block granularity as in prior Maxwell and Kepler GPU architectures. Compute Preemption prevents long-running applications from either monopolizing the system (preventing other applications from running) or timing out. Programmers no longer need to modify their long-running applications to play nicely with other GPU applications. With Compute Preemption in GP100, applications can run as long as needed to process large datasets or wait for various conditions to occur, while scheduled alongside other tasks. For example, both interactive graphics tasks and interactive debuggers can run in concert with long-running compute tasks.
White Paper wrote: In contrast, the Kepler [and also Maxwell] GPU architecture only provided coarser-grained preemption at the level of a block of threads in a compute kernel. This block-level preemption required that all threads of a thread block complete before the hardware can context switch to a different context. However when using a debugger and a GPU breakpoint was hit on an instruction within the thread block, the thread block was not complete, preventing block-level preemption.
This only covers compute tasks, but rendering instructions should benefit as well.
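To make the difference between the two granularities from the quoted passages a bit more tangible, here is a small toy model in Python. It is only a sketch: the function `preemption_wait` and all numbers are made up for illustration and say nothing about real GPU timings. The only thing it takes from the white paper is the rule itself: with block-level preemption the whole thread block has to finish first, with instruction-level preemption only the instruction currently in flight has to finish.

```python
import math


def preemption_wait(elapsed, block_runtime, instr_time, granularity):
    """Toy model: time units between a preemption request and the context switch.

    elapsed       -- time the thread block has already been running (hypothetical)
    block_runtime -- total runtime of the thread block (hypothetical)
    instr_time    -- duration of a single instruction (hypothetical, uniform)
    granularity   -- "block" (Kepler/Maxwell style) or "instruction" (GP100 style)
    """
    if granularity == "block":
        # The whole thread block must complete before the switch.
        return block_runtime - elapsed
    if granularity == "instruction":
        # Only the instruction currently in flight must complete.
        return (-elapsed) % instr_time
    raise ValueError("granularity must be 'block' or 'instruction'")


# A long-running block, preemption requested after 250 time units.
print(preemption_wait(250, 1000, 4, "block"))        # 750
print(preemption_wait(250, 1000, 4, "instruction"))  # 2

# Debugger case from the quote: a breakpoint means the block never completes,
# modelled here as an infinite block runtime.
print(preemption_wait(250, math.inf, 4, "block"))        # inf
print(preemption_wait(250, math.inf, 4, "instruction"))  # 2
```

In this model the block-level wait grows with the remaining runtime of the block, while the instruction-level wait is bounded by the length of one instruction, which matches the white paper's point about long-running kernels and debugger breakpoints.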
