Simplifying Vulkan one subsystem at a time

196 points - today at 1:26 PM

Comments

tonis2 today at 6:03 PM

I wish they would just allow us to push everything to the GPU as buffer pointers, like the buffer_device_address extension allows you to, and then reconstruct the data into your required format via shaders.

GPU programming seems to be both super low level and high level at once, because textures and descriptors need these ultra-specific data formats, and the ways you construct and upload those formats are very complicated and change all the time.

Is there really no way to simplify this?

Regular vertex data was supposed to be strictly pre-formatted in the pipeline too, until suddenly it was not, and now we can just give the shader a memory pointer from the `buffer_device_address` extension and construct the data from that.
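
As a rough CPU-side analogy of that idea (not Vulkan code), the sketch below treats a GPU buffer as raw bytes and has a shader-like function decode vertex `index` starting from a base offset, much as a shader would read through a buffer device address. The vertex layout (three float32 position values followed by two float32 UV values) and the names `pack_vertices` and `fetch_vertex` are assumptions made up for illustration.

```python
import struct

# Hypothetical vertex layout: position (3 x float32) followed by uv (2 x float32).
VERTEX_FORMAT = "<3f2f"
VERTEX_STRIDE = struct.calcsize(VERTEX_FORMAT)  # 20 bytes


def pack_vertices(vertices):
    """Pack (x, y, z, u, v) tuples into one raw byte buffer, as an upload would."""
    return b"".join(struct.pack(VERTEX_FORMAT, *v) for v in vertices)


def fetch_vertex(memory, base_address, index):
    """Decode vertex `index` from raw memory, as a shader would via a pointer."""
    offset = base_address + index * VERTEX_STRIDE
    x, y, z, u, v = struct.unpack_from(VERTEX_FORMAT, memory, offset)
    return (x, y, z), (u, v)


# Place the vertex data at an arbitrary offset inside a larger allocation.
base = 64
memory = bytes(base) + pack_vertices([
    (0.0, 0.0, 0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0, 1.0, 0.0),
    (0.0, 1.0, 0.0, 0.0, 1.0),
])
position, uv = fetch_vertex(memory, base, 1)
```

The point of the analogy is that the format lives entirely in the decoding code, so changing the layout means changing `VERTEX_FORMAT` rather than reconfiguring a fixed-function pipeline stage.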

kvark today at 4:03 PM

The main problem with Vulkan isn't the programming model or the lack of features. Khronos is tackling those. The problem is with coverage and update distribution. It's all over the place! If you develop general purpose software (like Zed), you can't assume that even basic things like dynamic rendering are supported uniformly. There are always weird systems with old drivers (looking at Ubuntu 22.04 LTS), hardware vendors abandoning and forcefully deprecating working hardware, and of course driver bugs... So by the time I can rely on the new shiny descriptor heap/buffer features, I'll have more gray hair and other things on the horizon.
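
One common way to cope with uneven support is to put each newer feature behind a capability check with a fallback. A minimal sketch of that pattern, assuming the application has already queried the driver and stored the results as a plain set of feature names; `choose_render_path` and the feature strings are hypothetical labels, not Vulkan API identifiers:

```python
def choose_render_path(supported_features):
    """Pick the newest code path the driver supports, falling back otherwise."""
    # Ordered from most to least preferred; each entry lists what it requires.
    paths = [
        ("descriptor_heap", {"dynamic_rendering", "descriptor_heap"}),
        ("dynamic_rendering", {"dynamic_rendering"}),
        ("render_passes", set()),  # baseline path with no optional requirements
    ]
    for name, required in paths:
        if required <= set(supported_features):
            return name
    raise RuntimeError("unreachable: the baseline path has no requirements")


# An older driver that only reports dynamic rendering gets the middle path.
path = choose_render_path({"dynamic_rendering"})
```

Every extra entry in `paths` is another code path to write and test, which is the maintenance cost being described.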

pjmlp today at 3:45 PM

At least they are making an effort to correct the extension spaghetti, which is already worse than OpenGL's.

Additionally, most of these fixes aren't coming to Android, which is now getting WebGPU for Java/Kotlin[0] after so many refused to move away from OpenGL ES, nor, naturally, to any card not lucky enough to get new driver releases.

Still, better now than never.

[0] - https://developer.android.com/jetpack/androidx/releases/webg...

hmry today at 2:01 PM

I'm really enjoying these changes. Going from render passes to dynamic rendering really simplified my code. I wonder how this new feature compares to existing bindless rendering.

From the linked video, "Feature parity with OpenCL" is the thing I'm most looking forward to.

Animats today at 9:19 PM

Not sure if this is an "oh, no" event.

So this goes into Vulkan. Then it has to ship with the OS. Then it has to go into intermediate layers such as WGPU. Which will probably have to support both old and new mode. Then it has to go into renderers. Which will probably have to support both old and new mode. Maybe at the top of the renderer you can't tell if you're in old or new mode, but it will probably leak through. In that case game engines have to know about this. Which will cause churn in game code.

And Apple will do something different, in Metal.

Unreal Engine and Unity have the staff to handle this, but few others do. The Vulkan-based renderers which use Vulkan concurrency to get performance OpenGL can't deliver are few. Probably only Unreal Engine and Unity really exploit Vulkan properly.

Here's the top level of the Vulkan changes.[1] It doesn't look simple.

(I'm mostly grumbling because the difficulty and churn in Vulkan/WGPU has resulted in three abandoned renderers in Rust land through developer burnout. I'm a user of renderers, and would like them to Just Work.)

[1] https://docs.vulkan.org/refpages/latest/refpages/source/VK_E...

m-schuetz today at 4:16 PM

I suspect we are only 5-10 years away from Vulkan finally being usable. There are so many needlessly complex things, or things that should have an easy path for the common case.

BDA, dynamic rendering and shader objects almost make Vulkan bearable. What's still sorely missing is a single-line device malloc, a default queue that can be used without ever touching the queue family API, and an entirely descriptor-free code path. The latter would involve making the NV bindless extension the standard, which simply gives you handles to textures without making you manage descriptor buffers/sets/heaps. Maybe also put an easy path for synchronization on that list and make the explicit API optional.

Until then I'll keep enjoying OpenGL 4.6, which has had BDA with C-style pointer syntax in GLSL shaders since 2010 (NV_shader_buffer_load), and which allows hassle-free buffer allocation and descriptor-set-free bindless textures.

pixelpoet today at 3:37 PM

I would like to / am "supposed to" use Vulkan, but it's a massive pain coming from OpenCL, with all kinds of issues that need safe handling which simply don't come up in OpenCL workloads.

Everyone keeps telling me OpenCL is deprecated (which is true, although it's also true that it continues to work superbly in 2026) but there isn't a good / official OpenCL to Vulkan wrapper out there to justify it for what I do.

jabl today at 5:48 PM

Does this evolution of the Vulkan API get closer to the model explained in https://www.sebastianaaltonen.com/blog/no-graphics-api which we discussed in https://news.ycombinator.com/item?id=46293062 ?

HexDecOctBin today at 3:23 PM

I personally just switched to using push descriptors everywhere. On desktops, the real-world limits are high enough that it ends up working out fine and you get a nice immediate-mode API like OpenGL.

socalgal2 today at 7:10 PM

Vulkan takes like 600+ lines to do what Metal does in 50.

I'm sure the comments will be all excuses and whys but they're all nonsense. It's just a poorly thought out API.

jauntywundrkind today at 5:43 PM

How are folks feeling about WebGPU these days?

Once Vulkan is finally in good order, with descriptor_heap and others, I really really hope we can get a WebGPU.next.

Where are we at with the "what's next for webgpu" post, from 5 quarters ago? https://developer.chrome.com/blog/next-for-webgpu https://news.ycombinator.com/item?id=42209272

sxzygz today at 7:43 PM

Uuugh, graphics. So many smart people expending great energy to look busy while doing nothing particularly profound.

Graphics people, here is what you need to do.

1) Figure out a machine abstraction.

2) Figure out an abstraction for how these machines communicate with each other and the CPU on a shared memory bus.

3) Write a binary spec for code for this abstract machine.

4) Compilers target this abstract machine.

5) Programs submit code to driver for AoT compilation, and cache results.

6) Driver has some linker and dynamic module loading/unloading capability.

7) Signal the driver to start that code.
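
Steps 5 to 7 above can be modelled as a toy compile cache. This is only a sketch under stated assumptions: `DriverCompileCache`, its `compile_fn` parameter, and the device ID string are hypothetical stand-ins, and no real driver interface is implied.

```python
import hashlib


class DriverCompileCache:
    """Toy model of steps 5-7: compile once per (device, code) pair, then reuse."""

    def __init__(self, device_id, compile_fn):
        self.device_id = device_id
        self.compile_fn = compile_fn  # hypothetical stand-in for the driver's compiler
        self.cache = {}
        self.compile_count = 0

    def load(self, abstract_binary):
        # Key on the device too, since native code differs between GPUs.
        key = hashlib.sha256(
            self.device_id.encode() + b"\0" + abstract_binary
        ).hexdigest()
        if key not in self.cache:
            self.cache[key] = self.compile_fn(abstract_binary)
            self.compile_count += 1
        return self.cache[key]


# A fake "compiler" that just reverses the bytes, to keep the sketch runnable.
cache = DriverCompileCache("gpu-0", lambda code: code[::-1])
native = cache.load(b"kernel")
native_again = cache.load(b"kernel")  # served from the cache, no recompile
```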

AMD64, ARM, and RISC-V are all basically differing binary specs for a C-machine+MMU+MMIO compute abstraction.

Figure out your machine abstraction and let us normies write code that’s accelerated without having to throw the baby out with the bathwater every few years.

Oh yes, give us timing information so we can adapt the workload as necessary to achieve soft real-time scheduling on hardware with differing performance.
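
A minimal sketch of what adapting the workload from timing information could look like, assuming the cost of a frame is roughly proportional to the number of work items (a simplification). The function `adapt_workload` and its parameters are hypothetical names for illustration.

```python
def adapt_workload(current_items, measured_ms, budget_ms, min_items=1):
    """Scale the per-frame work so the next frame lands near the time budget."""
    if measured_ms <= 0:
        return current_items  # no usable timing yet; keep the workload as is
    # Assume cost is roughly proportional to item count (a simplification).
    scaled = int(current_items * budget_ms / measured_ms)
    return max(min_items, scaled)


# A frame that took twice the budget gets half the work next time.
next_items = adapt_workload(1000, measured_ms=32.0, budget_ms=16.0)
```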