I figured it out: the -bench/-str breakage was caused by autosave being enabled in the ini. I am not sure whether this is intended, but it is out of scope for the issue at hand. Now, back to the benchmarks. With Wayland I am getting:
$ mame -video bgfx -bgfx_backend vulkan -nowindow -bench 90 -noautosave umk3
Average speed: 731.03% (89 seconds)
$ mame -video bgfx -bgfx_backend vulkan -nowindow -str 90 -nothrottle -sound none -noautosave umk3
Average speed: 54.73% (89 seconds)
$ mame -video bgfx -bgfx_backend vulkan -window -bench 90 -noautosave umk3
Average speed: 748.82% (89 seconds)
$ mame -video bgfx -bgfx_backend vulkan -window -str 90 -nothrottle -sound none -noautosave umk3
Average speed: 444.91% (89 seconds)
$ mame -video bgfx -bgfx_backend vulkan -bgfx_screen_chains default -nowindow -bench 90 -noautosave umk3
Average speed: 728.93% (89 seconds)
$ mame -video bgfx -bgfx_backend vulkan -bgfx_screen_chains default -nowindow -str 90 -nothrottle -sound none -noautosave umk3
Average speed: 109.12% (89 seconds)
$ mame -video bgfx -bgfx_backend vulkan -bgfx_screen_chains default -window -bench 90 -noautosave umk3
Average speed: 739.98% (89 seconds)
$ mame -video bgfx -bgfx_backend vulkan -bgfx_screen_chains default -window -str 90 -nothrottle -sound none -noautosave umk3
Average speed: 640.75% (89 seconds)
It appears that the video output, scaling to full screen, and the hlsl screen chain are what kill the performance.
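For anyone wanting to repeat this, the eight runs above form a 2x2x2 matrix (screen chain, window mode, -bench versus -str). Below is a minimal sketch that prints those command lines in the same order; `print_matrix` is a hypothetical helper name, and the script only echoes the commands rather than running them.

```shell
#!/bin/sh
# Print the eight vulkan benchmark commands used above (dry run only).
# print_matrix is a hypothetical helper, not part of MAME.
print_matrix() {
    # An empty chain means "whatever the ini selects".
    for chain in "" "-bgfx_screen_chains default"; do
        for win in -nowindow -window; do
            for mode in "-bench 90" "-str 90 -nothrottle -sound none"; do
                # ${chain:+ $chain} adds the option only when chain is non-empty.
                echo "mame -video bgfx -bgfx_backend vulkan${chain:+ $chain} $win $mode -noautosave umk3"
            done
        done
    done
}

print_matrix
```

Piping the output to `sh` would execute the runs one after another, assuming `mame` is on the PATH and the umk3 ROM set is available.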
ETA1: the same runs under X.org:
$ mame -video bgfx -bgfx_backend vulkan -nowindow -bench 90 -noautosave umk3
Average speed: 736.35% (89 seconds)
$ mame -video bgfx -bgfx_backend vulkan -nowindow -str 90 -nothrottle -sound none -noautosave umk3
Average speed: 113.99% (89 seconds)
$ mame -video bgfx -bgfx_backend vulkan -window -bench 90 -noautosave umk3
Average speed: 734.86% (89 seconds)
$ mame -video bgfx -bgfx_backend vulkan -window -str 90 -nothrottle -sound none -noautosave umk3
Average speed: 447.35% (89 seconds)
$ mame -video bgfx -bgfx_backend vulkan -bgfx_screen_chains default -nowindow -bench 90 -noautosave umk3
Average speed: 741.29% (89 seconds)
$ mame -video bgfx -bgfx_backend vulkan -bgfx_screen_chains default -nowindow -str 90 -nothrottle -sound none -noautosave umk3
Average speed: 657.48% (89 seconds)
$ mame -video bgfx -bgfx_backend vulkan -bgfx_screen_chains default -window -bench 90 -noautosave umk3
Average speed: 744.40% (89 seconds)
$ mame -video bgfx -bgfx_backend vulkan -bgfx_screen_chains default -window -str 90 -nothrottle -sound none -noautosave umk3
Average speed: 684.39% (89 seconds)
Unsurprisingly, the -bench runs are all more or less the same speed. Scaling to full screen is what appears to cause the huge performance hit on Wayland, since the hlsl and default screen chains each cost about the same on X.org and Wayland when running in windowed mode.
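To put numbers on that, here is a rough sketch that divides each windowed -str average by its fullscreen counterpart, using the vulkan figures quoted above. `slowdown` is a hypothetical helper, and the runs without -bgfx_screen_chains are labelled hlsl on the assumption that the ini selects that chain.

```shell
#!/bin/sh
# slowdown WINDOWED FULLSCREEN: how many times faster the windowed -str run is.
# Hypothetical helper, not part of MAME.
slowdown() {
    awk -v w="$1" -v f="$2" 'BEGIN { printf "%.2f\n", w / f }'
}

# -str averages copied from the vulkan runs above (windowed, then fullscreen).
# "hlsl" rows are the runs without -bgfx_screen_chains (assumed to come from the ini).
echo "Wayland hlsl:    $(slowdown 444.91 54.73)x"
echo "Wayland default: $(slowdown 640.75 109.12)x"
echo "X.org hlsl:      $(slowdown 447.35 113.99)x"
echo "X.org default:   $(slowdown 684.39 657.48)x"
```

Going full screen costs roughly 8.1x (hlsl) and 5.9x (default) on Wayland, against roughly 3.9x and 1.0x on X.org.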
ETA2: Switching the bgfx backend to opengl yields the following on X.org:
$ mame -video bgfx -bgfx_backend opengl -bgfx_screen_chains default -nowindow -str 90 -nothrottle -sound none -noautosave umk3
Average speed: 677.78% (89 seconds)
$ mame -video bgfx -bgfx_backend opengl -bgfx_screen_chains hlsl -nowindow -str 90 -nothrottle -sound none -noautosave umk3
Average speed: 152.49% (89 seconds)
and the following on Wayland:
$ mame -video bgfx -bgfx_backend opengl -bgfx_screen_chains default -nowindow -str 90 -nothrottle -sound none -noautosave umk3
Average speed: 676.23% (89 seconds)
$ mame -video bgfx -bgfx_backend opengl -bgfx_screen_chains hlsl -nowindow -str 90 -nothrottle -sound none -noautosave umk3
Average speed: 152.66% (89 seconds)
So the opengl backend's performance appears to be independent of which windowing system is in use.