
New Game Changing Vulkan Extensions for Mobile: Buffer Device Address


The Samsung Developers team works with many companies in the mobile and gaming ecosystems. We're excited to support our partner, Arm, as they bring timely and relevant content to developers looking to build games and high-performance experiences. This Vulkan Extensions series will help developers get the most out of the new and game-changing Vulkan extensions on Samsung mobile devices.

Android R is enabling a host of useful Vulkan extensions for mobile, with three being key ‘game changers’. These are set to improve the state of graphics APIs for modern applications, enabling new use cases and changing how developers can design graphics renderers going forward. You can expect to see these features across a variety of Android smartphones, such as the new Samsung Galaxy S21, and existing Samsung Galaxy S models like the Samsung Galaxy S20. The first blog explored the first game changer extension for Vulkan: ‘Descriptor Indexing’. This blog explores the second game changer extension: ‘Buffer Device Address’.

VK_KHR_buffer_device_address

VK_KHR_buffer_device_address is a monumental extension that adds a unique feature to Vulkan that none of the competing graphics APIs support.

Pointer support is something that has always been limited in graphics APIs, for good reason. Pointers complicate a lot of things, especially for shader compilers. It is also near impossible to deal with plain pointers in legacy graphics APIs, which rely on implicit synchronization.

There are two key aspects to buffer_device_address (BDA). First, it is possible to query a GPU virtual address from a VkBuffer. This is a plain uint64_t. This address can be written anywhere you like, in uniform buffers, push constants, or storage buffers, to name a few.

Second, and this is what makes the extension unique, a SPIR-V shader can load an address from a buffer and immediately treat it as a pointer to storage buffer memory. Pointer casting, pointer arithmetic and all sorts of clever trickery can be done inside the shader. There are many use cases for this feature. Some are performance-related, and some are new use cases that have not been possible before.

Getting the GPU virtual address (VA)

There are some hoops to jump through here. First, when allocating VkDeviceMemory, we must flag that the memory supports BDA:

VkMemoryAllocateInfo info = {…};
VkMemoryAllocateFlagsInfo flags = {…};
flags.flags = VK_MEMORY_ALLOCATE_DEVICE_ADDRESS_BIT_KHR;
// Chain the flags struct into the allocation info so the driver sees it.
info.pNext = &flags;
vkAllocateMemory(device, &info, NULL, &memory);

Similarly, when creating a VkBuffer, we add the VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT_KHR usage flag. Once we have created a buffer, we can query the VA:

VkBufferDeviceAddressInfoKHR info = {…};
info.buffer = buffer;
// The query returns a VkDeviceAddress (a 64-bit unsigned integer).
VkDeviceAddress va = vkGetBufferDeviceAddressKHR(device, &info);

From here, this 64-bit value can be placed in a buffer. You can of course offset this VA. Alignment is never an issue as shaders specify explicit alignment later.
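
As a rough illustration of offsetting the VA and placing it in a buffer, here is a minimal C sketch. It assumes the address has already been queried and uses a plain uint64_t as a stand-in for it; the helper names offset_device_address and write_device_address are hypothetical and not part of Vulkan.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: address of a byte offset inside a buffer whose base
 * VA was queried earlier. buffer_size is the size of the VkBuffer in bytes. */
static uint64_t offset_device_address(uint64_t base_va, uint64_t byte_offset,
                                      uint64_t buffer_size)
{
    assert(byte_offset < buffer_size); /* stay inside the buffer */
    return base_va + byte_offset;
}

/* Hypothetical helper: store the 64-bit address into CPU-visible memory
 * (for example a mapped uniform or storage buffer) at dst_offset bytes. */
static void write_device_address(void *mapped, size_t dst_offset, uint64_t va)
{
    memcpy((unsigned char *)mapped + dst_offset, &va, sizeof(va));
}
```

In a real application, the destination would be mapped buffer memory or the data passed to vkCmdPushConstants rather than an ordinary array.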

A note on debugging

When using BDA, there are some extra features that drivers must support for debugging. When a debug tool replays an application capture, the addresses recorded in the capture are only valid if the buffers end up at the same virtual addresses as in the original run, so the driver must be able to guarantee that the virtual addresses it returns remain stable across runs. To that end, debug tools supply the expected VA and the driver allocates that VA range. Applications do not need to care much about this, but it is important to note that even if you can use BDA, you might not be able to debug with it.

typedef struct VkPhysicalDeviceBufferDeviceAddressFeatures {
    VkStructureType    sType;
    void*              pNext;
    VkBool32           bufferDeviceAddress;
    VkBool32           bufferDeviceAddressCaptureReplay;
    VkBool32           bufferDeviceAddressMultiDevice;
} VkPhysicalDeviceBufferDeviceAddressFeatures;

If bufferDeviceAddressCaptureReplay is supported, tools like RenderDoc can support BDA.

Using a pointer in a shader

In Vulkan GLSL, there is the GL_EXT_buffer_reference extension which allows us to declare a pointer type. A pointer like this can be placed in a buffer, or we can convert to and from integers:

#version 450
#extension GL_EXT_buffer_reference : require
#extension GL_EXT_buffer_reference_uvec2 : require
layout(local_size_x = 64) in;

// These define pointer types.
layout(buffer_reference, std430, buffer_reference_align = 16) readonly buffer ReadVec4
{
    vec4 values[];
};

layout(buffer_reference, std430, buffer_reference_align = 16) writeonly buffer WriteVec4
{
    vec4 values[];
};

layout(buffer_reference, std430, buffer_reference_align = 4) readonly buffer UnalignedVec4
{
    vec4 value;
};

layout(push_constant, std430) uniform Registers
{
    ReadVec4 src;
    WriteVec4 dst;
} registers;

Placing raw pointers in push constants avoids all indirection for getting to a buffer. If the driver allows it, the pointers can be placed directly in GPU registers before the shader begins executing.

Not all devices support 64-bit integers, but it is possible to cast uvec2 <-> pointer. Address computation can then be done on the two 32-bit halves with a manual carry, like this:

uvec2 uadd_64_32(uvec2 addr, uint offset)
{
    uint carry;
    addr.x = uaddCarry(addr.x, offset, carry);
    addr.y += carry;
    return addr;
}

void main()
{
    uint index = gl_GlobalInvocationID.x;
    registers.dst.values[index] = registers.src.values[index];
    uvec2 addr = uvec2(registers.src);
    addr = uadd_64_32(addr, 20 * index);

    // Cast a uvec2 to address and load a vec4 from it.
    // This address is aligned to 4 bytes.
    registers.dst.values[index + 1024] = UnalignedVec4(addr).value;
}
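
To sanity-check the carry logic of uadd_64_32 outside a shader, the same arithmetic can be mirrored on the CPU. The following C sketch is only an illustration: the uvec2_t struct and the split_address and join_address helpers are hypothetical stand-ins for the GLSL uvec2 type and the uvec2 <-> pointer cast, with x holding the low 32 bits and y the high 32 bits.

```c
#include <stdint.h>

/* Two 32-bit words standing in for a GLSL uvec2: x = low word, y = high word. */
typedef struct {
    uint32_t x;
    uint32_t y;
} uvec2_t;

/* Hypothetical helper: split a 64-bit address into low and high words. */
static uvec2_t split_address(uint64_t addr)
{
    uvec2_t v = { (uint32_t)(addr & 0xFFFFFFFFu), (uint32_t)(addr >> 32) };
    return v;
}

/* Hypothetical helper: rebuild the 64-bit address from the two words. */
static uint64_t join_address(uvec2_t v)
{
    return ((uint64_t)v.y << 32) | v.x;
}

/* Same idea as the shader's uadd_64_32: add a 32-bit offset to the low
 * word and propagate the carry into the high word. */
static uvec2_t uadd_64_32(uvec2_t addr, uint32_t offset)
{
    uint32_t lo = addr.x + offset;       /* wraps modulo 2^32 */
    uint32_t carry = (lo < addr.x) ? 1u : 0u; /* 1 if the low word wrapped */
    addr.x = lo;
    addr.y += carry;
    return addr;
}
```

For any address and offset whose sum fits in 64 bits, join_address(uadd_64_32(split_address(a), off)) should equal a + off, including the case where the low word overflows.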

Pointer or offsets?

Using raw pointers is not always the best idea. A natural use case you could consider for pointers is that you have tree structures or list structures in GPU memory. With pointers, you can jump around as much as you want, and even write new pointers to buffers. However, a pointer is 64-bit and a typical performance consideration is to use 32-bit offsets (or even 16-bit offsets) if possible. Using offsets is the way to go if you can guarantee that all buffers live inside a single VkBuffer. On the other hand, the pointer approach can access any VkBuffer at any time without having to use descriptors. Therein lies the key strength of BDA.
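
The size trade-off can be made concrete with a small C sketch of a binary tree node stored in GPU memory. The struct layouts and the resolve_offset helper are hypothetical and only serve to illustrate the point; a real layout would have to match the block layout declared in the shader.

```c
#include <stdint.h>

/* Node that links children by full 64-bit device addresses. */
typedef struct {
    uint64_t left;    /* device address of the left child, 0 if none */
    uint64_t right;   /* device address of the right child, 0 if none */
    uint32_t payload;
    uint32_t pad;     /* explicit padding keeps the layout unambiguous */
} NodeWithPointers;   /* 24 bytes */

/* Node that links children by 32-bit byte offsets from one buffer base. */
typedef struct {
    uint32_t left;    /* byte offset of the left child from the base */
    uint32_t right;   /* byte offset of the right child from the base */
    uint32_t payload;
} NodeWithOffsets;    /* 12 bytes */

/* Hypothetical helper: turn a 32-bit offset back into a full address.
 * This only works when every node lives in the same buffer. */
static uint64_t resolve_offset(uint64_t buffer_base_va, uint32_t byte_offset)
{
    return buffer_base_va + byte_offset;
}
```

The offset-based node is half the size in this sketch, at the cost of being tied to a single base address.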

Extreme hackery: physical pointer as specialization constants

A black magic hack is to place a BDA inside a specialization constant. This allows for accessing a pointer without using any descriptors, which is a life saver in certain situations where you are desperate to debug something and no descriptor set is available.

Do note that this breaks all forms of pipeline caching and is only suitable for debug code. Do not ship this kind of code. Perform this dark sorcery at your own risk:

#version 450
#extension GL_EXT_buffer_reference : require
#extension GL_EXT_buffer_reference_uvec2 : require
layout(local_size_x = 64) in;

layout(constant_id = 0) const uint DEBUG_ADDR_LO = 0;
layout(constant_id = 1) const uint DEBUG_ADDR_HI = 0;

layout(buffer_reference, std430, buffer_reference_align = 4) buffer DebugCounter
{
    uint value;
};

void main()
{
    DebugCounter counter = DebugCounter(uvec2(DEBUG_ADDR_LO, DEBUG_ADDR_HI));
    atomicAdd(counter.value, 1u);
}
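
On the host side, the 64-bit address has to be split into the two 32-bit words that the shader above reads as DEBUG_ADDR_LO and DEBUG_ADDR_HI. A minimal C sketch of that split follows; pack_debug_address is a hypothetical helper name, and the address is represented as a plain uint64_t.

```c
#include <stdint.h>

/* Hypothetical helper: fill the two 32-bit words consumed by
 * constant_id 0 (low word) and constant_id 1 (high word). */
static void pack_debug_address(uint64_t va, uint32_t out_words[2])
{
    out_words[0] = (uint32_t)(va & 0xFFFFFFFFu); /* DEBUG_ADDR_LO */
    out_words[1] = (uint32_t)(va >> 32);         /* DEBUG_ADDR_HI */
}
```

The resulting array would typically be handed to pipeline creation through VkSpecializationInfo, with one VkSpecializationMapEntry per word (constant IDs 0 and 1, offsets 0 and 4, size 4 each).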

Emitting SPIR-V with buffer_device_address

In SPIR-V, there are some things to note. BDA is an especially useful feature for layering other APIs due to its extreme flexibility in how we access memory. Therefore, generating BDA code yourself is a reasonable use case to assume as well.

Enables BDA in shaders.

OpCapability PhysicalStorageBufferAddresses
OpExtension "SPV_KHR_physical_storage_buffer"

The memory model is PhysicalStorageBuffer64 and not logical anymore.

OpMemoryModel PhysicalStorageBuffer64 GLSL450

The buffer reference types are declared basically just like SSBOs.

OpDecorate %_runtimearr_v4float ArrayStride 16
OpMemberDecorate %ReadVec4 0 NonWritable
OpMemberDecorate %ReadVec4 0 Offset 0
OpDecorate %ReadVec4 Block
OpDecorate %_runtimearr_v4float_0 ArrayStride 16
OpMemberDecorate %WriteVec4 0 NonReadable
OpMemberDecorate %WriteVec4 0 Offset 0
OpDecorate %WriteVec4 Block
OpMemberDecorate %UnalignedVec4 0 NonWritable
OpMemberDecorate %UnalignedVec4 0 Offset 0
OpDecorate %UnalignedVec4 Block

Declare a pointer to the blocks. PhysicalStorageBuffer is the storage class to use.

OpTypeForwardPointer %_ptr_PhysicalStorageBuffer_WriteVec4 PhysicalStorageBuffer
%_ptr_PhysicalStorageBuffer_ReadVec4 = OpTypePointer PhysicalStorageBuffer %ReadVec4
%_ptr_PhysicalStorageBuffer_WriteVec4 = OpTypePointer PhysicalStorageBuffer %WriteVec4
%_ptr_PhysicalStorageBuffer_UnalignedVec4 = OpTypePointer PhysicalStorageBuffer %UnalignedVec4

Load a physical pointer from PushConstant.

%55 = OpAccessChain %_ptr_PushConstant__ptr_PhysicalStorageBuffer_WriteVec4 %registers %int_1
%56 = OpLoad %_ptr_PhysicalStorageBuffer_WriteVec4 %55

Access chain into it.

%66 = OpAccessChain %_ptr_PhysicalStorageBuffer_v4float %56 %int_0 %40

Aligned must be specified when dereferencing physical pointers. Pointers can have any arbitrary address and must be explicitly aligned, so the compiler knows what to do.

OpStore %66 %65 Aligned 16
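
To see why the declared alignment matters, consider the GLSL example earlier, which offsets the source address by 20 * index: such an address is always 4-byte aligned but usually not 16-byte aligned, which is why it is dereferenced through a type with buffer_reference_align = 4. The check itself can be sketched in C; is_aligned is a hypothetical helper and the addresses are plain uint64_t values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if addr is a multiple of alignment.
 * alignment must be a non-zero power of two. */
static bool is_aligned(uint64_t addr, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}
```

With a 16-byte-aligned base, is_aligned(base + 20, 4) holds while is_aligned(base + 20, 16) does not, so an access through that address must not claim Aligned 16.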

For pointers, SPIR-V can bitcast between integers and pointers seamlessly, for example:

%61 = OpLoad %_ptr_PhysicalStorageBuffer_ReadVec4 %60
%70 = OpBitcast %v2uint %61

; Do math on %70
%86 = OpBitcast %_ptr_PhysicalStorageBuffer_UnalignedVec4 %some_address

Conclusion

We have already explored two key Vulkan extension game changers through this blog and the previous one. The third and final part of this game changer blog series will explore ‘Timeline Semaphores’ and how developers can use this new extension to improve the development experience and enhance their games.

Follow Up

Thanks to Hans-Kristian Arntzen and the team at Arm for bringing this great content to the Samsung Developers community. We hope you find this information about Vulkan extensions useful for developing your upcoming mobile games.

The Samsung Developers site has many resources for developers looking to build for and integrate with Samsung devices and services. Stay in touch with the latest news by creating a free account or by subscribing to our monthly newsletter. Visit the Marketing Resources page for information on promoting and distributing your apps and games. Finally, our developer forum is an excellent way to stay up-to-date on all things related to the Galaxy ecosystem.
