New Vulkan Extensions for Mobile: Legacy Support Extensions


The Samsung Developers team works with many companies in the mobile and gaming ecosystems. We're excited to support our partner, Arm, as they bring timely and relevant content to developers looking to build games and high-performance experiences. This Vulkan Extensions series will help developers get the most out of the new and game-changing Vulkan extensions on Samsung mobile devices.

Android is enabling a host of useful new Vulkan extensions for mobile. These new extensions are set to improve the state of graphics APIs for modern applications, enabling new use cases and changing how developers can design graphics renderers going forward. I have already covered the ‘maintenance extensions’. Another important group, which I explore in this blog, is the ‘legacy support extensions’.

Vulkan is increasingly being used as a portable “HAL”. The power and flexibility of the API allows for great layered implementations. There is a lot of effort spent in the ecosystem enabling legacy graphics APIs to run efficiently on top of Vulkan. The bright future for driver developers is a world where GPU drivers only implement Vulkan, and where legacy APIs can be implemented on top of that driver.

To that end, there are several features which are generally considered backwards today. They should not be used in new applications unless absolutely required. These extensions exist to facilitate old applications which need to keep running through API translation layers such as ANGLE, DXVK, Zink, and so on.

VK_EXT_transform_feedback

Speaking the name of this extension causes the general angst level to rise in a room of driver developers. In the world of Direct3D, this feature is also known as stream-out.

The core feature of this extension is that whenever you render geometry, you can capture the resulting geometry data (position and vertex outputs) into a buffer. The key complication from an implementation point of view is that the result is ordered. This means there is no 1:1 relation for input vertex to output data since this extension is supposed to work with indexed rendering, as well as strip types (and even geometry shaders and tessellation, oh my!).

This feature was invented in a world before compute shaders were conceived. The only real method to perform buffer <-> buffer computation was to make use of transform feedback, vertex shaders and rasterizer discard (rasterizerDiscardEnable in Vulkan). Over time, the functionality of Transform Feedback was extended in various ways, but today it is essentially obsoleted by compute shaders.

There are, however, two niches where this extension still makes sense - graphics debuggers and API translation layers. Transform Feedback is extremely difficult to emulate in the more complicated cases.

Setting up shaders

In vertex-like shader stages, you need to set up which vertex outputs to capture to a buffer. The shader itself controls the memory layout of the output data. This is unlike other APIs, where you use the graphics API to specify which outputs to capture based on the name of the variable.

Here is an example Vulkan GLSL shader:

#version 450

layout(xfb_stride = 32, xfb_offset = 0, xfb_buffer = 0, location = 0)
out vec4 vColor;
layout(xfb_stride = 32, xfb_offset = 16, xfb_buffer = 0, location = 1)
out vec4 vColor2;

layout(xfb_buffer = 1, xfb_stride = 16) out gl_PerVertex {
    layout(xfb_offset = 0) vec4 gl_Position;
};

void main()
{
	gl_Position = vec4(1.0);
	vColor = vec4(2.0);
	vColor2 = vec4(3.0);
}

The resulting SPIR-V will then look something like:

Capability TransformFeedback
ExecutionMode 4 Xfb
Decorate 8(gl_PerVertex) Block
Decorate 10 XfbBuffer 1
Decorate 10 XfbStride 16
Decorate 17(vColor) Location 0
Decorate 17(vColor) XfbBuffer 0
Decorate 17(vColor) XfbStride 32
Decorate 17(vColor) Offset 0
Decorate 20(vColor2) Location 1
Decorate 20(vColor2) XfbBuffer 0
Decorate 20(vColor2) XfbStride 32
Decorate 20(vColor2) Offset 16
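
To make the memory layout concrete, here is a minimal C sketch of how the captured data from the shader above could be read back on the CPU. The struct and function names are hypothetical and not part of Vulkan. The sketch assumes a 4-byte float and simply mirrors the xfb_stride and xfb_offset values chosen in the shader.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU-side mirror of one captured vertex in transform
 * feedback buffer 0, following the shader's layout qualifiers:
 * xfb_stride = 32, vColor at xfb_offset 0, vColor2 at xfb_offset 16. */
typedef struct CapturedVertex {
    float vColor[4];  /* bytes  0..15 */
    float vColor2[4]; /* bytes 16..31 */
} CapturedVertex;

/* Buffer 1 only receives gl_Position: xfb_stride = 16, xfb_offset = 0. */
typedef struct CapturedPosition {
    float gl_Position[4]; /* bytes 0..15 */
} CapturedPosition;

/* Byte offset of the n-th captured vertex in a buffer with the given
 * stride, relative to where the capture started. */
static size_t captured_vertex_offset(size_t n, size_t xfb_stride)
{
    return n * xfb_stride;
}
```

Because the shader controls the layout, changing xfb_stride or xfb_offset in GLSL means any such CPU-side mirror has to change with it.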

Binding transform feedback buffers

Once we have a pipeline which can emit transform feedback data, we need to bind buffers:

vkCmdBindTransformFeedbackBuffersEXT(cmd,
	firstBinding, bindingCount,
	pBuffers, pOffsets, pSizes);

A buffer that receives captured data must be created with the VK_BUFFER_USAGE_TRANSFORM_FEEDBACK_BUFFER_BIT_EXT usage flag.

Starting and stopping capture

Once we know where to write the vertex output data, we will begin and end captures. This needs to be done inside a render pass:

vkCmdBeginTransformFeedbackEXT(cmd,
	firstCounterBuffer, counterBufferCount,
	pCounterBuffers, pCounterBufferOffsets);

A counter buffer allows us to handle scenarios where we end a transform feedback and continue capturing later. We would not necessarily know how many bytes were written by the last transform feedback, so it is critical that we can let the GPU maintain a byte counter for us.

vkCmdDraw(cmd, …);
vkCmdDrawIndexed(cmd, …);

Then we can start rendering. Vertex outputs are captured to the buffers in-order.

vkCmdEndTransformFeedbackEXT(cmd,
	firstCounterBuffer, counterBufferCount,
	pCounterBuffers, pCounterBufferOffsets);

Once we are done capturing, we end the transform feedback. If counter buffers are provided, the GPU writes the new buffer offsets into them.
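
The pause-and-resume behavior can be illustrated with a toy CPU model. This is only a sketch: the XfbModel type and the xfb_* functions are hypothetical names, the real counter lives in GPU memory, and overflow handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of one transform feedback binding (hypothetical type). */
typedef struct XfbModel {
    uint32_t write_offset; /* byte offset where the next vertex lands */
} XfbModel;

/* Models vkCmdBeginTransformFeedbackEXT: without a counter buffer the
 * capture starts at byte 0, with one it resumes at the stored offset. */
static void xfb_begin(XfbModel *xfb, const uint32_t *counter)
{
    xfb->write_offset = (counter != NULL) ? *counter : 0;
}

/* Models a draw that captures vertex_count vertices of stride bytes. */
static void xfb_capture(XfbModel *xfb, uint32_t vertex_count, uint32_t stride)
{
    xfb->write_offset += vertex_count * stride;
}

/* Models vkCmdEndTransformFeedbackEXT: if a counter buffer is given,
 * the current byte offset is stored so a later capture can resume. */
static void xfb_end(const XfbModel *xfb, uint32_t *counter)
{
    if (counter != NULL)
        *counter = xfb->write_offset;
}
```

In this model, capturing 3 vertices with a 32-byte stride leaves 96 in the counter, and a second capture that begins with the same counter continues at byte 96 instead of overwriting the earlier data.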

Indirectly drawing transform feedback results

This feature is a precursor to the more flexible indirect draw feature we have in Vulkan, but there was a time when this feature was the only efficient way to render transform feedbacked outputs. The fundamental problem is that we do not necessarily know exactly how many primitives have been rendered. Therefore, to avoid stalling the CPU, it was required to be able to indirectly render the results with a special purpose API.

vkCmdDrawIndirectByteCountEXT(cmd,
	instanceCount, firstInstance,
	counterBuffer, counterBufferOffset,
	counterOffset, vertexStride);

This works similarly to a normal indirect draw call, but instead of providing a vertex count, we give it a byte count and let the GPU perform the divide instead. This is nice, as otherwise we would have to dispatch a tiny compute kernel that converts a byte count to an indirect draw.
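
The divide that the GPU performs can be written out as a small C sketch. The function name is hypothetical. The parameters correspond to the value stored in the counter buffer and to the counterOffset and vertexStride arguments of the call above.

```c
#include <assert.h>
#include <stdint.h>

/* Vertex count derived from a transform feedback byte counter.
 * counter_value is the byte count read from the counter buffer,
 * counter_offset is subtracted first, and the remainder is divided
 * by the stride of one captured vertex (rounding down). */
static uint32_t xfb_vertex_count(uint32_t counter_value,
                                 uint32_t counter_offset,
                                 uint32_t vertex_stride)
{
    assert(vertex_stride != 0);
    assert(counter_value >= counter_offset);
    return (counter_value - counter_offset) / vertex_stride;
}
```

For the 32-byte stride used earlier, a counter value of 96 with a counterOffset of 0 gives 3 vertices.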

Queries

The offset counter is sort of like a query, but if the transform feedback buffers overflow, any further writes are ignored. A VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT query reports two values: how many primitives were actually written, and how many primitives the draws attempted to write. Comparing the two makes it possible to detect overflow if that is desirable.
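
Overflow detection then comes down to comparing the two query values. Here is a minimal sketch, assuming the results were read back as 64-bit integers with the written count first and the attempted count second. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The two values of one transform feedback stream query
 * (hypothetical field names). */
typedef struct XfbStreamQueryResult {
    uint64_t primitives_written; /* actually captured to the buffer */
    uint64_t primitives_needed;  /* attempted, captured or not      */
} XfbStreamQueryResult;

/* The capture overflowed if fewer primitives were written than the
 * draws tried to write. */
static bool xfb_overflowed(const XfbStreamQueryResult *r)
{
    return r->primitives_written < r->primitives_needed;
}

/* Number of primitives lost to overflow (0 if everything fit). */
static uint64_t xfb_dropped_primitives(const XfbStreamQueryResult *r)
{
    return xfb_overflowed(r)
        ? r->primitives_needed - r->primitives_written
        : 0;
}
```

An application could use the dropped count to size a larger buffer before retrying the capture.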

VK_EXT_line_rasterization

Line rasterization is a tricky subject. Lines are not normally used in gaming applications, since they do not scale with resolution and their exact behavior is not consistent across all GPU implementations.

In the world of CAD, however, this feature is critical, and older OpenGL APIs had extensive support for quite fancy line rendering methods. This extension essentially brings back those workstation features. Advanced line rendering can occasionally be useful for debug tooling and visualization as well.

The feature zoo

typedef struct VkPhysicalDeviceLineRasterizationFeaturesEXT {
	VkStructureType    sType;
	void*              pNext;
	VkBool32           rectangularLines;
	VkBool32           bresenhamLines;
	VkBool32           smoothLines;
	VkBool32           stippledRectangularLines;
	VkBool32           stippledBresenhamLines;
	VkBool32           stippledSmoothLines;
} VkPhysicalDeviceLineRasterizationFeaturesEXT;

This extension supports a lot of different feature bits. I will try to summarize what they mean below.

Rectangular lines vs parallelogram

When rendering normal lines in core Vulkan, there are two ways lines can be rendered. If VkPhysicalDeviceLimits::strictLines is true, a line is rendered as if the line is a true, oriented rectangle. This is essentially what you would get if you rendered a scaled and rotated rectangle yourself. The hardware just expands the line along the perpendicular axis of the line axis.

In non-strict rendering, we get a parallelogram. The line is extended either in X or Y directions.

(Figure: rectangular versus parallelogram line rasterization, from the Vulkan specification)

Bresenham lines

Bresenham lines reformulate the line rendering algorithm: each pixel has a diamond-shaped area around its center, and coverage is determined by whether the line intersects and then exits that area. The advantage here is that rendering line strips avoids overdraw. Rectangle or parallelogram rendering does not guarantee this, which matters if you are rendering line strips with blending enabled.

(Figure: Bresenham line rasterization with per-pixel diamonds, from the Vulkan specification)

Smooth lines

Smooth lines work like rectangular lines, except the implementation can render a little further out to create a smooth edge. Exact behavior is also completely unspecified, and we find the only instance of the word “aesthetic” in the entire specification, which is amusing. This is a wonderfully vague word to see in the Vulkan specification, which is otherwise no-nonsense normative.

This feature is designed to work in combination with alpha blending since the smooth coverage of the line rendering is multiplied into the alpha channel of render target 0’s output.

Line stipple

A “classic” feature that will make most IHVs cringe a little. When rendering a line, it is possible to mask certain pixels in a pattern. A counter runs while rasterizing pixels in order and with line stipple you control a divider and mask which generates a fixed pattern for when to discard pixels. It is somewhat unclear if this feature is really needed when it is possible to use discard in the fragment shader, but alas, legacy features from the early 90s are sometimes used. There were no shaders back in those days.
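
The counter, divider and mask described above can be sketched in a few lines of C. The function name is hypothetical, and the sketch assumes the simple case where the counter advances once per rasterized pixel, as with Bresenham lines. Bit 0 of the pattern is the least significant bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decides whether the pixel at position `counter` along the line is
 * kept. Each pattern bit is repeated `factor` times, and the 16-bit
 * pattern wraps around after 16 * factor pixels. */
static bool stipple_keeps_pixel(uint32_t counter, uint32_t factor,
                                uint16_t pattern)
{
    assert(factor >= 1);
    uint32_t bit = (counter / factor) % 16;
    return ((pattern >> bit) & 1u) != 0;
}
```

With a pattern of 0x00FF and a factor of 1, the first 8 pixels of every 16 are drawn and the next 8 are discarded. Doubling the factor stretches that to 16 drawn and 16 discarded.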

Configuring rasterization pipeline state

When creating a graphics pipeline, you can pass in some more data in pNext of rasterization state:

typedef struct VkPipelineRasterizationLineStateCreateInfoEXT {
	VkStructureType               sType;
	const void*                   pNext;
	VkLineRasterizationModeEXT    lineRasterizationMode;
	VkBool32                      stippledLineEnable;
	uint32_t                      lineStippleFactor;
	uint16_t                      lineStipplePattern;
} VkPipelineRasterizationLineStateCreateInfoEXT;

typedef enum VkLineRasterizationModeEXT {
    VK_LINE_RASTERIZATION_MODE_DEFAULT_EXT = 0,
    VK_LINE_RASTERIZATION_MODE_RECTANGULAR_EXT = 1,
    VK_LINE_RASTERIZATION_MODE_BRESENHAM_EXT = 2,
    VK_LINE_RASTERIZATION_MODE_RECTANGULAR_SMOOTH_EXT = 3,
} VkLineRasterizationModeEXT;

If line stipple is enabled, the line stipple factor and pattern can be baked into the pipeline, or be made dynamic pipeline state using VK_DYNAMIC_STATE_LINE_STIPPLE_EXT.

In the case of dynamic line stipple, the line stipple factor and pattern can be modified dynamically with:

vkCmdSetLineStippleEXT(cmd, factor, pattern);

VK_EXT_index_type_uint8

In OpenGL and OpenGL ES, we have support for 8-bit index buffers. Core Vulkan and Direct3D however only support 16-bit and 32-bit index buffers. Since emulating index buffer formats is impractical with indirect draw calls being a thing, we need to be able to bind 8-bit index buffers. This extension does just that.

This is probably the simplest extension we have looked at so far:

vkCmdBindIndexBuffer(cmd, indexBuffer, offset, VK_INDEX_TYPE_UINT8_EXT);
vkCmdDrawIndexed(cmd, …);
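
For comparison, this is roughly what a translation layer would have to do without the extension: widen every 8-bit index to 16 bits on the CPU. The function below is a hypothetical sketch, not taken from any particular layer. It assumes the usual primitive restart values of 0xFF for 8-bit and 0xFFFF for 16-bit indices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Widens 8-bit indices to 16-bit indices. When primitive restart is
 * enabled, the 8-bit restart value 0xFF must become the 16-bit
 * restart value 0xFFFF rather than the ordinary index 255. */
static void widen_indices_u8_to_u16(const uint8_t *src, uint16_t *dst,
                                    size_t count, int primitive_restart)
{
    for (size_t i = 0; i < count; i++) {
        if (primitive_restart && src[i] == 0xFF)
            dst[i] = 0xFFFF;
        else
            dst[i] = src[i];
    }
}
```

The catch is the count argument: with an indirect draw, the index range is only known on the GPU, so the CPU cannot tell which part of the buffer to convert. Binding the 8-bit buffer directly avoids the problem.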

Conclusion

I have been through the 'maintenance' and 'legacy support' extensions that are part of the new Vulkan extensions for mobile. In the next three blogs, I will go through what I see as the 'game-changing' extensions from Vulkan - the three that will help to transform your games during the development process.

Follow Up

Thanks to Hans-Kristian Arntzen and the team at Arm for bringing this great content to the Samsung Developers community. We hope you find this information about Vulkan extensions useful for developing your upcoming mobile games. The original version of this article can be viewed at Arm Community.

The Samsung Developers site has many resources for developers looking to build for and integrate with Samsung devices and services. Stay in touch with the latest news by creating a free account or by subscribing to our monthly newsletter. Visit the Marketing Resources page for information on promoting and distributing your apps and games. Finally, our developer forum is an excellent way to stay up-to-date on all things related to the Galaxy ecosystem.
