New Game Changing Vulkan Extensions for Mobile: Timeline Semaphores


The Samsung Developers team works with many companies in the mobile and gaming ecosystems. We're excited to support our partner, Arm, as they bring timely and relevant content to developers looking to build games and high-performance experiences. This Vulkan Extensions series will help developers get the most out of the new and game-changing Vulkan extensions on Samsung mobile devices.

In previous blogs, we have already explored two key Vulkan extension game changers that will be enabled by Android R. These are Descriptor Indexing and Buffer Device Address. In this blog, we explore the third and final game changer, which is 'Timeline Semaphores'.

The introduction of timeline semaphores is a large improvement to the synchronization model of Vulkan and is a required feature in Vulkan 1.2. It solves some fundamental grievances with the existing synchronization APIs in Vulkan.

The problems with VkFence and VkSemaphore

Before timeline semaphores, Vulkan had two distinct synchronization objects: one for GPU -> CPU synchronization and one for GPU queue <-> GPU queue synchronization.

The VkFence object only deals with GPU -> CPU synchronization. Due to the explicit nature of Vulkan, you must keep track of when the GPU completes the work you submit to it.

vkQueueSubmit(queue, …, fence);

This is how we use a fence: we pass it to vkQueueSubmit and wait on it later. When the fence signals, we know it is safe to free resources, read back data written by the GPU, and so on. Overall, the VkFence interface was never a real problem in practice, except that it feels strange to have two entirely different API objects which essentially do the same thing.

VkSemaphore, on the other hand, has some quirks which make it difficult to use properly in sophisticated applications. VkSemaphore by default is a binary semaphore. The fundamental problem with binary semaphores is that we can only wait for a semaphore once. After we have waited for it, it automatically becomes unsignaled again. This binary nature is very annoying to deal with when we use multiple queues. For example, consider a scenario where we perform some work in the graphics queue, and want to synchronize that work with two different compute queues. If we know this scenario is coming up, we have to allocate two VkSemaphore objects, signal both objects, and wait for each of them in the different compute queues. This works, but we might not know up front that this scenario will play out. Often, when we are dealing with multiple queues, we have to be somewhat conservative and signal semaphore objects we never end up waiting for. This leads to another problem …

A signaled semaphore, which is never waited for, is basically a dead and useless semaphore and should be destroyed. We cannot reset a VkSemaphore object on the CPU, so we cannot ever signal it again if we want to recycle VkSemaphore objects. A workaround would be to wait for the semaphore on the GPU in a random queue just to unsignal it, but this feels like a gross hack. It could also potentially cause performance issues, as waiting for a semaphore is a full GPU memory barrier.

Object bloat is another considerable pitfall of the existing APIs. For every synchronization point we need, we require a new object. All these objects must be managed, and their lifetimes must be considered. This creates a lot of annoying “bloat” for engines.

The timeline – fixing object bloat – fixing multiple waits

The first observation we can make of a Vulkan queue is that submissions should generally complete in-order. To signal a synchronization object in vkQueueSubmit, the GPU waits for all previously submitted work to the queue, which includes the signaling operation of previous synchronization objects. Rather than assigning one object per submission, we synchronize in terms of number of submissions. A plain uint64_t counter can be used for each queue. When a submission completes, the number is monotonically increased, usually by one each time. This counter is contained inside a single timeline semaphore object. Rather than waiting for a specific synchronization object which matches a particular submission, we could wait for a single object and specify “wait until graphics queue submission #157 completes.”

We can wait for any value multiple times as we wish, so there is no binary semaphore problem. Essentially, for each VkQueue we can create a single timeline semaphore on startup and leave it alone (uint64_t will not overflow until the heat death of the sun, do not worry about it). This is extremely convenient and makes it so much easier to implement complicated dependency management schemes.
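The difference between the two semaphore flavours can be illustrated with a small CPU-side model. This is only a sketch: the types and functions below (BinarySemaphoreModel, TimelineModel, and their helpers) are hypothetical names made up for illustration and are not part of the Vulkan API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy CPU-side model of the two semaphore flavours; not Vulkan code. */
typedef struct { bool signaled; } BinarySemaphoreModel;
typedef struct { uint64_t value; } TimelineModel;

/* A binary wait succeeds at most once per signal: it consumes the signal. */
bool binary_try_wait(BinarySemaphoreModel *s)
{
    if (!s->signaled)
        return false;
    s->signaled = false;
    return true;
}

/* The timeline counter only ever moves forward. Vulkan requires each signal
 * to be greater than the current value; this model simply ignores others. */
void timeline_signal(TimelineModel *t, uint64_t value)
{
    if (value > t->value)
        t->value = value;
}

/* A timeline wait is a pure comparison, so it never consumes anything and
 * can be repeated by any number of waiters. */
bool timeline_reached(const TimelineModel *t, uint64_t value)
{
    return t->value >= value;
}
```

Once the model's counter reaches 157, every caller asking for 157 or any lower value is satisfied, however often they ask. The binary model satisfies only the first waiter after each signal.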

Unifying VkFence and VkSemaphore

Timeline semaphores can be used very effectively on the CPU as well:

VkSemaphoreWaitInfoKHR info = { VK_STRUCTURE_TYPE_SEMAPHORE_WAIT_INFO_KHR };
info.semaphoreCount = 1;
info.pSemaphores = &semaphore;
info.pValues = &value;
vkWaitSemaphoresKHR(device, &info, timeout);

This completely removes the need to use VkFence. Another advantage of this method is that multiple threads can wait for a timeline semaphore. With VkFence, only one thread could access a VkFence at any one time.
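One place where this helps is deferred resource release. Instead of one fence per frame, each resource can be tagged with the timeline value of the submission that last used it. The sketch below assumes the application reads the current counter elsewhere (for example with vkGetSemaphoreCounterValueKHR) and passes it in as completed_value. ReleaseQueue, PendingRelease, and both functions are hypothetical helpers, not Vulkan API.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 64

/* Hypothetical bookkeeping: a resource id and the timeline value of the
 * submission that last used it. */
typedef struct {
    int      resource_id;
    uint64_t retire_value;
} PendingRelease;

typedef struct {
    PendingRelease items[MAX_PENDING];
    size_t         count;
} ReleaseQueue;

/* Records that resource_id may be freed once the timeline reaches
 * retire_value. Returns 0 on success, -1 if the queue is full. */
int release_queue_push(ReleaseQueue *q, int resource_id, uint64_t retire_value)
{
    if (q->count == MAX_PENDING)
        return -1;
    q->items[q->count].resource_id = resource_id;
    q->items[q->count].retire_value = retire_value;
    q->count++;
    return 0;
}

/* Moves every entry whose submission has completed into out_ids and keeps
 * the rest queued, preserving order. Returns the number of released ids. */
size_t release_queue_collect(ReleaseQueue *q, uint64_t completed_value,
                             int *out_ids, size_t out_capacity)
{
    size_t released = 0, kept = 0;
    for (size_t i = 0; i < q->count; i++) {
        if (q->items[i].retire_value <= completed_value && released < out_capacity)
            out_ids[released++] = q->items[i].resource_id;
        else
            q->items[kept++] = q->items[i];
    }
    q->count = kept;
    return released;
}
```

Because the check is a plain comparison against one counter, it can run from any thread and at any time without resetting or recycling synchronization objects.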

A timeline semaphore can also be signaled from the CPU, although this feature feels somewhat niche. It allows use cases where you submit work to the GPU early, but then 'kick' the submission using vkSignalSemaphoreKHR. The accompanying sample demonstrates a particular scenario where this function might be useful:

VkSemaphoreSignalInfoKHR info = { VK_STRUCTURE_TYPE_SEMAPHORE_SIGNAL_INFO_KHR };
info.semaphore = semaphore;
info.value = value;
vkSignalSemaphoreKHR(device, &info);

Creating a timeline semaphore

When creating a semaphore, you can specify the type of semaphore and give it an initial value:

VkSemaphoreCreateInfo info = { VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO };
VkSemaphoreTypeCreateInfoKHR type_info = { VK_STRUCTURE_TYPE_SEMAPHORE_TYPE_CREATE_INFO_KHR };
type_info.semaphoreType = VK_SEMAPHORE_TYPE_TIMELINE_KHR;
type_info.initialValue = 0;
info.pNext = &type_info;
vkCreateSemaphore(device, &info, NULL, &semaphore);

Signaling and waiting on timeline semaphores

When submitting work with vkQueueSubmit, you can chain another struct which provides counter values when using timeline semaphores, for example:

VkSubmitInfo submit = { VK_STRUCTURE_TYPE_SUBMIT_INFO };
submit.waitSemaphoreCount = 1;
submit.pWaitSemaphores = &compute_queue_semaphore;
submit.pWaitDstStageMask = &wait_stage;
submit.commandBufferCount = 1;
submit.pCommandBuffers = &cmd;
submit.signalSemaphoreCount = 1;
submit.pSignalSemaphores = &graphics_queue_semaphore;
VkTimelineSemaphoreSubmitInfoKHR timeline = { VK_STRUCTURE_TYPE_TIMELINE_SEMAPHORE_SUBMIT_INFO_KHR };
timeline.waitSemaphoreValueCount = 1;
timeline.pWaitSemaphoreValues = &wait_value;
timeline.signalSemaphoreValueCount = 1;
timeline.pSignalSemaphoreValues = &signal_value;
submit.pNext = &timeline;
signal_value++; // Generally, you bump the timeline value once per submission.
vkQueueSubmit(queue, 1, &submit, VK_NULL_HANDLE);
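One possible way to pick wait_value and signal_value is to keep a small counter next to each queue's timeline semaphore. The following is a minimal sketch under that assumption. QueueTimeline, TimelineValues, and timeline_values_for_submit are hypothetical names, and the returned values are what would be passed through pWaitSemaphoreValues and pSignalSemaphoreValues.

```c
#include <stdint.h>

/* Hypothetical per-queue state: the last value handed out for signaling.
 * It starts at 0, matching a semaphore created with initialValue = 0. */
typedef struct {
    uint64_t last_submitted;
} QueueTimeline;

/* Values intended for pWaitSemaphoreValues / pSignalSemaphoreValues. */
typedef struct {
    uint64_t wait_value;
    uint64_t signal_value;
} TimelineValues;

/* Reserves the next signal value on `self` and waits for everything that has
 * been submitted so far on `dependency`. */
TimelineValues timeline_values_for_submit(QueueTimeline *self,
                                          const QueueTimeline *dependency)
{
    TimelineValues v;
    v.wait_value = dependency->last_submitted;
    v.signal_value = ++self->last_submitted;
    return v;
}
```

A wait value of 0 is satisfied immediately when the semaphore starts at 0, so the first submission on a queue does not need special handling in this scheme.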

Out of order signal and wait

A strong requirement of Vulkan binary semaphores is that signals must be submitted before a wait on a semaphore can be submitted. This makes it easy to guarantee that deadlocks do not occur on the GPU, but it is also somewhat inflexible. In an application with many Vulkan queues and a task-based architecture, it is reasonable to submit work that is somewhat out of order. However, this still uses synchronization objects to ensure the right ordering when executing on the GPU. With timeline semaphores, the application can agree on the timeline values to use ahead of time, then go ahead and build commands and submit out of order. The driver is responsible for figuring out the submission order required to make it work. However, the application gets more ways to shoot itself in the foot with this approach. This is because it is possible to create a deadlock with multiple queues where queue A waits for queue B, and queue B waits for queue A at the same time.
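The deadlock risk can be checked on the CPU before submitting anything. The sketch below is a toy model, not driver behavior: it replays a list of planned submissions, lets each queue run in order, and reports whether every submission can eventually complete. Submission, QUEUE_COUNT, and submissions_can_complete are hypothetical names, and the model assumes each submission waits on at most one other timeline and that all queue indices are below QUEUE_COUNT.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_COUNT     2
#define MAX_SUBMISSIONS 64

/* One planned submission: it runs on `queue`, waits until the timeline of
 * `wait_queue` reaches `wait_value` (0 means no real wait), then signals
 * `signal_value` on its own queue's timeline. */
typedef struct {
    int      queue;
    int      wait_queue;
    uint64_t wait_value;
    uint64_t signal_value;
} Submission;

/* Returns true if all submissions can complete. Each queue executes its own
 * submissions in array order; different queues may interleave freely. */
bool submissions_can_complete(const Submission *subs, size_t count)
{
    uint64_t timeline[QUEUE_COUNT] = { 0 };
    bool done[MAX_SUBMISSIONS] = { false };
    size_t remaining = count;
    bool progress = true;

    if (count > MAX_SUBMISSIONS)
        return false;

    while (remaining > 0 && progress) {
        progress = false;
        for (int q = 0; q < QUEUE_COUNT; q++) {
            for (size_t i = 0; i < count; i++) {
                if (subs[i].queue != q || done[i])
                    continue;
                /* subs[i] is the oldest unfinished submission on queue q. */
                if (timeline[subs[i].wait_queue] >= subs[i].wait_value) {
                    timeline[q] = subs[i].signal_value;
                    done[i] = true;
                    remaining--;
                    progress = true;
                }
                break; /* in-order: later work on q cannot overtake it */
            }
        }
    }
    return remaining == 0;
}
```

In this model, a wait that is submitted before its matching signal still completes, which is the out-of-order case. Two queues whose first submissions each wait for the other's first signal never make progress, which is the deadlock described above.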

Ease of porting

It is no secret that timeline semaphores are inherited largely from D3D12’s fence objects. From a portability angle, timeline semaphores make it much easier to have compatibility across the APIs.

Caveats

As the specification stands right now, you cannot use timeline semaphores with swap chains. This is generally not a big problem as synchronization with the swap chain tends to be explicit operations renderers need to take care of.

Another potential caveat to consider is that the timeline semaphore might not have a direct kernel equivalent on current platforms, which means some extra emulation to handle it, especially the out-of-order submission feature. As the timeline synchronization model becomes the de-facto standard, I expect platforms to get more native support for it.

Conclusion

All three key Vulkan extension game changers improve the overall development and gaming experience through improving graphics and enabling new gaming use cases. We hope that we gave you enough samples to get you started as you try out these new Vulkan extensions to help bring your games to life.

Follow Up

Thanks to Hans-Kristian Arntzen and the team at Arm for bringing this great content to the Samsung Developers community. We hope you find this information about Vulkan extensions useful for developing your upcoming mobile games.

The Samsung Developers site has many resources for developers looking to build for and integrate with Samsung devices and services. Stay in touch with the latest news by creating a free account or by subscribing to our monthly newsletter. Visit the Marketing Resources page for information on promoting and distributing your apps and games. Finally, our developer forum is an excellent way to stay up-to-date on all things related to the Galaxy ecosystem.
