rickrj
On NVIDIA's Tile-Based Rendering
Looking back on NVIDIA's GDC presentation, perhaps one of the most interesting aspects covered was the implementation of tile-based rendering in NVIDIA's post-Maxwell architectures. This is an adaptation of approaches typical of mobile graphics rendering, which are designed with power efficiency in mind - and if you'll "member", "Maxwell" was NVIDIA's first graphics architecture publicly touted for its "mobile first" design.
This approach essentially divides the screen into tiles, and then rasterizes the entire frame on a per-tile basis. 16×16 and 32×32 pixels are the usual tile sizes, but both Maxwell and Pascal can dynamically assess the required tile size for each frame, changing it on the fly according to the complexity of the scene. This ensures that the data being processed has a much smaller footprint than that of the full frame - small enough that NVIDIA can keep it in a much smaller pool of memory (essentially, the L2 cache), dynamically filling and flushing the available cache as needed until the full frame has been rendered. This means the GPU doesn't have to access larger, slower memory pools as often, which reduces the load on the VRAM subsystem (freeing up memory bandwidth for other tasks) whilst simultaneously accelerating rendering. At the same time, a tile-based approach lends itself pretty well to the nature of GPUs - these are easily parallelized operations, with the GPU able to tackle many independent tiles simultaneously, depending on the available resources.
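To make the idea concrete, here's a minimal, illustrative sketch of the per-tile loop - my own toy code, not NVIDIA's hardware pipeline. Primitives are simplified to axis-aligned colored boxes; the point is the memory-access pattern (all work for a tile stays in a small, cache-sized buffer, with one bulk write-out per tile), not the geometry.

```python
TILE = 32  # assumed tile size in pixels; 16x16 and 32x32 are typical

def overlaps(box, tx, ty, size):
    # Does this primitive touch the tile starting at (tx, ty)?
    x0, y0, x1, y1, _ = box
    return x0 < tx + size and x1 > tx and y0 < ty + size and y1 > ty

def rasterize_tiled(width, height, boxes):
    frame = [[0] * width for _ in range(height)]
    for ty in range(0, height, TILE):
        for tx in range(0, width, TILE):
            # All intermediate work for this tile fits in a small buffer,
            # standing in for the on-chip L2-resident tile storage.
            tile_buf = [[0] * TILE for _ in range(TILE)]
            for box in (b for b in boxes if overlaps(b, tx, ty, TILE)):
                x0, y0, x1, y1, color = box
                for y in range(max(y0, ty), min(y1, ty + TILE)):
                    for x in range(max(x0, tx), min(x1, tx + TILE)):
                        tile_buf[y - ty][x - tx] = color
            # One bulk write-out per tile instead of scattered VRAM accesses.
            for y in range(TILE):
                frame[ty + y][tx:tx + TILE] = tile_buf[y]
    return frame

frame = rasterize_tiled(64, 64, [(8, 8, 40, 40, 1)])  # one 32x32 box
```

Note the independence of the tile iterations: each `(tx, ty)` pass touches only its own buffer, which is exactly what makes the scheme easy to parallelize across a GPU's resources.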
Thanks to NVIDIA's public acknowledgement of its use of tile-based rendering starting with the Maxwell architecture, some design decisions on Maxwell now make much more sense. Below is a screenshot taken from NVIDIA's "5 Things You Should Know About the New Maxwell GPU Architecture". Take a look at the L2 cache size. From Kepler to Maxwell, the cache grew 8x, from 256 KB on Kepler to 2048 KB on Maxwell. We can now attribute this gigantic leap to the need for a larger L2 cache that can fit the resources required by the tile-based rasterizing process, which is what allowed NVIDIA the leap in memory performance and power efficiency it achieved with Maxwell over its Kepler predecessor. Incidentally, NVIDIA's GP102 chip (which powers the TITAN X and the upcoming, recently announced GTX 1080 Ti) doubles that amount of L2 cache again, to a staggering 4096 KB. Whether or not Volta will continue this L2 scaling remains to be seen, but I've seen worse bets.
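As a back-of-envelope sanity check (using assumed per-pixel formats, not vendor-confirmed numbers), a tile's working set really is tiny compared to those L2 sizes:

```python
def tile_bytes(tile_px, color_bpp=4, depth_bpp=4):
    # Assume 32-bit color plus 32-bit depth per pixel.
    return tile_px * tile_px * (color_bpp + depth_bpp)

kb = 1024
print(tile_bytes(16) / kb)  # 16x16 tile -> 2.0 KB
print(tile_bytes(32) / kb)  # 32x32 tile -> 8.0 KB

# Maxwell's 2048 KB L2 could hold on the order of 256 such 32x32
# tile buffers at once, leaving plenty of room to batch tiles.
print(2048 * kb // tile_bytes(32))  # -> 256
```

Under these assumptions, even Kepler's 256 KB could hold a handful of tiles - the 8x jump buys the headroom to keep many tiles (and their associated binned geometry) resident at once.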
An interesting tangent: the ESRAM chips on the Xbox 360 and Xbox One (paired with AMD-designed GPUs, no less) can serve as a substitute for the tile-based rasterization process that post-Maxwell NVIDIA GPUs employ.
Tile-based rendering seems to have been a key part of NVIDIA's secret sauce for achieving the impressive performance-per-watt ratings of its last two architectures, and it's expected that this approach will only improve with time. Some differences can already be seen between Maxwell's and Pascal's tile-based rendering, with the former dividing the scene into triangles and the latter breaking a scene up into squares or vertical rectangles as needed, which means NVIDIA has in fact put some measure of work into the rendering system between these two architectures.
Perhaps we have already seen some seeds of this tile-based rendering in AMD's Vega architecture sneak peek, particularly in regards to its next-generation Pixel Engine: the render back-ends becoming clients of the L2 cache replaces the previous architectures' non-coherent memory access, in which the pixel engine wrote straight to the memory controller. This could be AMD's way of tackling the same problem, with the pixel engine's new-generation draw-stream binning rasterizer supposedly helping to conserve clock cycles whilst simultaneously improving on-die cache locality and reducing memory footprint.
David Kanter, of Real World Tech, has a pretty interesting YouTube video where he goes into some depth on NVIDIA's tile-based approach, which you can check out if you're interested.
Source: NVIDIA Devblogs, Real World Tech
NVIDIA Announces DX12 Gameworks Support
by R-T-B Wednesday, March 1st 2017 01:02 Discuss (23 Comments)
NVIDIA has announced DX12 support for their proprietary GameWorks SDK, including some new exclusive effects such as "Flex" and "Flow." Most interestingly, NVIDIA is claiming that simulation effects get a massive boost from Async Compute, nearly doubling performance on a GTX 1080 using that style of effects. Obviously, Async Compute is a DX12 exclusive technology. The performance gains in an area where NVIDIA is normally perceived not to do so well are indeed encouraging, even if only within their exclusive ecosystem. Whether GCN powered cards will see similar gains when running GameWorks titles remains to be seen.
NVIDIA Announces Public Ansel SDK, Developer Plugins
by R-T-B Wednesday, March 1st 2017 01:04 Discuss (7 Comments)
NVIDIA's Ansel, a framework for real-time screenshot filters and photographic effects, has seen the release of a public SDK and a few developer plugins to boot. Unreal Engine and Unity have both gained plugins for the technology, and it is reportedly coming to Amazon's Lumberyard engine as well. This should most assuredly aid adoption of the technology, as well as open it up to new markets where it was previously unavailable, such as indie game development. The public SDK is presently available for download directly from NVIDIA at developer.nvidia.com/ansel
Shadowplay Now Automagically Records Your Greatest Moments
by R-T-B Today, 01:00 Discuss (2 Comments)
NVIDIA has announced a new SDK for its products known as Shadowplay Highlights. Shadowplay Highlights augments the existing NVIDIA Shadowplay game recording technology to automatically capture hot moments in your favorite videogame. Whether it's your latest Triple Kill or a particularly daring jump on the race track, if the game engine tells the SDK it's significant, Shadowplay spins up, combining previously recorded gameplay with live recordings to create a perfect video of your glory moment. You can then edit the footage from within the game and directly upload it to a number of social networks.
The technology includes many options for quality or disk space saving, and anything in between. Of course, as with all things Shadowplay, the technology will require a GeForce branded graphics card and support from game developers as well. A video demonstrating the technology follows after the break.
NVIDIA Announces the GeForce GTX 1080 Ti Graphics Card at $699
by btarunr Today, 00:47 Discuss (86 Comments)
NVIDIA today unveiled the GeForce GTX 1080 Ti graphics card, its fastest consumer graphics card based on the "Pascal" GPU architecture, positioned to be more affordable than the flagship TITAN X Pascal at USD $699, with market availability from the first week of March 2017. Based on the same "GP102" silicon as the TITAN X Pascal, the GTX 1080 Ti is slightly cut down. While it features the same 3,584 CUDA cores as the TITAN X Pascal, the memory amount is lower, at 11 GB, over a slightly narrower 352-bit wide GDDR5X memory interface. This translates to 11 memory chips on the card. On the bright side, NVIDIA is using newer memory chips than the ones it deployed on the TITAN X Pascal, which run at 11 Gbps (GDDR5X effective), so memory bandwidth works out to 484 GB/s.
Besides the narrower 352-bit memory bus, the ROP count is lowered to 88 (from 96 on the TITAN X Pascal), while the TMU count is unchanged at 224. The GPU core is clocked at a boost frequency of up to 1.60 GHz, with the ability to overclock beyond the 2.00 GHz mark. It gets better: the GTX 1080 Ti features certain memory advancements not found on other "Pascal" based graphics cards - a newer memory chip and an optimized memory interface running at 11 Gbps. NVIDIA's Tiled Rendering Technology has also finally been announced publicly; a feature NVIDIA has kept quiet about since the GeForce "Maxwell" architecture, it is one of the secret sauces that enable NVIDIA's lead.
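The quoted bandwidth checks out with simple arithmetic: bus width in bits divided by 8 gives bytes per transfer, times the effective data rate in Gbps:

```python
def bandwidth_gbs(bus_bits, gbps):
    # Peak bandwidth in GB/s: bytes per transfer x effective rate.
    return bus_bits / 8 * gbps

print(bandwidth_gbs(352, 11))  # GTX 1080 Ti: 352-bit @ 11 Gbps -> 484.0 GB/s
print(bandwidth_gbs(384, 10))  # TITAN X Pascal: 384-bit @ 10 Gbps -> 480.0 GB/s
```

So despite the narrower bus, the faster 11 Gbps chips leave the GTX 1080 Ti slightly ahead of the TITAN X Pascal in raw bandwidth.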
NVIDIA's AIB Partners to Launch GTX 1080, 1060 With Faster GDDR5, GDDR5X Memory
by Raevenlord Today, 09:45 Discuss (19 Comments)
At their GDC event yesterday, NVIDIA announced a change to how partners can outfit their GTX 1080 and GTX 1060 6 GB models in regards to video memory. Due to process improvements and scaled-down costs, NVIDIA has decided to allow partners to purchase 11 Gbps GDDR5X (up from 10 Gbps) and 9 Gbps GDDR5 (up from 8 Gbps) memory from them, to pair with the GTX 1080 and GTX 1060 6 GB, respectively. These are to be sold by NVIDIA's AIB partners as overclocked cards, and don't represent a change to the official specifications of either graphics card. With this move, NVIDIA aims to give partners more flexibility in choosing memory speeds and carving out different models of the same graphics card with varying degrees of overclock, something that was particularly hard to do on conventional 10 Gbps-equipped GTX 1080s, which showed atypically low memory overclocking headroom.
All taken from TECHPOWERUP
Sponsored Sessions
Click below on each session to find out more:
Wednesday 3/1/2017
- 9:30-10:30 AM | Introduction to Deep Learning
- 11:00-12:00 PM | Zoom, Enhance, Synthesize! Magic Image Upscaling and Material Synthesis using Deep Learning
- 12:30-1:30 PM | Photogrammetry for Games; Art, Technology and Pipeline Integration for Amazing Worlds
- 3:30-4:30 PM | Watch Dogs 2 - PC Version Success Story with NVIDIA
- 5:00-6:00 PM | DirectX 12 Case Studies
Thursday 3/2/2017
- 10:00-11:00 AM | D3D Async Compute for Physics: Bullets, Bandages, and Blood
- 11:30-12:30 PM | Game Physics on the GPU with NVIDIA PhysX 3.4
- 2:00-2:50 PM | Take advantage of NVIDIA Ansel photo mode and GeForce platform features
- 3:00-3:30 PM | NVIDIA Aftermath: A new way of debugging crashes on the GPU
- 4:00-5:00 PM | NVIDIA GameWorks Animation Technologies in Unreal Engine 4
- 5:30-6:30 PM | Real-Time Rendering Advances from NVIDIA Research
Friday 3/3/2017
- 10:00-11:00 AM | NVIDIA Vulkan Update / Vulkan GPU Work Creation
- 11:30-12:00 PM | How to Stream Your Game to Millions with GeForce NOW
- 12:15-1:15 PM | Accelerating your VR Games with VRWorks
- 1:30-2:30 PM | The Witness on Android - Post Mortem
- 3:00-4:00 PM | VR Best Practices: Putting the fun in VR Funhouse