RTX 4090, 50-70% faster rasterization and 100% or more DLSS vs. 3090 Ti – NVIDIA Editor's Day – Part I
This article is going to be quite long (and will be updated throughout the day) and will cover the majority of what NVIDIA presented to the press at Editor's Day held at GTC 2022 on September 21, 2022.
The most interesting part for most users is the performance of the new top-of-the-range card, the NVIDIA GeForce RTX 4090, which will displace Ampere's flagship card, the RTX 3090 Ti.
This article includes information on the new architecture and the differences between the top three products to launch on Ada Lovelace, built on TSMC's new 4N manufacturing process, which NVIDIA chose for its video cards:
-NVIDIA GeForce RTX 4090
-NVIDIA GeForce RTX 4080 16GB
-NVIDIA GeForce RTX 4080 12GB
We want to add that it is important to wait for independent reviews to confirm the information NVIDIA disclosed during Editor's Day about its upcoming video cards.
Ada Lovelace, the new NVIDIA architecture for GeForce RTX 40 video cards
Ada Lovelace is NVIDIA's new architecture, with advances in the three types of core present in NVIDIA GeForce RTX video cards:
-Shader Core (traditional core used for rasterization)
-RT Core (specialized core to accelerate ray tracing)
-Tensor Core (specialized core for artificial intelligence inference, such as DLSS)
Ray tracing has grown more complex over the years. The first title to use ray tracing elements (Battlefield V) used 39 operations per pixel. For comparison, the new mode coming soon to Cyberpunk 2077, called Overdrive Mode, uses 635 ray tracing operations per pixel.
These numbers are averages.
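As a quick sanity check on these figures, the following Python sketch computes how much the per-pixel ray tracing workload has grown. The two operation counts are the ones quoted by NVIDIA; the function name is only illustrative.

```python
# Average ray tracing operations per pixel, as quoted by NVIDIA.
BATTLEFIELD_V_OPS = 39
CYBERPUNK_OVERDRIVE_OPS = 635


def growth_factor(old_ops: int, new_ops: int) -> float:
    """Return how many times larger new_ops is than old_ops."""
    return new_ops / old_ops


if __name__ == "__main__":
    factor = growth_factor(BATTLEFIELD_V_OPS, CYBERPUNK_OVERDRIVE_OPS)
    print(f"Overdrive Mode uses about {factor:.1f}x the operations per pixel")
```

In other words, the quoted workload is roughly sixteen times that of the first ray traced title.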
Cyberpunk 2077's Overdrive Mode will be the first to use the entire NVIDIA ray tracing library (SDK) available. The version currently available to end users has Ultra and Psycho modes, which use hybrid rendering: traditional rasterization combined with ray traced elements.
The new mode uses NVIDIA RTX Direct Illumination (RTXDI) and NVIDIA Real-Time Denoisers (NRD), which are responsible for rendering all the effects present in the game without the developer having to program each one individually.
To give an idea, the new Overdrive mode demands twice as many rays per pixel as the Psycho mode currently available to the public.
Describing the new Cyberpunk 2077 mode is relevant because it demonstrates the changes that Ada Lovelace brings to the end user:
-More shader/CUDA cores and improved execution.
-Improvements in Ray Tracing (new generation of RT Cores)
-DLSS 3.0 (new generation of Tensor Cores).
NVIDIA showed what appears to be the full Ada Lovelace AD102 chip to illustrate the progress of its architecture (a probable RTX 4090 Ti or the return of a TITAN RTX), with up to 144 SMs (streaming multiprocessors) compared to the 84 SMs of the RTX 3090 Ti. This increases the number of CUDA cores from 10752 to 18432, along with the number of RT and Tensor cores.
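The relationship between SM count and CUDA core count can be checked with a few lines of Python. The sketch assumes 128 CUDA cores per SM, which is the ratio implied by both pairs of figures quoted here; the function name is illustrative.

```python
# Assumption: 128 CUDA cores per SM, the ratio implied by the quoted figures
# (10752 / 84 and 18432 / 144 both equal 128).
CUDA_CORES_PER_SM = 128


def cuda_cores(sm_count: int) -> int:
    """Return the total number of CUDA cores for a given SM count."""
    return sm_count * CUDA_CORES_PER_SM


if __name__ == "__main__":
    ampere = cuda_cores(84)   # RTX 3090 Ti
    ada = cuda_cores(144)     # full AD102
    print(f"RTX 3090 Ti: {ampere} cores, full AD102: {ada} cores")
    print(f"Increase: {(ada / ampere - 1) * 100:.0f}% more CUDA cores")
```

Both results match the numbers NVIDIA presented, and the full chip carries about 71% more CUDA cores than the RTX 3090 Ti.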
Beyond the architectural advances, the higher transistor count is possible thanks to TSMC's new 4N manufacturing process, which raises the number of transistors from 28 billion (RTX 3090 Ti) to 76 billion (full Ada Lovelace chip).
The new manufacturing process has allowed NVIDIA to increase clock speeds from 1.9 GHz to 2.5 GHz while keeping the TGP (Total Graphics Power) at 450 W.
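To see what more cores and higher clocks mean together, here is a rough Python estimate of theoretical peak FP32 throughput. It is only a sketch under two assumptions that NVIDIA did not state in this form: each CUDA core performs one fused multiply-add (2 floating point operations) per clock, and the full AD102 core count is paired with the quoted 2.5 GHz clock. Real game performance will not scale linearly with this number.

```python
# Core counts and clocks quoted in the article.
RTX_3090_TI = {"cuda_cores": 10752, "clock_ghz": 1.9}
# Assumption: the full AD102 core count paired with the quoted 2.5 GHz clock.
FULL_AD102 = {"cuda_cores": 18432, "clock_ghz": 2.5}

# Assumption: one fused multiply-add (2 floating point operations)
# per CUDA core per clock.
FLOPS_PER_CORE_PER_CLOCK = 2


def peak_fp32_tflops(cuda_cores: int, clock_ghz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS."""
    return cuda_cores * FLOPS_PER_CORE_PER_CLOCK * clock_ghz / 1000.0


if __name__ == "__main__":
    old = peak_fp32_tflops(**RTX_3090_TI)
    new = peak_fp32_tflops(**FULL_AD102)
    print(f"RTX 3090 Ti: {old:.1f} TFLOPS, full AD102: {new:.1f} TFLOPS")
    print(f"Theoretical ratio: {new / old:.2f}x")
```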
Innovations in RTX (new generation of RT and Tensor cores)
The changes to RTX (the sum of technologies related to ray tracing and artificial intelligence inference) come as new technologies added to the GeForce RTX 40 video cards based on Ada Lovelace:
-Shader Execution Reordering
-Displaced Micro-Meshes
-Opacity Micro-Maps
-FP8 Inferencing
-Optical Flow Accelerator
-DLSS 3.0
Shader Execution Reordering fixes inefficiencies that occur in the GPU pipeline when rendering games with ray tracing (a problem that does not occur in rasterization-only games), improving performance by 20 to 40% in NVIDIA's tests compared to not using the technology.
On Ampere (RTX 30 series), the geometry structure used in ray tracing (the BVH) had to hold the complete information (every triangle), which consumed a lot of resources in each scene. A new technique called Displaced Micro-Meshes has therefore been added to the GPU pipeline (specifically, to Ada Lovelace's RT core).
With the second generation of RT cores, an object/surface must carry all of its complex triangle information, since the RT core cannot interpret that information in a simplified form, which creates overhead.
The third-generation RT cores can trace rays against geometric objects described with simplified information, increasing efficiency and eliminating the previous overhead.
In short, less information is required, so it occupies less space, and an object is rendered faster with ray tracing, giving a higher framerate.
Innovations in ADA RTX
This technology is not limited to ray tracing, as it also applies to fully rasterized games. It will be up to each developer to implement it, and it will be available to developers in different programs.
Opacity Micro-Maps simplify communication between the SMs and the RT Cores: the new generation knows how to interpret alpha masks during ray tracing, which makes it easier to render a scene in real time without having to go back to the SM, increasing performance.
With this new option, developers will have an easier time building complex scenes, because previously the inability to interpret lighting together with textures was a challenge and hurt video card performance. Scenes with smoke particles will greatly benefit from this new technology.
Fourth generation of Tensor cores (Optical Flow Accelerator and DLSS 3)
The fourth generation of Tensor cores brings the new 8-bit floating point transformer engine (FP8 Transformer Engine) and, with it, FP8 inference, adopting the new FP8 format that is expected to become the industry standard for ML (machine learning) and AI.
The end result is DLSS 3.0, a framerate improvement that NVIDIA promises to be substantial over DLSS 2.0, as it adds new elements to improve performance.
One of the key components is AI frame generation (full frames generated by artificial intelligence, in addition to AI super resolution). Ada Lovelace brings a new hardware unit called the Optical Flow Accelerator, which helps speed up the whole process.
NVIDIA's official name for this new option inside DLSS 3.0 to improve the framerate is DLSS Frame Generation.
DLSS Frame Generation alternates traditionally rendered frames with fully generated frames. The benefits of this technology (which comes with challenges) are smoother animation and the ability to bypass a bottleneck where one is the limiting factor. Introducing generated frames can cause several problems, since the image changes constantly from frame to frame.
The solution to these problems is the previously mentioned hardware unit, the Optical Flow Accelerator, which works with the Tensor cores and analyzes the changes from frame to frame (which pixels differ between two frames).
This, in combination with motion vector information from the game engine, is fed to the AI network, and the Tensor cores generate the alternate frame in its entirety (DLSS Frame Generation).
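The alternation between rendered and generated frames can be pictured with a small Python sketch. It is a conceptual model only, not NVIDIA's implementation or API: frames are placeholder strings rather than images, the function name is hypothetical, and it assumes each generated frame is placed between two consecutive rendered frames.

```python
from typing import List


def interleave_generated_frames(rendered: List[str]) -> List[str]:
    """Insert one generated frame between each pair of rendered frames.

    Assumption: a generated frame is derived from the two rendered frames
    around it. In the real pipeline that step would use optical flow and
    game motion vectors; here it is just a labeled placeholder string.
    """
    presented: List[str] = []
    for previous, current in zip(rendered, rendered[1:]):
        presented.append(previous)
        presented.append(f"generated({previous},{current})")
    if rendered:
        presented.append(rendered[-1])
    return presented


if __name__ == "__main__":
    for frame in interleave_generated_frames(["R0", "R1", "R2"]):
        print(frame)
```

For n rendered frames the sketch presents 2n - 1 frames, which is why the displayed framerate can approach twice the rendered framerate even when the CPU cannot prepare frames any faster.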
Taken together, DLSS 3.0 also improves performance in scenarios where the processor is the limiting factor. NVIDIA demonstrated this in Microsoft Flight Simulator, using DLSS 3.0 in Performance mode with DLSS Frame Generation.
Summary and extra information about DLSS 3.0
The new DLSS Frame Generation option (which adds more FPS on top of traditional super resolution) will only be available on the new NVIDIA GeForce RTX 40 video cards. The diagram shows which options are supported by each previous generation of NVIDIA video cards. The company promised that AI upscaling will keep improving and that older generation video cards will benefit.
Hardware level support of options within DLSS 3.0
As additional information about DLSS 3.0 support on older RTX cards (specifically DLSS Frame Generation), the NVIDIA representative neither confirmed nor denied that it could be implemented in the future, although he did stress that it would be a big challenge due to the lack of hardware (the Optical Flow Accelerator). In that hypothetical case, the improvements would not be on par with video cards based on Ada Lovelace.
In addition, NVIDIA Reflex will be a requirement to reduce total system latency, and it will have to be implemented by the game developer.
For developers, migrating from DLSS 2.0 to 3.0 should be relatively easy, as long as they add Reflex markers to their game (in case it does not have them already).
More than 35 announced games will adopt DLSS 3
Finally, the following graph shows the performance of current games using DLSS 3. The benefit of DLSS 3 will be greater in games with more ray tracing effects (like the upcoming Cyberpunk 2077 Overdrive mode) because of DLSS Frame Generation.
NVIDIA GeForce RTX 4090, 4080 16GB and 12GB Performance vs RTX 3090 Ti
NVIDIA offered a quick look at what the new video cards based on the Ada Lovelace architecture will deliver. Among the most radical changes are new technologies that leverage the hardware changes in the RT Cores and Tensor Cores, used for ray tracing and DLSS (super resolution using artificial intelligence) respectively.
I have to tell you to always take company performance numbers with a grain of salt and wait for official media reviews to validate the claims, but let's get started.
According to the video conference, in games like Assassin's Creed: Valhalla and The Division 2, which do not have DLSS (rasterization only), NVIDIA measured the GeForce RTX 4090 to be 50-70% faster than the RTX 3090 Ti.
It should be noted that, during the presentation, NVIDIA emphasized that the RTX 4080 12GB will perform similarly to the GeForce RTX 3090 Ti (we assume in rasterization and some other scenarios). The 16GB version will offer more performance (more on that later), and at the top is the RTX 4090, well above the rest.
Warhammer 40,000: Darktide was the first title used to show DLSS 3.0, which will be one of the big improvements with Ada Lovelace. The RTX 4090 offers 100% more performance than the GeForce RTX 3090 Ti (using DLSS 2.0) in this particular title, one of the generational changes offered by NVIDIA's new Ada Lovelace architecture.
Another interesting title in which DLSS 3.0 was demonstrated is Microsoft Flight Simulator. It is hard to reach high FPS because the game is CPU-limited, but with the new features of DLSS 3.0 the bottleneck in this kind of title is gone (if not completely, at least partially).
The GeForce RTX 4080 12GB, RTX 4080 16GB and RTX 4090 offer up to 2x the FPS using DLSS 3.0 compared to the RTX 3090 Ti.
About content creation
Among the most relevant parts of yesterday's presentation were the performance improvements and new tools (as well as updates) coming for content creators. Those using Arnold in Maya will get better performance with the RTX 4080 12GB than with the RTX 3090 Ti, and up to twice the performance with the RTX 4090.
The same happens with V-Ray and Octane rendering, with improvements of up to 100%, although the company did not indicate for which task (we assume shorter render times).
Content creation will be one of the strengths of the new NVIDIA GeForce RTX 40 series, something we will discuss further, together with DLSS 3.0, in the second part.
Future Games – More Ray Tracing and DLSS 3.0 Leverage
One thing on display during the day was NVIDIA's emphasis on pushing game developers toward new heights with real-time ray tracing in video games. When ray tracing was introduced with the GeForce RTX 20 series, many said that it did not work, but the company's continued push to bring greater fidelity to users is what has led game developers to adopt this new standard in AAA games.
The original criticism of the company was valid: with Turing it showed off features (ray tracing) that were not yet available to the public. It has since learned from its mistakes.
In the near and not so near future, the bar will be raised and games with heavier ray tracing elements will arrive. NVIDIA calls this next-generation gaming, using ray tracing and DLSS 3.0.
To demonstrate this, two demos were shown:
-Portal with RTX (based on the popular Valve game, coming soon as DLC)
-Racer RTX (demo with the new options that game developers can use in their productions).
All this is possible thanks to the improvements in the Ada Lovelace architecture and the ray tracing performance it offers when combined with DLSS 3.0, compared to previous generations.
The first title, already quite far along, is the new Cyberpunk 2077 mode, called Cyberpunk 2077 RT Overdrive. Compared to what is currently available, this update offers more ray tracing effects.
Using DLSS 3.0 (DLSS Performance), there is up to a 4x improvement over the GeForce RTX 3090 Ti (also using DLSS Performance) in the new Cyberpunk mode.
NVIDIA GeForce RTX 40 series pricing and launch
To reiterate information that has been public since NVIDIA CEO Jensen Huang's presentation, here are more details on the new Ada Lovelace RTX 40 series video cards.
Founders Edition only on RTX 4090 and RTX 4080 16GB
This time, NVIDIA will sell the GeForce RTX 4090 24GB and GeForce RTX 4080 16GB directly to end users as Founders Edition cards. The GeForce RTX 4080 12GB will only be offered through authorized partners, something AIBs in the US welcome, since the Founders Edition competed directly with them at a lower price.
I repeat, the exclusivity of the Founders Edition will only be present in these two models (RTX 4090 24GB and 4080 16GB).
The NVIDIA GeForce RTX 4090 24GB will start at US$1599 and will launch on October 12. The GeForce RTX 4080 16GB will also be available as a Founders Edition, starting at US$1199. Finally, the GeForce RTX 4080 12GB will start at US$899, exclusively on graphics cards from partners (AIBs).
All prices listed are MSRP US$ in the United States.
The two versions of the GeForce RTX 4080 video card will be available from November.
Bonus: Q&A Session with NVIDIA CEO Jensen Huang
There was a question and answer (Q&A) session with NVIDIA CEO Jensen Huang related to GTC 2022 (all the announcements on gaming, AI, etc.). We did not ask a question (this was our first time in a Q&A directly with the CEO), but the executive editor of PCWorld.com, Gordon Mah Ung, asked a relevant question about GeForce RTX 40 series pricing and the public feedback that the cards "feel" more expensive.
Jensen Huang, CEO of NVIDIA
The NVIDIA CEO's response came down to two points:
-The cost of wafers has gone up (this has been previously documented in the industry).
-If one scales the performance of the RTX 4080 12GB to a price of US$699 (the MSRP of the RTX 3080), one will find better performance for the money in the new RTX 4080 (12GB).
I hope I did not misunderstand his answer (or remember it incorrectly), but we do not have the replay to double check.
Still, the second part of the answer can be examined from several angles, since gaming video cards currently have several uses:
-Full rasterization.
-Ray tracing and DLSS.
-Content creation (from light use to demanding professional work).
This will be properly analyzed once the reviews of the new video cards come out of embargo, with an analysis of cost per frame and of performance in the different uses a user may have.
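Until reviews arrive, the cost-per-frame comparison can at least be set up with the announced prices. The Python sketch below uses only the MSRPs quoted in this article; it contains no performance measurements, and the function names are illustrative. It computes how much faster a more expensive card has to be just to match a cheaper card's cost per frame.

```python
# Launch MSRPs in US$ quoted in this article.
MSRP = {
    "RTX 4090": 1599,
    "RTX 4080 16GB": 1199,
    "RTX 4080 12GB": 899,
    "RTX 3080": 699,
}


def cost_per_frame(price_usd: float, fps: float) -> float:
    """Price divided by average FPS (FPS must come from a real review)."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return price_usd / fps


def break_even_speedup(price_new: float, price_old: float) -> float:
    """Speedup the pricier card needs to match the cheaper card's cost per frame."""
    return price_new / price_old


if __name__ == "__main__":
    baseline = "RTX 3080"
    for card in ("RTX 4080 12GB", "RTX 4080 16GB", "RTX 4090"):
        needed = break_even_speedup(MSRP[card], MSRP[baseline])
        print(f"{card} must be {needed:.2f}x faster than the {baseline} to break even")
```

At MSRP, for example, the RTX 4080 12GB would need to be about 1.29x faster than the RTX 3080 in a given workload before it offers a lower cost per frame.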
The article will be updated later in the day, and we will talk more about DLSS 3.0 and content creation in part two: what DLSS 3.0 offers, why it is limited to Ada Lovelace, and other details behind this controversial issue.