NVIDIA GeForce RTX 5090 Founders Edition Review - Top of the Line Blackwell
The embargo on the NVIDIA GeForce RTX 5090 is finally lifted, and I have the Founders Edition in my hands, accompanied by its new eco-friendly packaging. Due to time constraints (less than 24 hours to test the product), I will leave the video format I had planned for this launch pending, but there is always the old reliable website, xanxogaming.com. This review will be a bit different than I would have liked (again, less than 24 hours), so I will try to cover as much as possible. In short, the structure will be:
- Raster test (RTX 5090 vs. eight GeForce RTX graphics cards in 14 games)
- DLSS 4 (CNN Model versus Transformers)
- Multi Frame Generation (Test on Alan Wake 2 and Cyberpunk 2077)
- Consumption (mamma mia!)
There will be more details to come, but I'll start with a quick analysis.
Seen at a "macro" level, and before doing any testing, the changes in the NVIDIA GeForce RTX 5090 (GB202) are the following:
- 25% price increase (launch MSRP).
- 33% more VRAM compared to the previous generation (32 GB GDDR7 vs. 24 GB GDDR6X).
- Upgrade to PCIe Gen 5.0.
- Increase in TGP (consumption) of 27.77% (575 W vs. 450 W).
- 20.83% more transistors (92.2 billion vs. 76.3 billion).
- GPU chip size 23.25% larger than AD102.
- The same manufacturing process is maintained (TSMC 4 nm 4N NVIDIA Custom Process).
- Despite being the same “process node”, there are changes in the architecture (SM, RT cores, Tensor cores, etc).
I'll start with the technical information and introduction of the new architecture, courtesy of Luis Padilla. Those who wish to see what was presented during the NVIDIA Editor's Day can check the following link.
Table of Contents
RTX 5090 Blackwell architecture
For NVIDIA there is no turning back: it has decided that the new Blackwell architecture for graphics cards will employ more and better techniques and tools based on artificial intelligence, neural network processing, and generative AI engines. The green company seeks to further leverage its expertise in enterprise AI, without neglecting the classic improvements in base specifications, energy efficiency, support for new codecs and video outputs, a new type of GDDR7 memory, and more.
In any case, the star of Blackwell is the Deep Learning Super Sampling technology, popularly known as DLSS, which in its fourth generation promises up to twice as many frames per second as DLSS 3 or 3.5. There will also be a deeper look at "neural shaders," which work with AI models trained by developers to generate "approximate" images even faster than previous-generation ray tracing. We also have an update to the DLSS Ray Reconstruction technique, which reduces the number of rays needed to generate ray-traced lighting, plus another group of techniques that we will see later.
Let's start with the basics, the specifications of the graphics cards that we will show below.
Graphic card | GeForce RTX 3090 | GeForce RTX 4090 | GeForce RTX 5090 |
---|---|---|---|
GPU code name | GA102 | AD102 | GB202 |
GPU Architecture | NVIDIA Ampere | NVIDIA Ada Lovelace | NVIDIA Blackwell |
GPCs | 7 | 11 | 11 |
TPCs | 41 | 64 | 85 |
SMs | 82 | 128 | 170 |
CUDA/SM Cores | 128 | 128 | 128 |
CUDA / GPU Cores | 10496 | 16384 | 21760 |
Tensor Cores / SM | 4 (3rd Gen) | 4 (4th Gen) | 4 (5th Gen) |
Tensor Cores / GPU | 328 (3rd Gen) | 512 (4th Gen) | 680 (5th Gen) |
RT cores | 82 (2nd Gen) | 128 (3rd Gen) | 170 (4th Gen) |
GPU Boost Clock Speed (MHz) | 1695 | 2520 | 2407 |
FP32 TFLOPS Peak (non-Tensor)^{1} | 35.6 | 82.6 | 104.8 |
FP16 TFLOPS Peak (non-Tensor)^{1} | 35.6 | 82.6 | 104.8 |
BF16 TFLOPS Peak (non-Tensor)^{1} | 35.6 | 82.6 | 104.8 |
INT32 TOPS Peak (non-Tensor)^{1} | 17.8 | 41.3 | 104.8 |
RT-TFLOPS | 69.5 | 191 | 317.5 |
FP4 Peak TFLOPS Tensor with accumulated FP32 (FP4 AI TOPS) | N/A | N/A | 1676/3352^{2} |
FP8 Tensor TFLOPS Peak with accumulated FP16^{1} | N/A | 660.6/1321.2^{2} | 838/1676^{2} |
FP8 Tensor TFLOPS Peak with accumulated FP32^{1} | N/A | 330.3/660.6^{2} | 419/838^{2} |
FP16 Tensor Peak TFLOPS with accumulated FP16^{1} | 142.3/284.6^{2} | 330.3/660.6^{2} | 419/838^{2} |
FP16 Tensor TFLOPS Peak with accumulated FP32^{1} | 71.2/142.4^{2} | 165.2/330.4^{2} | 209.5/419^{2} |
BF16 Tensor TFLOPS Peak with accumulated FP32^{1} | 71.2/142.4^{2} | 165.2/330.4^{2} | 209.5/419^{2} |
TF32 Tensor TFLOPS Peak^{1} | 35.6/71.2^{2} | 82.6/165.2^{2} | 104.8/209.5^{2} |
INT8 Tensor TOPS Peak^{1} | 284.7/569.4^{2} | 660.6/1321.2^{2} | 838/1676^{2} |
Frame buffer memory type and size | 24 GB GDDR6X | 24 GB GDDR6X | 32 GB GDDR7 |
Memory interface | 384-bit | 384-bit | 512-bit |
Memory Clock (Data Rate) | 19.5 Gbps | 21 Gbps | 28 Gbps |
Memory bandwidth | 936 GB / sec | 1008 GB / sec | 1792 GB / sec |
ROPs | 112 | 176 | 176 |
Pixel fill rate (Gigapixels/sec) | 189.8 | 443.5 | 423.6 |
Texture units | 328 | 512 | 680 |
Texel fill rate (Gigatexels/sec) | 555.96 | 1290.2 | 1636.8 |
L1 Data Cache/Shared Memory | 10496 KB | 16384 KB | 21760 KB |
L2 data cache | 6144 KB | 73728 KB | 98304 KB |
Register file size | 20992 KB | 32768 KB | 43520 KB |
Video Engines | 1 x NVENC (7th Gen) 1 x NVDEC (5th Gen) | 2 x NVENC (8th Gen) 1 x NVDEC (5th Gen) | 3 x NVENC (9th Gen) 2 x NVDEC (6th Gen) |
TGP (Total Graphics Power) | 350 W | 450 W | 575 W |
Number of transistors | 28.3 Billion | 76.3 Billion | 92.2 Billion |
Die size | 628.4 mm^{2} | 608.5 mm^{2} | 750 mm^{2} |
Fabrication process | Samsung 8 nm 8N NVIDIA custom process | TSMC 4 nm 4N NVIDIA custom process | TSMC 4 nm 4N NVIDIA custom process |
PCI Express Interface | Gen 4 | Gen 4 | Gen 5 |
The GeForce RTX 5090 uses the GB202 GPU. The full GB202 includes 12 graphics processing clusters (GPCs), 96 texture processing clusters (TPCs), 192 streaming multiprocessors (SMs), and a 512-bit memory interface with sixteen 32-bit memory controllers. Each of those 12 GPCs contains a rasterization engine, 8 texture processing clusters, 16 streaming multiprocessors, and 16 raster operation units; the RTX 5090 ships with a slightly cut-down configuration (11 GPCs and 170 SMs, as shown in the table above).
In total, the GB202 GPU has 128 MB of L2 cache, although the RTX 5090 is configured with 96 MB. This hardware will be used to drive the architecture's "neural rendering" approach. The main attraction will be the use of DLSS 4, with Multi Frame Generation and lower latency through improved RTX techniques (SLR 2) and AI-based image generation or transformation.
Finally, each Blackwell streaming multiprocessor (SM) includes 128 CUDA cores, one fourth-generation RT core, four fifth-generation Tensor cores, 4 texture units, a 256 KB register file, and 128 KB of shared memory/L1 cache. The FP32 and INT32 cores have also been unified, allowing each core to perform either of the two operations when needed. Additionally, the number of texture units has increased from 512 on the RTX 4090 to 680 on the RTX 5090, and the bilinear texel rate has risen from 1290.2 Gigatexels per second on the 4090 to 1636.8 Gigatexels per second on the 5090.
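As a sanity check, the peak-rate rows in the spec table follow directly from the unit counts and the boost clock. Below is a minimal sketch using the RTX 5090 numbers from the table and assuming the usual 2 floating-point operations (one FMA) per CUDA core per clock; it is illustrative arithmetic, not a measurement.

```python
# Rough check of the peak-rate figures in the spec table (RTX 5090 column).
cuda_cores = 21760   # CUDA cores per GPU
boost_ghz = 2.407    # GPU Boost clock in GHz
tex_units = 680      # texture units
rops = 176           # raster operation units

fp32_tflops = cuda_cores * 2 * boost_ghz / 1000   # 2 FLOPs per core per clock (FMA)
texel_rate = tex_units * boost_ghz                # bilinear Gigatexels per second
pixel_rate = rops * boost_ghz                     # Gigapixels per second

print(f"Peak FP32: {fp32_tflops:.1f} TFLOPS")     # ~104.8
print(f"Texel fill rate: {texel_rate:.1f} GT/s")  # ~1636.8
print(f"Pixel fill rate: {pixel_rate:.1f} GP/s")  # ~423.6
```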
One of the most striking points is, without a doubt, the debut of GDDR7 graphics memory in NVIDIA gaming GPUs. After two generations of cards with GDDR6X, NVIDIA will be launching this generation of memory with no less than 32GB for the RTX 5090, 16GB for the RTX 5080 and RTX 5070 Ti, and 12GB for the RTX 5070.
Key improvements to GDDR7 include PAM3 signaling technology that promises improved signal-to-noise ratio and doubles the density of independent channels, resulting in greater memory bandwidth and improved energy efficiency.
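For context, the bandwidth figures in the spec table are simply bus width multiplied by per-pin data rate. A minimal sketch of that arithmetic:

```python
# Memory bandwidth = bus width (bits) x data rate (Gbps per pin) / 8 bits per byte
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    return bus_width_bits * data_rate_gbps / 8

print(bandwidth_gb_s(512, 28))  # RTX 5090, GDDR7:  1792.0 GB/s
print(bandwidth_gb_s(384, 21))  # RTX 4090, GDDR6X: 1008.0 GB/s
```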
Fifth-generation Tensor Cores now support the FP4 data format, which requires less than half the memory thanks to a compression method with virtually zero loss in quality compared to FP16. For example, an RTX 5090 with FP4 can generate images in less than five seconds using a FLUX.dev model, which would take an RTX 4090 with FP16 around 15 seconds.
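To put the FP4 claim in perspective, here is an illustrative estimate of the weight footprint of a model in the FLUX.dev class, assuming roughly 12 billion parameters (an assumption made for this sketch, not a figure from the source); real VRAM usage also includes activations, text encoders, and the VAE.

```python
# Illustrative only: approximate weight footprint of a ~12B-parameter image model
# at different precisions. The parameter count is an assumption for this sketch.
params = 12e9
for fmt, bytes_per_param in [("FP16", 2.0), ("FP8", 1.0), ("FP4", 0.5)]:
    print(f"{fmt}: ~{params * bytes_per_param / 1e9:.0f} GB of weights")
# FP16 ~24 GB, FP8 ~12 GB, FP4 ~6 GB: FP4 cuts weight memory to a quarter of FP16
```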
Blackwell's fourth-generation RT cores double the throughput of ray/triangle intersection testing, a high-frequency operation when generating ray-traced images, so ray-traced frames are expected to be generated faster. Also included in this generation is an opacity micromap engine, which helps reduce the amount of computation spent in alpha shaders. Other dedicated ray tracing technologies include Mega Geometry, a triangle cluster intersection engine, and linear swept spheres for rendering fine geometry such as hair.
“Mega geometry” aims to increase geometric detail in applications that employ ray tracing. In particular, for graphics engines such as Unreal Engine 5 that employ more advanced level-of-detail systems, it allows the use of ray tracing with greater fidelity, improving the quality of shadows, reflections and indirect lighting.
Mega Geometry is expected to be available across the DirectX 12, Vulkan, and OptiX 9.0 APIs, and will be supported on all RTX graphics cards from the Turing architecture (the RTX 20 generation) onward.
On the other hand, we have Shader Execution Reordering (SER) 2.0. This technology reorganizes the GPU's parallel threads for greater processing efficiency. It helps lighten ray tracing workloads such as divergent memory accesses and path tracing, and routes work to the Tensor or shader cores. Several games and APIs already take advantage of this technique in ray tracing, and the new version promises better results.
The AI Management Processor (AMP) is a context-programmable scheduler on the GPU that improves process scheduling in Windows, reducing the contextual load on the GPU. Using a RISC-V processor, AMP works with the Windows architecture to reduce latency and improve memory management, reducing CPU load for task scheduling and helping to reduce bottlenecks, while improving frame rates and multitasking in Windows.
And on the graphics rendering front, we finally come to DLSS 4, the crown jewel of Blackwell. NVIDIA promises multi-frame generation with improved performance and lower memory usage than previous versions of the technology, as well as improvements to the existing DLSS techniques: frame generation, ray reconstruction, super resolution, and deep learning anti-aliasing.
Combining hardware, architecture, and software improvements, NVIDIA promises 40% faster frame rates than DLSS 3, using 30% less video memory, with a model that only needs to run once per frame. Optical-flow frame generation is now AI-driven rather than relying on dedicated hardware, which also reduces the cost of frame generation and integration. A Flip Metering system shifts frame-pacing logic to the display engine, allowing the GPU to improve the accuracy of display timing.
DLSS Super Resolution (SR) boosts performance by using AI to produce higher resolution frames from a lower resolution input. DLSS samples multiple low-resolution images and uses motion data and feedback from previous frames to construct high-quality images. The final product of the transformer model is more stable, with less ghosting, more motion image detail, and improved anti-aliasing compared to previous versions of DLSS.
Ray Reconstruction (RR) improves image quality by using AI to generate additional pixels in ray-traced intensive scenes. DLSS replaces manual denoisers with an AI network trained on NVIDIA supercomputers that generates higher quality pixels between sampled rays. In ray-traced intensive content, the transformer model for RR further improves quality, especially in scenes with challenging lighting. In fact, all common artifacts of typical denoisers are significantly reduced.
Deep Learning Anti-Aliasing (DLAA) provides improved image quality using an AI-based anti-aliasing technique. DLAA uses the same super-resolution technology developed for DLSS, constructing a more realistic, high-quality image at native resolution. The result provides greater temporal stability, motion details, and smoother edges in a scene.
Neural shaders are a technology NVIDIA is introducing with Blackwell, unifying traditional shaders with AI in parts of the rendering process, partially at first and, presumably, fully in the future. Tensor cores are now accessible to graphics shaders and, combined with the scheduling optimizations in SER 2.0 (Shader Execution Reordering), AI graphics with neural filtering capabilities and AI models, including generative AI, can run simultaneously in next-generation games.
Neural shaders allow us to train neural networks to learn efficient approximations of complex algorithms that calculate how light interacts with surfaces, decompress super-compressed video, predict indirect illumination from limited real data, and approximate subsurface light scattering. The potential applications of neural shaders are still largely unexplored, meaning that new applications may be found.
Among the other integrated techniques is "RTX Neural Materials". AI is used to replace the original mathematical model of a material or texture with an approximation, promising "cinema-quality" frames at gaming-grade speeds while using less video memory and fewer graphics card resources.
Another technique is RTX Neural Texture Compression, which leverages neural networks accessed through neural shaders to compress and decompress material textures more efficiently than traditional methods. Then there is the so-called "Neural Radiance Cache" (NRC), which uses a neural shader to cache and approximate radiance information. By training a small neural network, complex lighting information can be stored and used to create high-quality global illumination and dynamic lighting effects in real time.
Then we have RTX Skin, a technique NVIDIA developed based on the subsurface scattering used in film rendering. It allows light passing through skin and other surfaces that are not entirely solid to be rendered subtly or intensely, depending on the requirements of the game, using ray tracing.
And RTX Neural Faces creates a rasterized face that is overlaid with a rough 3D layer using a generative AI model, which can infer natural-looking face results.
For video and encoding features, there’s support for chroma-sampled 4:2:2 video, which has lower data requirements than the 4:4:4 standard, making it suitable for generating HDR content. The ninth-generation NVENC encoder also supports higher-quality AV1 and HEVC video. The RTX 5090 graphics card supports up to three encoders and two decoders. A sixth-generation NVDEC hardware decoder is also available.
Finally, Blackwell supports DisplayPort 2.1b with up to 80 Gbps of bandwidth. This allows refresh rates of up to 165 Hz at 8K resolution and no less than 420 Hz at 4K.
Photos - NVIDIA GeForce RTX 5090 Founders Edition
photography by Istav Nile for XanxoGaming
Synthetic benchmarks, productivity and gaming (1080p, 1440p, 2160p)
With the release and announcement of the next generation of graphics cards, we have to update our benchmark suite (once again). The best gaming processor currently available is the AMD Ryzen 7 9800X3D.
A statistics reminder...
AVG FPS (Average Frames Per Second):
This is the average number of frames per second during a benchmark. It represents the overall performance of the graphics card and shows how smooth a game will be on average.
- Importance: It allows you to compare overall performance between cards, but does not reflect possible drops or instabilities.
1% Percentile:
Measures the average of the lowest FPS (the worst 1% of frames). It indicates performance drops and overall stability.
- Importance: It reveals how consistent the experience is. A low 1% Percentile implies potential stuttering, even if the average is high.
Relationship:
The AVG FPS shows overall speed, while the 1% Percentile reflects fluidity. Together, they offer a complete performance assessment.
The new tests are measured with MsBetweenDisplay.
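To make both metrics concrete, here is a minimal sketch of how average FPS and the 1% percentile can be derived from a list of per-frame display times (the kind of data MsBetweenDisplay provides). This is illustrative; the exact capture and analysis tooling used for the review may differ.

```python
import numpy as np

def summarize(frametimes_ms: np.ndarray) -> tuple[float, float]:
    """Return (average FPS, 1% low FPS) from per-frame display times in milliseconds."""
    avg_fps = 1000.0 / frametimes_ms.mean()
    worst_1pct = np.sort(frametimes_ms)[-max(1, len(frametimes_ms) // 100):]
    p1_fps = 1000.0 / worst_1pct.mean()            # average of the worst 1% of frames
    return avg_fps, p1_fps

# Hypothetical capture: mostly ~8.3 ms frames (~120 FPS) with a few 20 ms spikes
ft = np.array([8.3] * 990 + [20.0] * 10)
avg, p1 = summarize(ft)
print(f"AVG {avg:.0f} FPS, 1% low {p1:.0f} FPS")   # a high average can hide the spikes
```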
Benchmarks (GPU Benchmarks – 2025)
The revamped test bench features the best processor I have on hand, the AMD Ryzen 7 9800X3D. We use this processor because it introduces the smallest bottleneck for the tested GPUs in scenarios where the limiter may be the CPU (Video Link).
The focus is on extracting 100% of the performance of the card under review, the NVIDIA GeForce RTX 5090 Founders Edition. However, there are scenarios where the GPU will not scale any further at lower resolutions (1080p and even 1440p).
I'm using Windows 11 24H2, but we've disabled VBS (Virtualization-Based Security), as it takes away considerable performance in certain scenarios or causes stuttering.
CPU: AMD Ryzen 7 9800X3D (https://amzn.to/4h5d7eR)
Board: ROG STRIX B650E-E GAMING WIFI (BIOS 3057) (https://amzn.to/4abMKAY)
RAM: CORSAIR VENGEANCE RGB DDR5 RAM 32GB (2x16GB) 6000MHz CL30 AMD EXPO Intel XMP (https://amzn.to/404P6gk)
Graphics card (under test): NVIDIA GeForce RTX 5090 Founders Edition (Link: https://amzn.to/3Pe4Vx6)
Operating system: Windows 11 Home Edition 24H2 – VBS OFF
Liquid cooling: DeepCool Mystique 360
SSD: FN970 1TB M.2 2280 PCIe Gen4 x4 NVMe 1.4 (https://amzn.to/3PuXPn8)
Driver: NVIDIA Press Driver
Power supply: NZXT C1200 ATX 3.1 (https://amzn.to/3ChugT4)
3DMark Time Spy Extreme
3DMark Speed Way
Vray Benchmark 6 (CUDA) - GPU
Vray Benchmark 6 (RTX)
Blender
AI - MLPerf Client 0.5 - Inference Test
MLPerf is a set of tests created by MLCommons, a consortium that includes experts from Harvard, Stanford, NVIDIA, and Google, among others. These tests evaluate the performance of advanced GPUs, and now, with MLPerf Client v0.5 for Windows, users can measure how well their PCs and laptops run generative language models (LLMs), with inference in INT4.
LLMs are fundamental to generative artificial intelligence, but evaluating their performance across different systems can be tricky. MLPerf Client simplifies this by producing clear, comparable results, helping users understand how popular models perform in real-world tasks such as:
- Content generation
- Creative writing
- Light summarization
- Moderate summarization
The benchmark uses Meta's Llama 2 7B model, known for its accessibility and similarity to modern architectures. In addition, it takes advantage of technologies such as ONNX Runtime GenAI and the DirectML execution provider to run models on a wide range of hardware.
The tests produce two key metrics: the average time to generate the first token (lower is better), measured in seconds (s), and the average generation rate for the following tokens (higher is better), measured in tokens per second (tok/s). These metrics provide a clear view of your system's performance with generative AI; the results are summarized in the table below, with a short sketch of the percentage calculation after it.
MLPerf Client 0.5
Test | Metric | RTX 5090 | RTX 4090 | Change (RTX 5090 vs. RTX 4090) |
---|---|---|---|---|
Total | Average time to first token (s) | 0.084 | 0.109 | -22.94% |
Total | Average token generation rate (tok/s) | 243.27 | 177.02 | +37.43% |
Content generation | Average time to first token (s) | 0.052 | 0.068 | -23.53% |
Content generation | Average token generation rate (tok/s) | 258.1 | 190.37 | +35.58% |
Creative writing | Average time to first token (s) | 0.078 | 0.094 | -17.02% |
Creative writing | Average token generation rate (tok/s) | 246.96 | 179.94 | +37.25% |
Light summarization | Average time to first token (s) | 0.098 | 0.124 | -20.97% |
Light summarization | Average token generation rate (tok/s) | 240.9 | 174.6 | +37.97% |
Moderate summarization | Average time to first token (s) | 0.127 | 0.178 | -28.65% |
Moderate summarization | Average token generation rate (tok/s) | 228.09 | 164.2 | +38.91% |
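The change column is simply the RTX 5090 result expressed relative to the RTX 4090 baseline; a minimal sketch using the "Total" row:

```python
def pct_change(rtx5090: float, rtx4090: float) -> float:
    """Change of the RTX 5090 relative to the RTX 4090 baseline, in percent."""
    return (rtx5090 - rtx4090) / rtx4090 * 100

# 'Total' row: time to first token is lower-is-better, so a negative change is a win
print(round(pct_change(0.084, 0.109), 2))    # -22.94 (faster first token)
print(round(pct_change(243.27, 177.02), 2))  #  37.43 (higher tokens per second)
```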
Gaming – Rasterization
All tests are done at the highest quality available, unless otherwise specified.
Let's look at the first title, Alan Wake 2.
Alan Wake 2 (1080p, 1440p, 2160p)
Game Engine: Northlight Engine
A Plague Tale: Requiem (1080p, 1440p, 2160p)
Game Engine: Proprietary
Baldur's Gate 3 (1080p, 1440p, 2160p)
Game Engine: Divinity Engine 4.0
Black Myth: Wukong (1080p, 1440p, 2160p)
Game Engine: Unreal Engine 5
Borderlands 3 (1080p, 1440p, 2160p)
Game Engine: Unreal Engine 4
CS2 (1080p, 1440p, 2160p)
Game Engine: Source 2
F1 24 (1080p, 1440p, 2160p)
Game Engine: EGO Engine 4.0
God of War (1080p, 1440p, 2160p)
Game Engine: Proprietary
Marvel's Spider-Man Remastered (1080p, 1440p, 2160p)
Game Engine: Proprietary
Shadow of the Tomb Raider DX12 (1080p, 1440p, 2160p)
Game Engine: Foundation
Shadow of War (1080p, 1440p, 2160p)
Game engine: LithTech Jupiter EX
Star Wars: Jedi Survivor (1080p, 1440p, 2160p)
Game Engine: Unreal Engine 4
Strange Brigade DX12 + Async (1080p, 1440p, 2160p)
Game Engine: Asura
Warhammer 40,000: Space Marine 2 (1080p, 1440p, 2160p)
Game Engine: Swarm Engine
Analysis - Convolutional Neural Network vs. Transformer Model (NVIDIA GeForce RTX)
Note: This section was originally planned for video format, but due to time constraints, it has been adapted to text for this review.
With the launch of the GeForce RTX 50 series, the model that underpins all DLSS technologies has evolved significantly. In this analysis we will explore the visual and performance differences between the traditional DLSS 3 model based on convolutional neural networks (CNN) and the new DLSS 4 model, which adopts a Transformer-based approach.
This analysis is something new for our usual content, as it focuses on a detailed visual comparison. The initial presentation may seem a bit scattered due to the amount of data involved, but everything will be clearly explained at the end.
Important note: during the writing of this section we did not have access to a GeForce RTX 50 unit, which limited our ability to directly test Multi Frame Generation. However, both the CNN- and Transformer-based models are backwards compatible with previous-generation GeForce RTX cards, although frame generation itself remains exclusive to the RTX 40 and 50 series.
This section seeks to provide an initial overview of the impact of this change in DLSS 4, both on visual quality and on overall performance (CNN vs. TM).
Performance Comparisons: CNN Model vs. TM (Transformer) Model
Finally, it is time to analyze performance, a more objective and simpler aspect to measure when comparing the traditional CNN-based DLSS model with the new DLSS Transformer (TM) model. Our objective is to verify whether the change to the Transformer model brings any penalty in terms of FPS or frame consistency (P1).
Games used for testing
For this comparison, we used two titles with special press builds that include support for both AI models:
- Alan Wake 2
- Cyberpunk 2077
DLSS Settings
- Performance comparison: DLSS Quality was used to evaluate the impact on FPS and overall consistency.
- Visual comparison: DLSS Balanced was chosen, since it uses a lower base resolution, to analyze how much visual quality the AI can recover.
Alan Wake 2 (Performance - CNN vs TM)
First, we review raster performance to confirm whether there are significant changes compared to the public version of the game. According to previous data obtained with the RTX 4090, the results remain consistent, with an average of 62 FPS and a P1 of 54 FPS. When testing both versions, the Transformer (TM) model showed slightly lower performance in the test scene, delivering approximately 6% fewer FPS than the CNN model. This behavior suggests that the new Transformer-based approach introduces additional load on the GPU, possibly due to the increased complexity of the AI model.
By enabling DLSS Frame Generation (2X), the pattern repeats itself: the TM model again introduces a slight performance penalty, around 5% compared to the CNN model. This reinforces the hypothesis that the new model, although more advanced, has an additional computational cost.
To assess the impact on ray tracing, the new Ultra preset of Alan Wake 2 was used, applying all of NVIDIA's technologies. Here, the performance difference between the CNN and TM models is much less pronounced. It is worth noting that, despite this slight performance penalty with TM, the combined use of DLSS, Frame Generation, and Ray Tracing still offers superior performance compared to raster-only configurations.
Cyberpunk 2077 (Performance - CNN vs TM)
The other title used to compare performance between the Convolutional Neural Network (CNN) and Transformer (TM) models is Cyberpunk 2077, which includes an option called DLSS Legacy to activate the CNN model.
When testing, it is again observed that CNN slightly outperforms TM. Without frame generation enabled, CNN delivers 7% more FPS on average. With NVIDIA's frame generation technology enabled, the difference decreases to approximately 5.5%, keeping CNN at a slight advantage.
When Ray Tracing is activated (or, more specifically, path tracing), along with upscaling and frame generation technologies, the average FPS in Cyberpunk 2077 still outperforms what is obtained with rasterization alone. This shows that, despite the performance differences between CNN and TM, the overall impact of these technologies remains positive for gamers looking for the best combination of visuals and performance (smoothness and higher-fidelity visual effects).
Finally, we come to the visual analysis, the most complex part and the one in which I have the least experience. Here we will evaluate the qualitative differences between both technologies, which requires a more detailed approach to capture how each model affects image quality.
Visual Comparison: CNN Model vs. TM (Transformer) Model
NOTE: At the time of publication, the video for this section is in the editing process.
Alan Wake 2 (Visual differences – CNN vs TM)
For this analysis, all captures were made at 4K UHD resolution and 120 FPS, with the aim of identifying visual differences between both technologies.
When comparing rasterization against DLSS upscaling in its base configuration (1x playback speed), the most notable difference is the greater fluidity achieved with DLSS, whether through CNN or TM. This is evident from the higher number of frames per second produced with upscaling.
Slowing the playback speed down to 0.25x, it is easy to identify the lower fluidity of pure rasterization due to the lower number of frames on screen. However, when you pause and examine the details, rasterization offers slightly higher definition on static textures. This detail, however, goes almost unnoticed in real time, since the difference in fluidity has a greater impact on the user's visual experience.
When activating frame generation, important observations emerge. In this first version of the Transformer model, more visual artifacts are noticeable compared to CNN, although both present this problem. This is visible in elements such as the edges of Saga Anderson's sleeve, where errors are observed with both technologies, but are more pronounced with TM.
An additional example is the scene where a crow passes over Saga Anderson: the bird's feathers show more visual artifacts with the Transformer model. These errors are specific to frame generation and are greatly reduced by disabling this feature.
However, the Transformer model presents a notable advantage over CNN when using Frame Generation. NVIDIA claimed that DLSS 4 would offer greater stability and detail in motion, something that is evident in fast scenes. For example, in a quick shot of Saga Anderson, some ghosting or "warping" is perceived with the CNN model, which distorts the character's movement. In contrast, the Transformer model almost completely eliminates this effect, making it barely noticeable even when analyzing the scene in slow motion, frame by frame.
Finally, when activating the Ultra ray tracing option, it is important to remember that all artificially generated lighting and shadow effects are replaced by ray tracing, achieving greater realism compared to conventional rasterization. The defects found with Super Resolution and frame generation also appear when using this option, without significant changes in their behavior.
Cyberpunk 2077 (Visual differences – CNN vs TM)
With all effects activated (Path Tracing, Super Resolution in Balanced mode, and Frame Generation), let's start with the details in a still image. In Cyberpunk 2077, the Transformer model shows sharper definition in certain areas compared to CNN. Although it still lags behind pure rasterization in clarity, the accuracy of these models has improved over time (going back to the RTX 20 series).
A good example of this is the "LOVE" lettering on the vending machine, where the Transformer model renders more precise detail. However, it is important to remember that rasterization does not generate shadows and lighting as realistic as those offered by Ray Tracing or path tracing.
Both models, CNN and Transformer, still present artifacts, as can be seen in the silhouettes of some NPCs. In addition, problems with HUD elements (for example, the "225M" readout) persist even with the Transformer model.
In terms of stabilization, Transformer takes the lead. For example, on the electrical transformers of a light pole, the CNN model shows ghosting and instability, while Transformer offers a much clearer representation.
Another interesting detail is found in the metal grilles: the Transformer model manages to define them better than CNN. If they appear less striking or different in color in the raster comparison, it is simply because real-time lighting is not present in that mode.
Finally, effects like smoke and reflections look considerably more realistic thanks to the path tracing, which adds a level of immersion that cannot be achieved with traditional rasterization techniques.
Conclusion - Convolutional Neural Network vs. Transformer Model
I have not yet fully explored the potential of DLSS Multi Frame Generation (visual quality), or whether fewer artifacts and better stabilization will actually be noticeable thanks to Flip Metering on the new GeForce RTX 50 cards. However, from my observations (I received the unit shortly before the embargo was lifted), it seems that artifacting remains a present issue on the GeForce RTX 5090, as it is inherent to the model itself.
What is clear is that the first version of the Transformer model offers improvements in image stabilization (less ghosting) compared to the CNN model. This aspect is relevant because, in a real gaming environment, visual stability is crucial; as we saw in the slow-motion tests, the stabilization improvements make a significant difference.
As for artifacts, the Transformer model shows an increase in them, at least in this initial stage. These imperfections will likely improve over time, but when analyzing a card at launch, it is important to consider what is available at the time.
Between the two models, I choose the Transformer model, because the stability improvement is more noticeable when playing in real time.
With this, we can move on to the next section of testing: DLSS 4 and Multi Frame Generation.
DLSS 4 Multi Frame Generation – Turbocharged three or four times (3X, 4X)
DLSS 4 Multi Frame Generation is the evolution of NVIDIA's frame generation technology. Unlike previous versions, it can generate several additional frames for each rendered frame, thanks to a new AI model (Transformer) and the Flip Metering hardware present in the GeForce RTX 50. This combination allows for a considerable increase (3X, 4X) in frame rate without drastically increasing latency. Additionally, adjustments have been made to improve visual quality in fast scenes and reduce artifacts (as we saw in the previous section), although some of them are still present in this first implementation.
Conventional DLSS Frame Generation works at 2X.
Note: DLSS MFG can also be used with the CNN models, but going forward NVIDIA will focus on improving the Transformer-based models.
NVIDIA GeForce RTX 5090 vs GeForce RTX 4090 (DLSS 4 Multi Frame Generation 3X/4X vs Frame Generation 2X)
With just a few hours left before the embargo lifts, I can only measure the performance difference (but not check for possible visual differences) between Frame Generation and Multi Frame Generation. Therefore, I will do it in a fairly simple way, taking as a reference the GeForce RTX 4090 in raster-only mode, that is, without using technologies that increase visual quality (Ray Tracing/Path Tracing).
Alan Wake 2 (DLSS 4 Multi Frame Generation)
Applying all available technologies, both at the visual level and in terms of upscaling and frame generation, the GeForce RTX 4090 (the top of the RTX 40 series) gets 71 FPS on average, while the new GeForce RTX 5090, under the same configuration, reaches 91 FPS. This represents a generational improvement of 28.16% in Alan Wake 2. Remember that the game now also has a new Ray Tracing preset (Ultra).
DLSS 4 Multi Frame Generation allows for more than double the performance offered by Frame Generation, thanks to the 3X and 4X options in the GeForce RTX 50 series. With the RTX 5090 in 3X mode, you get 85.91% more frames perceived by the end user, while in 4X mode the average of 171 FPS represents an improvement of 140.84% compared to the best the GeForce RTX 4090 can achieve. A quick sketch of how these uplifts are derived follows.
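A minimal sketch of the uplift arithmetic using the averages stated above (71 FPS on the RTX 4090, 91 and 171 FPS on the RTX 5090); the small differences versus the quoted percentages come from rounding of the underlying averages.

```python
def uplift_pct(new_fps: float, baseline_fps: float) -> float:
    return (new_fps / baseline_fps - 1) * 100

baseline = 71                               # RTX 4090 with FG 2X (stated average)
print(round(uplift_pct(91, baseline), 2))   # ~28.17% (RTX 5090, same settings)
print(round(uplift_pct(171, baseline), 2))  # ~140.85% (RTX 5090, MFG 4X)
# The quoted 85.91% for 3X mode implies an average of roughly 71 * 1.8591 ≈ 132 FPS.
```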
Due to lack of time, I have not been able to do a detailed visual comparison of DLSS 4 Multi Frame Generation, but at first glance I have not noticed any serious problems, so additional tests will have to be carried out to evaluate its image quality.
Cyberpunk 2077 (DLSS 4 Multi Frame Generation)
Cyberpunk 2077 also has a press build with DLSS 4 MFG. Under the same configuration on both cards, the GeForce RTX 5090 delivers 26.67% more traditional FPS than the GeForce RTX 4090. Using 3X mode, FPS increases by 84.44%, while 4X reaches an impressive average of 212 FPS, 135.55% more than what the RTX 4090 offers under its best conditions.
Preliminary analysis (DLSS 4 Multi Frame Generation)
While the visual quality is yet to be analyzed, the initial results and real-time experience are quite promising for the new technology exclusive to the GeForce RTX 50 series: DLSS 4 Multi Frame Generation. The improvement in performance, as well as the remarkable fluidity compared to simple rasterization (or even compared to running without RT effects), points to an auspicious future, although I prefer to maintain a healthy skepticism until I have more tests and analysis.
Consumption – Oh Boi! (NVIDIA GeForce RTX 5090 Founders Edition)
The information available for measuring consumption comes from the NVIDIA sensor, which only records the power draw of the GPU itself. This means it does not include total system consumption, the additional losses upstream of the GPU's power stages, or what is drawn through the PCIe slot. Below is a summary table of average and maximum consumption, focusing on 4K UHD (2160p) with rasterization, where the GPU is usually at its maximum load for gaming; a short sketch of how such GPU-only readings can be logged follows the table.
Game | Average consumption (W) | Maximum consumption (W) | Average consumption (W) | Maximum consumption (W) |
---|---|---|---|---|
A Plague Tale: Requiem | 549 | 562 | 416 | 423 |
Alan Wake 2 | 523 | 558 | 409 | 432 |
Baldur's Gate 3 | 475 | 516 | 384 | 403 |
Black Myth: Wukong | 509 | 519 | 398 | 407 |
Borderlands 3 | 540 | 553 | 401 | 416 |
F1 24 | 483 | 504 | 395 | 409 |
God of War | 541 | 550 | 393 | 400 |
Marvel's Spider-Man Remastered | 452 | 477 | 354 | 384 |
Shadow of the Tomb Raider | 505 | 522 | 389 | 400 |
Shadow of War | 465 | 517 | 372 | 419 |
Star Wars: Jedi Survivor | 525 | 544 | 411 | 425 |
Strange Brigade | 566 | 592 | 395 | 427 |
Warhammer 40,000: Space Marine 2 | 484 | 512 | 380 | 395 |
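For reference, here is a minimal sketch of how GPU-only power readings like these can be logged from the NVIDIA sensor via NVML (using the nvidia-ml-py package); this is an assumed tooling example, not necessarily what was used for the review.

```python
# Hypothetical power logger using NVML (pip install nvidia-ml-py).
# Reads the board power reported by the NVIDIA sensor, not whole-system draw.
import time
import pynvml

pynvml.nvmlInit()
gpu = pynvml.nvmlDeviceGetHandleByIndex(0)            # first GPU in the system

samples = []
for _ in range(60):                                   # ~60 seconds at 1 Hz
    samples.append(pynvml.nvmlDeviceGetPowerUsage(gpu) / 1000.0)  # milliwatts -> watts
    time.sleep(1)

print(f"Average: {sum(samples) / len(samples):.0f} W, Maximum: {max(samples):.0f} W")
pynvml.nvmlShutdown()
```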
Final analysis – NVIDIA GeForce RTX 5090 – A “tick” and a “tock” with Blackwell RTX 50
Taking a page from what was once Intel's "success formula" (press F), the launch of Blackwell and the GeForce RTX 5090 cards represents, in practical (but inverted) terms, a "tick" in terms of technologies that have been evolving since the appearance of the RTX 20 series. NVIDIA wants to raise the bar again with DLSS 4 Multi Frame Generation, something we should be able to test on the same day as the official launch of the GeForce RTX 5090. At the same time, it is also a “tock” in terms of the improvement of the microarchitecture, with an SM design optimized for tasks such as AI, among others.
If anyone has a better analogy, let me know; it may not be the best one (and I might even get hanged for it). To put it in a simpler way:
- The improvement in raw power is minimal, but the architecture previously used in the GeForce RTX 40 series has been optimized for modern uses (AI), and its application to games has improved.
The reason why many might criticize the lack of a "significant advancement" in raw power is simple: NVIDIA, in general terms, continues to use the same manufacturing process as the GeForce RTX 40 series, with slight optimizations, and has opted to stay on this process node. The reason? It could be a lack of competitive pressure in this segment, or simply that when the architecture was designed, TSMC's 3 nm process was not yet mature enough for mass production. Remember, the development of a GPU or a new architecture takes place years before its launch, and Blackwell made its debut in March 2024.
Before we look at raster performance, let's remember what we said at the beginning:
- 25% increase in price (MSRP).
- 33% more VRAM (32GB GDDR7 vs. 24GB GDDR6X), plus it's faster.
- Upgrade to PCIe Gen 5.0.
- 27.77% increase in TGP (575 W vs. 450 W).
- 20.83% more transistors (92.2 billion vs. 76.3 billion).
- GPU chip size 23.25% larger than AD102.
- Same manufacturing process as RTX 40 series (TSMC 4nm 4N NVIDIA Custom Process).
- Even though it is the same process, there are architectural changes (SM, RT cores, Tensor cores).
Rasterization – Slight generational improvement over the RTX 4090
I would have liked to test the performance of the new GeForce RTX 5090 in raster at 450 W, to compare "apples to apples" and see the generational difference under the same TGP. If we set aside the new AI options in the RTX 5090 for a moment, we could consider it a kind of "BFG RTX 4090" with more VRAM:
- Larger size.
- Higher cost.
- Increased consumption.
- Higher performance.
- More VRAM.
The NVIDIA GeForce RTX 5090 performs, on average, 33.06% better in rasterization at 2160p. This number is much lower than the huge jump between the RTX 3080 Ti/3090 and the RTX 4090, which was quite noticeable (with its respective price increase). One of the reasons for that big jump between the 3090 and 4090 was the change in manufacturing process.
Compared to the GeForce RTX 4080 SUPER, the RTX 5090 offers 81.86% more performance in pure raster. Rasterization will continue to be a relevant factor when evaluating video cards, but the way AAA games are evolving means it is not the only important element in the analysis.
Before we move on to a raster cost-per-frame study and then to first impressions of DLSS 4 Multi Frame Generation, I want to raise another perspective, which some will agree with and others won’t. Generally, when we talk about a high-end card, the price increase is not linear to the performance improvement. That is, if the price of the RTX 5090 increases by 25%, the raw power could have gone up by 20%, for example. Another way of looking at it is that NVIDIA could have launched the card at $2499 (a 56.28% increase in price) and only delivered 33.06% more raw performance, since in high-end products there is no linear correlation between price increase and performance increase. With the lack of competition at the top of the pyramid, such a scenario would not have been difficult to imagine. Fortunately, that is not the case, and the 33.06% improvement is associated with a 25% increase in price. It is a slight improvement (not counting the increase in VRAM and its new technology).
Ultimately, the generational gain in raw power is minimal, but for those who enjoy AAA games, the RTX 5090 can offer much more than just raster (mainly thanks to DLSS 4 MFG).
Cost per FPS – 2160p – NVIDIA GeForce RTX 5090 Founders Edition
The GeForce RTX 4070 SUPER is still the price/performance card, regardless of the RTX 50 series' DLSS 4 MFG. Something that a cost-per-FPS analysis does not adequately reflect is the experience each graphics card offers at its target resolution. In short, although the GeForce RTX 4070 SUPER has a lower cost per frame (lower is better), it is not a good option for 2160p, as it does not reach enough FPS for a satisfactory experience. The GeForce RTX 4080 SUPER also features a lower cost per frame than the RTX 5090 and, unlike the 4070 SUPER, may be suitable for 2160p.
The important thing about the table is that, without AI involved, the GeForce RTX 5090 offers a lower cost per frame than the GeForce RTX 4090. It is a small victory for the RTX 5090, but now we need to talk about the elephant in the room: DLSS Multi Frame Generation (the crème de la crème). A short sketch of how the cost-per-frame figures are computed follows the table.
NVIDIA GeForce RTX 5090 – XanxoGaming cost per frame
Graphics card (MSRP) | Cost per frame (2160p) |
---|---|
GeForce RTX 4070 Super ($599) | $7.88 |
GeForce RTX 4060 Dual OC ($299) | $8.03 |
GeForce RTX 4070 ($549) | $8.30 |
GeForce RTX 4060 Ti ($399) | $8.52 |
GeForce RTX 4070 Ti Super TUF ($849) | $9.21 |
GeForce RTX 4080 Super ($999) | $9.29 |
GeForce RTX 5090 ($1999) | $10.23 |
GeForce RTX 4090 ($1599) | $10.88 |
GeForce RTX 3080 Ti ($1199) | $14.69 |
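As noted above, cost per frame is simply MSRP divided by the average FPS measured at 2160p; a minimal sketch with placeholder FPS values (not the review's measured data):

```python
def cost_per_frame(msrp_usd: float, avg_fps: float) -> float:
    """Dollars paid per average frame per second at the target resolution."""
    return msrp_usd / avg_fps

# Placeholder FPS values for illustration only; the review's measured averages differ.
examples = {"GeForce RTX 5090": (1999, 195.0), "GeForce RTX 4090": (1599, 147.0)}
for card, (price, fps) in examples.items():
    print(f"{card}: ${cost_per_frame(price, fps):.2f} per frame")
```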
DLSS Multi Frame Generation – The added value of the GeForce RTX 50 series since its launch
My recommendation is to test the technology, if you have the opportunity, before dismissing its usefulness. On a personal level, I have used DLSS Super Resolution and Frame Generation in various AAA games over the last year, and I think they improve the gaming experience substantially, despite some imperfections that may arise.
Until now, I used to disable Ray Tracing/Path Tracing in my games to keep FPS high on my 4K UHD monitor and achieve good fluidity; the performance penalty for enabling these features was not worth the visual bonus. However, DLSS 4 Multi Frame Generation allows you to use all these functions simultaneously, including Ray/Path Tracing, with enviable fluidity (4X). As I mentioned in the corresponding section, I still need to do an image quality analysis of DLSS 4 MFG, but the performance numbers and initial impression are very promising. With the 3X and 4X modes, the perceived frames increase considerably, so gaming at 4K UHD and 240 Hz in AAA titles with everything maxed out is starting to become a reality.
Of course, the Transformer model introduced with the GeForce RTX 50 series still has imperfections that some users will notice if they look closely or play back footage at slower speed. My suggestion to NVIDIA is to continue training the model to reduce or eliminate these imperfections in future updates. Overall, the first impression is quite positive.
Final Words – Neural Rendering and RTX Mega Geometry
I have not gone into depth on these two new features of the GeForce RTX 50 series graphics cards because they are still quite abstract in practice. According to NVIDIA's guidance to reviewers, "Mega Geometry" is already present in Alan Wake 2, but measuring its impact or visual improvement has not been possible yet. Hogwarts Legacy features ray-traced geometry, but it should not be equivalent to RTX Mega Geometry.
As for Neural Rendering, it is also a novelty of the GeForce RTX 50, although it remains in an initial phase (just an announcement?). During the Editor's Day, a demo (Zorah) showcasing several new technologies was shown and looked impressive, and I don't doubt (or at least I hope) that the first game with Neural Rendering and Neural Materials will arrive soon. Until then, we can't do any real testing.
Technical Demo: Zorah
If we compare this launch with that of the GeForce RTX 4090, it is evident that Blackwell for gaming does not reach the same level of impact that Ada Lovelace had in its time. However, as more games incorporate support for DLSS 4 Multi Frame Generation, the perception of Blackwell in gaming could improve dramatically. NVIDIA promises 75 DLSS 4-compatible games on day 0, although for some titles this could simply mean updating the CNN model to the Transformer one.
Due to time constraints, I was unable to test AI performance more extensively, but the card, with its new floating-point formats and large amount of VRAM, should perform excellently in that area.
Long day…
As for consumption, it is true that it has skyrocketed (550-575 W TGP), but NVIDIA's engineers have developed a heat dissipation system with a cutting-edge design. Despite the high consumption, the card can operate normally, even in the hot summer of Lima, Peru.
For those who already own a GeForce RTX 4090, the decision to upgrade to the RTX 5090 will depend on whether you are happy with your current GPU. You may want to squeeze the most out of your visual experience by enabling advanced features like Ray/Path Tracing with DLSS 4 MFG to achieve much higher fluidity. However, I think the main interest of many users will be focused on the upcoming GeForce RTX 5080 (and eventually the RTX 5070), where the cost/performance ratio would make them much more attractive for that market segment.
NVIDIA GeForce RTX 5090 Founders Edition - Review
- Economical performance
- Temperatures
- Noise
- Consumption
- Price
- Innovation
Overall
Summary
The NVIDIA GeForce RTX 5090 arrives as the most powerful (and most expensive) gaming graphics card of the moment, although its rasterization gains over the previous generation are limited. Its performance is approximately 33.06% higher than the GeForce RTX 4090, accompanied by a price increase close to 25%.
Another of its novelties is the increase in VRAM to 32 GB, which makes it very attractive for professional content creators, although at the cost of high energy consumption.
On the other hand, the tests with DLSS 4 Multi Frame Generation demonstrate the potential of AI applied to gaming, offering notable and promising improvements.
Pros
-The most powerful consumer-grade graphics card on the planet.
-DLSS 4 Multi Frame Generation looks promising based on our tests.
-32GB of VRAM makes it a good choice for professional content creators.
-Founders Edition has cutting-edge engineering in terms of dissipation (uses 2 slots instead of 3).
-Now comes with DisplayPort 2.1b with UHBR20 and HDMI 2.1b.
Cons
-High price (+25% more than its predecessor).
-Quite high consumption.
-Limited games using DLSS 4 MFG at launch (5 games?).