The Titan RTX launch was decidedly unceremonious. Members of the tech press knew the card was coming but didn’t receive one to test. Nvidia undoubtedly knew its message would be obscured by comparisons drawn between Titan RTX and the other TU102-based card, GeForce RTX 2080 Ti, in games. Based on a complete TU102 processor, Titan RTX was bound to be faster than the GeForce in every benchmark, regardless of discipline. However, its eye-watering $2,500 price tag would be difficult to justify for entertainment alone.


But even as gamers ponder the effect of an extra four Streaming Multiprocessors on their frame rates, we all know that Titan RTX wasn’t intended for those folks. Of course, we’re still going to run it through our suite of game benchmarks. However, Nvidia says this card was designed for “AI researchers, deep learning developers, data scientists, content creators, and artists.”

Participants in those segments with especially deep pockets often spring for Tesla-based platforms or Quadro cards sporting certified drivers. Those aren’t always necessary, though. “Cost-sensitive” professionals working in smaller shops often find themselves somewhere in the middle: they need more than a GeForce, but can’t justify $6,300 for an equivalent Quadro RTX 6000.

Out of necessity, our test suite is expanding to include professional visualization and deep learning metrics, in addition to the power consumption analysis we like to perform. Get ready for a three-way face-off between Titan RTX, Titan V, and Titan Xp (with a bit of GeForce RTX 2080 Ti sprinkled in).

PROS

  • 24GB of memory is ideal for large professional and deep learning workloads
  • Improved Tensor cores benefit inferencing performance specifically
  • Excellent 4K gaming frame rates
  • NVLink support (which Titan V lacks)
  • Attractive design


CONS

  • $2,500 price limits appeal to professionals with deep pockets
  • Axial fan design exhausts (a lot of) heat into your case
  • Poor FP64 capabilities compared to Titan V



VERDICT

Titan RTX is the right card for the right customer. It's a no-brainer if you're working with large geometry models, training neural networks with large batch sizes, or inferencing trained networks using frameworks like TensorRT that support the hardware's features. Gaming on Titan RTX doesn't make as much sense when you could have two GeForce RTX 2080 Tis for a similar price.

Meet Titan RTX: It Starts With A Complete TU102

The Tom’s Hardware audience should be well-acquainted with Nvidia’s TU102 GPU by now: it’s the engine at the heart of GeForce RTX 2080 Ti, composed of 18.6 billion transistors and measuring 754 square millimeters.

As it appears in the 2080 Ti, though, TU102 features 68 active Streaming Multiprocessors. Four of the chip’s 72 are turned off. One of its 32-bit memory controllers is also disabled, taking eight ROPs and 512KB of L2 cache with it.

Titan RTX is based on the same processor, but with every block active. That means the card boasts a GPU with 72 SMs, 4,608 CUDA cores, 576 Tensor cores, 72 RT cores, 288 texture units, and 36 PolyMorph engines.

Not only does Titan RTX sport more CUDA cores than GeForce RTX 2080 Ti, it also offers a higher GPU Boost clock rating (1,770 MHz vs. 1,635 MHz). As such, its peak single-precision rate increases to 16.3 TFLOPS.
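Those headline numbers fall straight out of the unit counts and clocks above. As a quick sanity check, here’s a back-of-the-envelope calculation in Python, assuming 64 FP32 CUDA cores per Turing SM and two FLOPs per core per clock from fused multiply-add:

    # Peak FP32 throughput = CUDA cores * 2 FLOPs per clock (FMA) * boost clock
    def peak_fp32_tflops(sms, boost_ghz, cores_per_sm=64):
        return sms * cores_per_sm * 2 * boost_ghz / 1000.0

    print(peak_fp32_tflops(72, 1.770))  # Titan RTX: ~16.3 TFLOPS
    print(peak_fp32_tflops(68, 1.635))  # GeForce RTX 2080 Ti FE: ~14.2 TFLOPS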


Each SM does contain a pair of FP64-capable CUDA cores as well, yielding a double-precision rate that’s 1/32 of TU102’s FP32 performance, or 0.51 TFLOPS. This is one area where Titan RTX loses big to its predecessor. Titan V’s GV100 processor is better in the HPC space thanks to 6.9 TFLOPS peak FP64 performance (half of its single-precision rate). A quick run through SiSoftware’s Sandra GPGPU Arithmetic benchmark confirms Titan V’s strength, along with the mixed-precision support inherent to Turing and Volta, which Pascal lacks.
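To put that gap in perspective, here’s a small sketch using the same style of math: Titan RTX at the 1/32 rate, and Titan V at the half rate mentioned above (which implies roughly 13.8 TFLOPS of FP32 for GV100):

    # FP64 as a fraction of FP32: 1/32 on TU102, 1/2 on GV100
    titan_rtx_fp64 = 16.3 / 32    # ~0.51 TFLOPS
    titan_v_fp64   = 13.8 / 2     # ~6.9 TFLOPS (13.8 TFLOPS FP32 implied above)
    print(titan_v_fp64 / titan_rtx_fp64)  # Titan V holds a ~13.5x FP64 advantage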

The GPU’s GPCs are fed by 12 32-bit GDDR6 memory controllers, each attached to an eight-ROP cluster and 512KB of L2 cache, yielding an aggregate 384-bit memory bus, 96 ROPs, and 6MB of L2 cache. At the same 14 Gb/s data rate, that one extra memory controller buys Titan RTX about 9% more memory bandwidth than GeForce RTX 2080 Ti.
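The 9% figure is simple to verify: both cards run GDDR6 at 14 Gb/s per pin, so peak bandwidth scales directly with the number of active 32-bit controllers (12 here versus 11 on the 2080 Ti):

    # Peak bandwidth (GB/s) = controllers * 32 bits * 14 Gb/s per pin / 8 bits per byte
    def peak_bandwidth_gbps(controllers, data_rate_gbps=14):
        return controllers * 32 * data_rate_gbps / 8

    titan_rtx = peak_bandwidth_gbps(12)   # 672 GB/s
    rtx2080ti = peak_bandwidth_gbps(11)   # 616 GB/s
    print(titan_rtx / rtx2080ti - 1)      # ~0.09, or about 9% more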


GeForce RTX 2080 Ti Founders Edition utilizes Micron’s MT61K256M32JE-14:A modules, but Micron doesn’t have any 16Gb ICs in its parts catalog. Samsung, on the other hand, does offer a higher-density K4ZAF325BM-HC14 module with a 14 Gb/s data rate. Twelve of them give Titan RTX its 24GB capacity and 672 GB/s peak throughput.
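The capacity math is just as straightforward: twelve 16Gb (2GB) packages, one per 32-bit controller:

    # 12 ICs x 16 Gb each, divided by 8 bits per byte
    print(12 * 16 / 8)  # 24.0 GB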


Lots of extra memory, a GPU with more active resources, and faster clock rates necessitate a higher thermal design power rating. Whereas GeForce RTX 2080 Ti Founders Edition is specified at 260W, Titan RTX is a 280W card. That 20W increase is no problem at all for the pair of eight-pin auxiliary power connectors found along the top edge, nor is it a challenge for Nvidia’s power supply and thermal solution, both of which appear identical to those of GeForce RTX 2080 Ti.
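That rating still leaves plenty of headroom on the input side. Assuming the usual PCI-SIG limits of 150W per eight-pin connector and 75W from the slot (norms we’re applying here, not figures from Nvidia’s spec sheet), the available budget comfortably exceeds the board’s 280W:

    # Rough power budget check (assumes PCI-SIG norms: 150W per 8-pin, 75W from slot)
    available_w = 2 * 150 + 75     # 375W
    board_power_w = 280
    print(available_w - board_power_w)  # ~95W of headroom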

As on the 2080 Ti Founders Edition, we count three phases for Titan RTX's GDDR6 memory, along with a corresponding PWM controller, up front. That leaves 13 voltage regulation circuits for the GPU. Five phases are fed by the eight-pin connectors and doubled; with two control loops per phase, that's 5 × 2 = 10 regulation circuits. The remaining three phases, to the left of the GPU, are fed by the motherboard's PCIe slot and aren't doubled. That gives us Nvidia's lucky number 13 (along with a smart load distribution scheme). Of course, implementing all of this well requires the right components...
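Read as a tally, the arrangement works out like this (our interpretation of the board layout, not an official breakdown):

    # GPU rail: 5 doubled phases (2 circuits each) + 3 undoubled phases
    gpu_circuits = 5 * 2 + 3 * 1   # 13 voltage regulation circuits
    memory_phases = 3              # separate GDDR6 rail with its own controller
    print(gpu_circuits, memory_phases)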

Front and center in this design is uPI's uP9512, an eight-phase buck controller built specifically to support next-gen GPUs. Per uPI, "the uP9512 provides programmable output voltage and active voltage positioning functions to adjust the output voltage as a function of the load current, so it is optimally positioned for a load current transient."

The uP9512 supports Nvidia's Open Voltage Regulator Type 4i+ technology with PWMVID. This input is buffered and filtered to produce a very accurate reference voltage. The output voltage is then precisely controlled to the reference input. An integrated SMBus interface offers enough flexibility to optimize performance and efficiency, while also facilitating communication with the appropriate software. All 13 voltage regulation circuits are equipped with an ON Semiconductor FDMF3170 Smart Power Stage module with integrated PowerTrench MOSFETs and driver ICs.

Samsung’s K4ZAF325BM-HC14 memory ICs are powered by three phases coming from a second uP9512. The same FDMF3170 Smart Power Stage modules crop up yet again. The 470nH coils offer greater inductance than the ones found on the GPU power phases, but they're identical in physical dimensions.


Under the hood, Titan RTX’s thermal solution is also the same as what we found on GeForce RTX 2080 Ti. A full-length vapor chamber covers the PCB and is topped with an aluminum fin stack. A shroud over the heat sink houses two 8.5cm axial fans with 13 blades each. These fans blow through the fins and exhaust waste heat out the card’s top and bottom edges. Although we don’t necessarily like that Nvidia recirculates hot air with its Turing-based reference coolers, their performance is admittedly superior to older blower-style configurations.