by Maxim Knepfle, CTO Tygron
For over a decade, users of the Tygron Platform have run a wide range of simulations, such as flooding scenarios, heatwaves, climate adaptation projects and AI models, all powered by our platform’s GPUs. Thanks to this success, the number of simulations has risen to nearly 1,000,000 per year. To meet this demand and keep delivering better results for our users, we need to run more simulations, faster, and in a cost-effective way.
However, the latest GPU hardware, primarily manufactured in Taiwan, has become increasingly costly and difficult to acquire, driven by intense global competition in the AI sector.
This situation led us to reassess our development roadmap, investigating the feasibility of running 64-bit numeric models on more cost-effective GPUs optimized for 32-bit performance.
The answer came in April 2025, when NVIDIA launched a groundbreaking series of datacenter GPUs known as the RTX PRO 6000 (Server Edition), built on the same Blackwell architecture as the more costly and scarce B100. The new RTX PRO 6000 offers slightly better 32-bit performance (126 vs 124 TFLOPS) but has fewer 64-bit cores than the B100.
Performance Bottlenecks
When profiling our Water Module in NVIDIA Nsight Compute, we had already identified a significant performance bottleneck in solving the Saint-Venant equations, which require 64-bit values. Using 32-bit values occasionally leads to a phenomenon known as Catastrophic Cancellation.
Catastrophic Cancellation occurs when significant digits are lost in a floating-point calculation, typically when subtracting nearly equal numbers or adding numbers of very different magnitude, resulting in a less accurate outcome. In our case, it shows up as “leaking” water volumes.
Example of Catastrophic Cancellation
Consider two 32-bit numbers: 1000 and 0.00001
When you add them together: 1000 + 0.00001 = 1000
Instead of yielding the correct 1000.00001, the operation drops the small term entirely, because a 32-bit float carries only about seven significant decimal digits. Over many iterations, this compromises numerical stability.
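The effect can be reproduced in a few lines of Python. This is a minimal sketch, not Tygron code: the helper f32 is a hypothetical name introduced here, and it emulates 32-bit arithmetic by rounding each 64-bit result to the nearest 32-bit float. The 100,000-step loop is an assumed stand-in for a long-running simulation.

```python
import struct


def f32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


volume = f32(1000.0)    # large water volume (illustrative)
inflow = f32(0.00001)   # tiny inflow per time step (illustrative)

# A single 32-bit addition: the small term is absorbed completely.
single_step = f32(volume + inflow)
print("unchanged after one step:", single_step == volume)

# Repeat the addition, as a simulation would over many time steps.
steps = 100_000
total32 = volume
total64 = volume
for _ in range(steps):
    total32 = f32(total32 + inflow)  # emulated 32-bit accumulation
    total64 = total64 + inflow       # 64-bit accumulation

leaked = total64 - total32
print("32-bit total:", total32)
print("64-bit total:", total64)
print("leaked volume:", leaked)
```

In this sketch the 32-bit total never moves away from 1000, while the 64-bit total grows by roughly 1. That missing amount is the “leak”.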
Solutions from the Past
To tackle this issue, we revisited older work by William Kahan, winner of the Turing Award (the “Nobel Prize of Computing”) for his contributions to numerical analysis. In the 1960s and 1970s, computing power was limited and expensive, prompting Kahan to develop algorithms that compensate for rounding errors in summations. Later, he also addressed differences of products. For instance, 33962.035 * -30438.8 - 41563.4 * -24871.969 yields an incorrect result in 32-bit, because the two products are nearly equal (each about -1.03 billion) and their small difference is lost in rounding.
Kahan’s algorithm solves this issue, but it requires a Fused Multiply-Add (FMA) operation, which not every CPU provides in hardware. GPUs, however, support FMA natively, allowing for a correct implementation of the algorithm.
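Kahan’s compensated summation can be sketched as follows. This is an illustrative Python version, not the Water Module code: f32, naive_sum32 and kahan_sum32 are hypothetical names, and 32-bit arithmetic is emulated by rounding every intermediate result to a 32-bit float. The idea is to keep a second variable that carries the rounding error of each addition into the next one.

```python
import struct


def f32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def naive_sum32(values):
    """Plain left-to-right summation in emulated 32-bit arithmetic."""
    total = 0.0
    for v in values:
        total = f32(total + f32(v))
    return total


def kahan_sum32(values):
    """Kahan compensated summation in emulated 32-bit arithmetic."""
    total = 0.0
    comp = 0.0  # running compensation: low-order bits lost so far
    for v in values:
        y = f32(f32(v) - comp)          # apply the correction to the new term
        t = f32(total + y)              # low-order bits of y may be lost here
        comp = f32(f32(t - total) - y)  # recover what was lost (negated)
        total = t
    return total


# One large value followed by many tiny ones (illustrative data).
values = [1000.0] + [0.00001] * 100_000

naive = naive_sum32(values)
compensated = kahan_sum32(values)
print("naive 32-bit sum:      ", naive)
print("compensated 32-bit sum:", compensated)
```

With this data the naive sum stays at 1000, while the compensated sum lands close to the expected value of about 1001, at the cost of a few extra 32-bit operations per term.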
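A minimal sketch of Kahan’s difference-of-products algorithm, applied to the numbers from the example above, could look like this. It is written in Python for readability and is not the GPU implementation: f32, fma32, naive_difference32 and kahan_difference32 are hypothetical names, and the FMA is only emulated (the product of two 32-bit floats is exact in 64-bit, so the emulation rounds the sum to 64-bit and then to 32-bit, which can differ from a true hardware FMA in rare double-rounding cases).

```python
import struct
from fractions import Fraction


def f32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def fma32(a: float, b: float, c: float) -> float:
    """Emulated 32-bit fused multiply-add (a * b + c); see caveat in the text."""
    return f32(a * b + c)


def naive_difference32(a, b, c, d):
    """a*b - c*d with every intermediate rounded to 32-bit."""
    return f32(f32(a * b) - f32(c * d))


def kahan_difference32(a, b, c, d):
    """a*b - c*d using Kahan's FMA-based error compensation."""
    cd = f32(c * d)          # rounded product
    err = fma32(-c, d, cd)   # cd - c*d: the rounding error of cd
    dop = fma32(a, b, -cd)   # a*b - cd, rounded only at the end
    return f32(dop + err)    # add the recovered error back in


# Inputs from the example, first rounded to 32-bit as they would be stored.
a, b, c, d = (f32(v) for v in (33962.035, -30438.8, 41563.4, -24871.969))

# Reference value computed with exact rational arithmetic.
exact = float(Fraction(a) * Fraction(b) - Fraction(c) * Fraction(d))
naive = naive_difference32(a, b, c, d)
kahan = kahan_difference32(a, b, c, d)
print("exact:", exact)
print("naive:", naive)
print("kahan:", kahan)
```

For these inputs the exact difference is about -75.17. The naive 32-bit version is off by tens of units, because each product is rounded to a multiple of 64 before the subtraction, while the compensated version stays within 32-bit rounding error of the reference.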
Integration and Testing
The next step involved integrating these new algorithms into the Water Module while ensuring that the answers remained consistent. Our development process, rooted in principles from David Farley’s Modern Software Engineering, has led us to create thousands of unit tests over the years for a variety of water simulations, including floods, rainfall, lakes and many more.
Through a test-driven development approach, we modified small parts of the original code to include Kahan’s algorithms, then ran the amended code against all unit tests to verify stability. After several iterations (including some dead ends), we successfully solved most of the Saint-Venant equations using 16-bit and 32-bit operations that produced results identical to those of the original 64-bit implementation.
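The kind of check described above can be sketched as a small regression test that compares a 32-bit code path against a 64-bit reference. Everything in this sketch is an assumption made for illustration: the function names, the three scenarios and their numbers, and the relative tolerance of 1e-6 are hypothetical and do not come from the actual Tygron test suite.

```python
import math
import struct


def f32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def total_volume_64(initial, inflows):
    """Reference path: accurately rounded 64-bit sum."""
    return math.fsum([initial, *inflows])


def total_volume_32(initial, inflows):
    """Candidate path: emulated 32-bit sum with Kahan compensation."""
    total, comp = f32(initial), 0.0
    for v in inflows:
        y = f32(f32(v) - comp)
        t = f32(total + y)
        comp = f32(f32(t - total) - y)
        total = t
    return total


# Hypothetical scenarios: (initial volume, inflow per time step).
SCENARIOS = {
    "rainfall": (1000.0, [0.00001] * 50_000),
    "flood": (250.0, [0.5, 0.25, 0.125] * 1_000),
    "lake": (1_000_000.0, [0.01] * 10_000),
}


def test_32bit_matches_64bit_reference(rel_tol=1e-6):
    """Fail if any scenario drifts from the 64-bit reference."""
    for name, (initial, inflows) in SCENARIOS.items():
        expected = total_volume_64(initial, inflows)
        actual = total_volume_32(initial, inflows)
        assert math.isclose(actual, expected, rel_tol=rel_tol), name


test_32bit_matches_64bit_reference()
print("all scenarios match the 64-bit reference")
```

Keeping the 64-bit path as the reference means each small change to the 32-bit path can be accepted or rejected immediately, which is what makes an iterative, test-driven migration practical.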
Performance Gains
A few months later, we proudly welcomed our first RTX PRO 6000 server, smashing records with an astonishing performance of over 1,000,000,000,000,000 floating-point operations per second (1 petaFLOPS)!
After further testing on our Preview Server, we can now confidently state that our Water Module, along with all other simulations, maximizes the use of 32-bit (or lower) operations, leading to remarkable performance gains:
- Hardware upgrade: 2x faster
- Software upgrade: 2x faster
- Total performance gain: 4x faster*
* In some cases, such as heatwave calculations, performance gains can reach 6 to 8 times faster.
Together with the enhanced memory capacity of the new GPUs, these improvements establish a robust foundation for the Tygron Platform, offering users faster high-precision results and a broader range of possibilities!