Powerful and reliable programming model and computing toolkit

NVIDIA CUDA Toolkit

  -  3.2 GB  -  Freeware
  • Latest Version

    NVIDIA CUDA Toolkit 12.8.0 (for Windows 11) LATEST

  • Review by

    Daniel Leblanc

  • Operating System

    Windows 11

  • Author / Product

    NVIDIA Corporation

  • Filename

    cuda_12.8.0_571.96_windows.exe

NVIDIA CUDA Toolkit provides a development environment for creating high-performance GPU-accelerated applications. With the CUDA Toolkit, you can develop, optimize, and deploy your applications on GPU-accelerated embedded systems, desktop workstations, enterprise data centers, cloud-based platforms, and HPC supercomputers. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler, and a runtime library to deploy your application.

GPU-accelerated CUDA libraries enable drop-in acceleration across multiple domains such as linear algebra, image and video processing, deep learning, and graph analytics. For developing custom algorithms, you can use available integrations with commonly used languages and numerical packages, as well as well-documented development APIs.

Your CUDA applications can be deployed across all NVIDIA GPU families available on-premise and on GPU instances in the cloud. Using built-in capabilities for distributing computations across multi-GPU configurations, scientists and researchers can develop applications that scale from single GPU workstations to cloud installations with thousands of GPUs.

The toolkit also includes an IDE with graphical and command-line tools for debugging, identifying performance bottlenecks on the GPU and CPU, and obtaining context-sensitive optimization guidance. You can develop applications in a programming language you already know, including C, C++, Fortran, and Python.

To get started, browse the online getting-started resources, optimization guides, and illustrative examples, and collaborate with the rapidly growing developer community. Download NVIDIA CUDA Toolkit for PC today!

Features and Highlights
  • GPU Timestamp: Start timestamp
  • Method: GPU method name. This is either "memcpy*" for memory copies or the name of a GPU kernel. Memory copies carry a suffix describing the type of memory transfer, e.g. "memcpyDtoHasync" means an asynchronous transfer from device memory to host memory
  • GPU Time: Execution time of the method on the GPU
  • CPU Time: The sum of GPU time and the CPU overhead to launch the method. At the driver-generated data level, CPU time is only the CPU overhead to launch the method for non-blocking methods; for blocking methods it is the sum of GPU time and CPU overhead. All kernel launches are non-blocking by default, but become blocking if any profiler counters are enabled. Asynchronous memory copy requests in different streams are non-blocking
  • Stream Id: Identification number for the stream
  • Columns shown only for kernel methods:
  • Occupancy: The ratio of active warps per multiprocessor to the maximum number of active warps
  • Profiler counters: Refer to the profiler counters section for the list of supported counters
  • grid size: Number of blocks in the grid along the X, Y, and Z dimensions, shown as [num_blocks_X num_blocks_Y num_blocks_Z] in a single column
  • block size: Number of threads in a block along the X, Y, and Z dimensions, shown as [num_threads_X num_threads_Y num_threads_Z] in a single column
  • dyn smem per block: Dynamic shared memory size per block in bytes
  • sta smem per block: Static shared memory size per block in bytes
  • reg per thread: Number of registers per thread
  • Columns shown only for memcpy methods:
  • mem transfer size: Memory transfer size in bytes
  • host mem transfer type: Specifies whether a memory transfer uses "Pageable" or "Page-locked" memory

What's new in this version:

New Features:
This release adds compiler support for the following NVIDIA Blackwell GPU architectures:
- SM_100
- SM_101
- SM_120

Tegra-Specific:
- Added MPS support for DRIVE OS QNX
- Added support for GCC 13.2.0
- Added support for Unified Virtual Memory (UVM) with Extended GPU Memory (EGM) arrays

Hopper Confidential Computing:
- Added multi-GPU support for protected PCIe mode
- Added key rotation capability for single GPU passthrough mode

NVML Updates:
- Fixed per-process memory usage reporting for Docker containers using Open GPU Kernel Module drivers
- Added support for DRAM encryption query and control (Blackwell)
- Added checkpoint/restore functionality for userspace applications
- Added support for Blackwell reduced bandwidth mode (RBM)

CUDA Graphs:
- Added conditional execution features for CUDA Graphs:
  - ELSE graph support for IF nodes
  - SWITCH node support
- Introduced additional performance optimizations

CUDA Usermode Driver (UMD):
- Added PCIe device ID to CUDA device properties
- Added cudaStreamGetDevice and cuStreamGetDevice APIs to retrieve the device associated with a CUDA stream
- Added CUDA support for the INT101010 texture/surface format

Userspace Checkpoint and Restore:
- Added cross-system process migration support to enable process restoration on a computer different from the one where it was checkpointed
- Added a new driver API for checkpoint/restore operations
- Added batch CUDA asynchronous memory copy APIs (cuMemcpyBatchAsync and cuMemcpyBatch3DAsync) for variable-sized transfers between multiple source and destination buffers

CUDA Compiler:
Added two new nvcc flags:
- static-global-template-stub {true|false}: Controls host side linkage for global/device/constant/managed templates in whole program mode
- device-entity-has-hidden-visibility {true|false}: Controls ELF visibility of global/device/constant/managed symbols
- The current default for both flags is false; these defaults will change to true in a future release. For detailed information about these flags and their impact on existing programs, refer to nvcc --help or the online CUDA documentation.
- libNVVM now supports compilation for the Blackwell family of architectures. Compilation of compute capabilities compute_100 and greater (Blackwell and future architectures) uses an updated NVVM IR dialect, based on LLVM 18.1.8 IR (the “modern” dialect) that differs from the older dialect used for pre-Blackwell architectures (a compute capability less than compute_100). NVVM IR bitcode using the older dialect generated for pre-Blackwell architectures can be used to target Blackwell and later architectures, with the exception of debug metadata.
- nvdisasm now supports emitting JSON-formatted SASS disassembly

CUDA Developer Tools:
- For changes to nvprof and Visual Profiler, see the changelog
- For new features, improvements, and bug fixes in Nsight Systems, see the changelog
- For new features, improvements, and bug fixes in Nsight Visual Studio Edition, see the changelog
- For new features, improvements, and bug fixes in CUPTI, see the changelog
- For new features, improvements, and bug fixes in Nsight Compute, see the changelog
- For new features, improvements, and bug fixes in Compute Sanitizer, see the changelog
- For new features, improvements, and bug fixes in CUDA-GDB, see the changelog

Fixed:
CUDA Compiler:
- Resolved compilation issues where code that successfully built with GCC would fail to compile with NVCC on Ubuntu 24.04. This improves cross-compiler compatibility and ensures consistent behavior between GCC and NVIDIA’s CUDA compiler toolchain
- Fixed incorrect handling of C++20 requires expressions, restoring proper functionality and standard compliance. This ensures that compile-time requirements on template parameters now evaluate correctly
- Fixed an issue where NVCC (NVIDIA Compiler Driver) was ignoring the global namespace prefix of a type and thus incorrectly resolving it to a local type that shares the same name
- Fixed a compilation error in NVCC that occurred when code contained three or more nested lambda expressions with variadic arguments. The compiler now properly handles deeply nested variadic lambdas
- Fixed a limitation in NVRTC that caused compilation failures when kernel functions had long identifiers. The runtime compiler now properly handles kernel functions with extended name lengths
- Resolved an issue where template alias resolution could produce incorrect template instances. Previously, when an alias template and its underlying type-id template had different default arguments, the compiler would sometimes incorrectly omit the differing default argument when substituting the alias with its underlying type. This resulted in references to incorrect template instances. The template argument resolution now properly preserves all necessary default arguments during alias substitution
- Fixed invalid error reporting when using variables as template arguments from outside their visible scope. This resolves incorrect diagnostic messages particularly affecting cases involving braced initializers. The compiler now properly validates scope accessibility for template arguments
- Added the ability to cancel ongoing NVRTC compilations through callback mechanisms. This new feature allows developers to safely interrupt and terminate compilation processes programmatically
- The semantics of the -expt-relaxed-constexpr nvcc flag are now documented in the “C++ Language Support” section of the CUDA Programming Guide