Powerful and reliable programming model and computing toolkit

NVIDIA CUDA Toolkit

NVIDIA CUDA Toolkit 12.0.1 (for Windows 10)

3.1 GB  -  Freeware




What's new in this version:

New meta-packages for Linux installation:
- cuda-toolkit
  - Installs all CUDA Toolkit packages required to develop CUDA applications
  - Handles upgrading to the latest version of CUDA when it’s released
  - Does not include the driver
- cuda-toolkit-12
  - Installs all CUDA Toolkit packages required to develop CUDA applications
  - Handles upgrading to the next 12.x version of CUDA when it’s released
  - Does not include the driver

A new CUDA API for enabling mini core dumps programmatically is now available.

CUDA Compilers:
- NVCC adds support for the following host compilers: GCC 12.2, NVC++ 22.11, Clang 15.0, and VS2022 17.4
- Breakpoint and single-stepping behavior for multi-line statements in device code has been improved when the code is compiled with nvcc using a gcc/clang host compiler, or with NVRTC on non-Windows platforms. The debugger now correctly breaks and single-steps on each source line of a multi-line statement.
- PTX exposes a new special register in the public ISA that can be used to query the total size of shared memory, including both user shared memory and software-reserved shared memory.
- NVCC and NVRTC now show the preprocessed source line and column information in diagnostics to help users understand the message and identify the issue that caused it. This information can be turned off with --brief-diagnostics=true.

CUDA Developer Tools:
- For changes to nvprof and Visual Profiler, see the changelog
- For new features, improvements, and bug fixes in CUPTI, see the changelog
- For new features, improvements, and bug fixes in Nsight Compute, see the changelog
- For new features, improvements, and bug fixes in Compute Sanitizer, see the changelog
- For new features, improvements, and bug fixes in CUDA-GDB, see the changelog

Deprecated or Dropped Features:
- Features deprecated in the current release of the CUDA software still work in the current release, but their documentation may have been removed, and they will become officially unsupported in a future release. We recommend that developers employ alternative solutions to these features in their software.

General CUDA:
- CentOS Linux 8 reached End-of-Life on December 31, 2021. Support for this OS is now removed from the CUDA Toolkit and is replaced by Rocky Linux 8.
- Windows Server 2016 support has been deprecated and will be removed in a future release
- Kepler architecture support is removed from CUDA 12.0
- CUDA 11 applications that relied on Minor Version Compatibility are not guaranteed to work in CUDA 12.0 onwards. Developers will either need to statically link their applications, or recompile within the CUDA 12.0 environment to ensure continuity of development.
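
The compatibility note above comes down to comparing major versions. The sketch below assumes the integer encoding used by the CUDART_VERSION macro and cudaRuntimeGetVersion (1000 x major + 10 x minor, so 12.0 is 12000). The helper names are hypothetical and are not part of any CUDA API.

```cpp
// Hypothetical helpers; not part of the CUDA Toolkit.
// Assumes CUDA's integer version encoding: 1000 * major + 10 * minor.
constexpr int cuda_major(int version) { return version / 1000; }
constexpr int cuda_minor(int version) { return (version % 1000) / 10; }

// Minor Version Compatibility only spans releases that share a major version,
// so an 11.x build is not guaranteed to work in a 12.x environment.
constexpr bool same_major_release(int built_with, int running_on) {
    return cuda_major(built_with) == cuda_major(running_on);
}

// Compile-time checks of the encoding and of the rule itself.
static_assert(cuda_major(12000) == 12, "12000 encodes CUDA 12.0");
static_assert(cuda_minor(11080) == 8, "11080 encodes CUDA 11.8");
static_assert(same_major_release(11020, 11080), "11.2 and 11.8 share a major version");
static_assert(!same_major_release(11080, 12000), "11.8 and 12.0 do not");
```

In an application, built_with could be taken from CUDART_VERSION at compile time and running_on from the value reported by cudaRuntimeGetVersion. When the check fails, the options named above apply: link statically or recompile in the CUDA 12.0 environment.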

From 12.0, JIT LTO support is now part of CUDA Toolkit. JIT LTO support in the CUDA Driver through the cuLink driver APIs is officially deprecated. Driver JIT LTO will be available only for 11.x applications. The following enums supported by the cuLink Driver APIs for JIT LTO are deprecated:
- CU_JIT_INPUT_NVVM
- CU_JIT_LTO
- CU_JIT_FTZ
- CU_JIT_PREC_DIV
- CU_JIT_PREC_SQRT
- CU_JIT_FMA
- CU_JIT_REFERENCED_KERNEL_NAMES
- CU_JIT_REFERENCED_KERNEL_COUNT
- CU_JIT_REFERENCED_VARIABLE_NAMES
- CU_JIT_REFERENCED_VARIABLE_COUNT
- CU_JIT_OPTIMIZE_UNUSED_DEVICE_VARIABLES
- Existing 11.x CUDA applications using JIT LTO will continue to work on the 12.0/R525 and later drivers. The driver cuLink API support for JIT LTO has not been removed, but it only supports 11.x LTOIR. The cuLink driver API enums for JIT LTO may be removed in the future, so we recommend transitioning to CUDA Toolkit 12.0 for JIT LTO.
- 12.0 LTOIR will not be supported by the driver cuLink APIs. Applications built with 12.0 or later must use the nvJitLink shared library to benefit from JIT LTO.
- Refer to the CUDA 12.0 blog on JIT LTO for more details
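
When auditing a codebase for the deprecated cuLink JIT LTO enums, a small lookup can help flag them. The following is a minimal sketch, assuming a C++17 compiler. The array and function names are hypothetical, and the entries are simply the enum names listed above.

```cpp
#include <string_view>

// Hypothetical migration aid; not part of the CUDA Toolkit.
// Entries are the cuLink JIT LTO enums listed as deprecated in CUDA 12.0.
constexpr std::string_view kDeprecatedJitLtoEnums[] = {
    "CU_JIT_INPUT_NVVM",
    "CU_JIT_LTO",
    "CU_JIT_FTZ",
    "CU_JIT_PREC_DIV",
    "CU_JIT_PREC_SQRT",
    "CU_JIT_FMA",
    "CU_JIT_REFERENCED_KERNEL_NAMES",
    "CU_JIT_REFERENCED_KERNEL_COUNT",
    "CU_JIT_REFERENCED_VARIABLE_NAMES",
    "CU_JIT_REFERENCED_VARIABLE_COUNT",
    "CU_JIT_OPTIMIZE_UNUSED_DEVICE_VARIABLES",
};

// Returns true only when `name` exactly matches one of the deprecated enums.
constexpr bool is_deprecated_jit_lto_enum(std::string_view name) {
    for (std::string_view deprecated : kDeprecatedJitLtoEnums) {
        if (name == deprecated) return true;
    }
    return false;
}

static_assert(is_deprecated_jit_lto_enum("CU_JIT_LTO"), "listed as deprecated");
static_assert(!is_deprecated_jit_lto_enum("CU_JIT_MAX_REGISTERS"), "not in the list");
```

The comparison is an exact match on the whole identifier, so a longer name that merely starts with one of the entries is not flagged. Code that hits this check would be a candidate for moving to nvJitLink, as recommended above.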

CUDA Tools:
- CUDA-MEMCHECK is removed from CUDA 12.0, and has been replaced with Compute Sanitizer

CUDA Compiler:
- 32-bit compilation, both native and cross-compilation, is removed from CUDA 12.0 and later Toolkits. Use the CUDA Toolkit from earlier releases for 32-bit compilation. The CUDA Driver will continue to support running existing 32-bit applications on existing GPUs except Hopper. Hopper does not support 32-bit applications. Ada will be the last architecture with driver support for 32-bit applications.
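
Projects that still carry 32-bit build configurations can fail early with a clear message instead of a toolchain error. The guard below is a minimal sketch that assumes a 64-bit target has 8-byte pointers, which holds on common platforms. The constant name is hypothetical.

```cpp
// Hypothetical build guard; not part of the CUDA Toolkit.
// True when pointers are 8 bytes wide, which is taken here to mean a 64-bit target.
constexpr bool kIs64BitBuild = sizeof(void*) == 8;

// Stops the build with a readable message on a 32-bit target.
static_assert(kIs64BitBuild,
              "CUDA 12.0 and later toolkits no longer support 32-bit compilation");
```

Placing the assertion in a header shared by all CUDA-facing translation units keeps the check in one place.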