Coding CUDA with C#?

2023-03-15 13:17 Q&A author:

I've been looking for information on coding CUDA (the NVIDIA GPU language) with C#. I have seen a few of the libraries, but it seems that they would add a bit of overhead (because of the P/Invoke calls, etc.).

How should I go about using CUDA in my C# applications? Would it be better to code it in, say, C++ and compile that into a DLL?
Would the overhead of using a wrapper kill any advantages I would get from using CUDA?
And are there any good examples of using CUDA with C#?

ManagedCuda is a nice, complete wrapper for CUDA 4.2. Add a C++ CUDA project to the solution that contains your C# project, then add

call "%VS100COMNTOOLS%vsvars32.bat"
for /f %%a IN ('dir /b "$(ProjectDir)Kernels\*.cu"') do nvcc -ptx -arch sm_21 -m 64 -o "$(ProjectDir)bin\Debug\%%~na_64.ptx" "$(ProjectDir)Kernels\%%~na.cu"
for /f %%a IN ('dir /b "$(ProjectDir)Kernels\*.cu"') do nvcc -ptx -arch sm_21 -m 32 -o "$(ProjectDir)bin\Debug\%%~na.ptx" "$(ProjectDir)Kernels\%%~na.cu"

to the post-build events in your C# project properties. This compiles each kernel into a *.ptx file and writes it to your C# project's output directory.

Then you simply create a new context, load the module from the file, load the function, and work with the device.

//Create a new context
CudaContext cntxt = new CudaContext();

//Module loading from precompiled .ptx in a project output folder
CUmodule cumodule = cntxt.LoadModule("kernel.ptx");

//_Z9addKernelPf - function name, can be found in *.ptx file
CudaKernel addWithCuda = new CudaKernel("_Z9addKernelPf", cumodule, cntxt);

//Create device array for data (vec2_device and vec3_device, used below, are created the same way)
CudaDeviceVariable<cData2> vec1_device = new CudaDeviceVariable<cData2>(num);

//Create arrays with data
cData2[] vec1 = new cData2[num];

//Copy data to device
vec1_device.CopyToDevice(vec1);

//Set grid and block dimensions                       
addWithCuda.GridDimensions = new dim3(8, 1, 1);
addWithCuda.BlockDimensions = new dim3(512, 1, 1);

//Run the kernel
addWithCuda.Run(
    vec1_device.DevicePointer, 
    vec2_device.DevicePointer, 
    vec3_device.DevicePointer);

//Copy data from device
vec1_device.CopyToHost(vec1);
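The grid and block sizes above are hard-coded (8 blocks of 512 threads, 4096 threads in total). A minimal sketch of deriving the block count from the element count follows, assuming the kernel handles one element per thread. That is an assumption, since the kernel source is not shown, and the helper name GridSize is hypothetical:

```csharp
using System;

static class LaunchConfig
{
    // Hypothetical helper: smallest number of blocks such that
    // blocks * threadsPerBlock >= elementCount (ceiling division).
    public static int GridSize(int elementCount, int threadsPerBlock)
    {
        if (elementCount < 0) throw new ArgumentOutOfRangeException(nameof(elementCount));
        if (threadsPerBlock <= 0) throw new ArgumentOutOfRangeException(nameof(threadsPerBlock));
        // Widen to long so the addition cannot overflow for large counts.
        return (int)(((long)elementCount + threadsPerBlock - 1) / threadsPerBlock);
    }
}

static class Program
{
    static void Main()
    {
        // 4096 elements with 512 threads per block gives the 8 blocks used above.
        Console.WriteLine(LaunchConfig.GridSize(4096, 512));
        // A count that is not a multiple of the block size rounds up.
        Console.WriteLine(LaunchConfig.GridSize(5000, 512));
    }
}
```

When the element count is not a multiple of the block size, the last block has idle threads, so the kernel itself would need to check its index against the element count before touching memory.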

This has been discussed on the NVIDIA forums in the past:

http://forums.nvidia.com/index.php?showtopic=97729

It would be easy to use P/Invoke to call it from managed assemblies, like so:

  [DllImport("nvcuda")]
  public static extern CUResult cuMemAlloc(ref CUdeviceptr dptr, uint bytesize);
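To show what calling the driver through such declarations might look like, here is a small sketch that asks the CUDA driver how many devices are present. It uses two driver API entry points, cuInit and cuDeviceGetCount, and maps CUresult to a plain int (0 is CUDA_SUCCESS). The class and method names are made up for illustration, and the library name "nvcuda" assumes Windows:

```csharp
using System;
using System.Runtime.InteropServices;

static class CudaDriver
{
    // Assumption: Windows, where the driver library is nvcuda.dll.
    // On Linux the driver library is libcuda.so, so this name would differ.
    private const string Library = "nvcuda";

    // CUresult is marshalled as an int; 0 means CUDA_SUCCESS.
    [DllImport(Library)]
    private static extern int cuInit(uint flags);

    [DllImport(Library)]
    private static extern int cuDeviceGetCount(out int count);

    // Returns the number of CUDA devices, or -1 if the driver library
    // cannot be loaded or reports an error.
    public static int TryGetDeviceCount()
    {
        try
        {
            if (cuInit(0) != 0) return -1;
            int count;
            return cuDeviceGetCount(out count) == 0 ? count : -1;
        }
        catch (DllNotFoundException) { return -1; }
        catch (EntryPointNotFoundException) { return -1; }
    }
}

static class Program
{
    static void Main()
    {
        int devices = CudaDriver.TryGetDeviceCount();
        Console.WriteLine(devices < 0
            ? "CUDA driver not available"
            : "CUDA devices found: " + devices);
    }
}
```

Writing every declaration by hand this way quickly becomes tedious, which is the main argument for using an existing wrapper instead.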

Hybridizer, described in a blog post on NVIDIA's site, is also worth mentioning. It appears to have a related GitHub repository.

Update 1

Altimesh Hybridizer is an advanced productivity tool that generates vectorized C++ source code (AVX) and CUDA C source code from .NET assemblies (MSIL) or Java archives (java bytecode).

In managed development environments, developers can use virtual functions and generics yet still make efficient use of the GPU's compute capabilities, reaching roughly 80% of the peak performance of processors and memory. From a single version of the source code, developers can debug and execute on both CPU and CUDA GPU within their preferred development environment, stepping into the original source code (.NET or Java). Applications can be profiled with state-of-the-art tools such as VTune and Nsight, which reference locations in the original source code.

Key features

Source code generation from Java or .NET binaries (Java bytecode / MSIL bytecode)
Full debugging/profiling integration with NVIDIA Nsight Visual Studio Edition
Support for virtual functions and generics, mapped onto C++ templates for optimal performance
Single version of the input targeting GPU and CPU with near-optimal performance (automatic vectorization)
Operating-system-agnostic code generation: e.g. develop in .NET on Windows, debug GPU code in Nsight Visual Studio Edition, deploy on Linux in a Java system
Non-intrusive environment: Hybridizer is attribute/annotation based, so the solution will still run without it, though probably more slowly

Hybridizer comes in two versions:

Hybridizer Software Suite: enables CUDA, AVX, AVX2, AVX512 targets and outputs source code. This source code can be reviewed, which is mandatory in some businesses such as investment banks. Hybridizer Software Suite is licensed per customer upon request.
Hybridizer Essentials: enables only the CUDA target and outputs only binaries. Hybridizer Essentials is a free Visual Studio extension with no hardware restrictions. You can find a set of basic code samples and educational material on GitHub. These samples also serve as a way to reproduce our performance results.

There are several ways to use CUDA in your C# applications.

Write a C++/CUDA library in a separate project, and use P/Invoke. The overhead of P/Invoke calls over native calls will likely be negligible.
Use a CUDA wrapper such as ManagedCuda (which exposes the entire CUDA API). You won't have to write the DllImports by hand for the entire CUDA runtime API, which is convenient. Unfortunately, you will still have to write your own CUDA code in a separate project.
(Recommended) Use a free, open-source, or proprietary compiler that generates CUDA (either source or binary) from your C# code.

You can find several of them online; have a look at this answer, for example.
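For the first option, the C# side might look like the sketch below. Everything about the native library is hypothetical: it assumes a DLL named VectorAddNative, built from your C++/CUDA project, that exports extern "C" int vector_add(const float* a, const float* b, float* c, int n) and returns 0 on success.

```csharp
using System;
using System.Runtime.InteropServices;

static class NativeVectorAdd
{
    // Hypothetical export from a hypothetical native DLL.
    // float[] is an array of a blittable type, so the marshaller pins it
    // for the duration of the call rather than copying it.
    [DllImport("VectorAddNative", CallingConvention = CallingConvention.Cdecl)]
    private static extern int vector_add(float[] a, float[] b, [Out] float[] c, int n);

    // Validates the inputs in managed code, then makes a single native call
    // that covers the whole array.
    public static float[] Add(float[] a, float[] b)
    {
        if (a == null) throw new ArgumentNullException(nameof(a));
        if (b == null) throw new ArgumentNullException(nameof(b));
        if (a.Length != b.Length)
            throw new ArgumentException("Arrays must have the same length.");

        var c = new float[a.Length];
        int status = vector_add(a, b, c, a.Length);
        if (status != 0)
            throw new InvalidOperationException("vector_add failed with status " + status);
        return c;
    }
}

static class Program
{
    static void Main()
    {
        try
        {
            float[] sum = NativeVectorAdd.Add(new float[] { 1f, 2f }, new float[] { 3f, 4f });
            Console.WriteLine(string.Join(", ", sum));
        }
        catch (DllNotFoundException)
        {
            // Expected unless the hypothetical native library has been built.
            Console.WriteLine("VectorAddNative was not found.");
        }
    }
}
```

Because one P/Invoke call hands the whole array to the native side, the per-call marshalling cost is paid once per batch rather than once per element, which is why the overhead tends to be small next to the device transfers and the kernel run.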
