alias for the element type of a cuda storage
global accessor for the cuda module in grain
high-level axpy (y = alpha * x + y) wrapper for CuPtr
cudnn error checker
cublas error checker
cuda error checker
deep copy between device memory regions without allocation
duplicate cuda memory (deep copy)
true if length == 0
fill the first N elements with a value (TODO: use cudnnSetTensor)
fill all elements of the device array with a value
global accessor for the cuda module in grain
test sum
copy device memory to host (may reallocate the host memory)
copy device memory to host (CAUTION: no reallocation here)
allocate host memory and copy device memory content
fill all elements of the device array with zero
create a zero-filled array of N elements
sub-region on CuPtr!T
cuda module compiled from ptx string
fat pointer in CUDA
cuda function object called by mangled name of C++/D device function F
cuda kernel function launcher with the numbers of blocks/threads given at runtime
trait to identify cuda storage
CUDA wrapper module
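
The axpy and fill entries above describe simple element-wise operations. As a rough illustration of their semantics only, here is a minimal host-side sketch in C++. It does not use grain, CuPtr, cuBLAS or any device memory; the names axpy_reference and fill_first_n are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not part of grain):
// computes y = alpha * x + y element by element.
void axpy_reference(float alpha, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());  // both arrays must have the same length
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = alpha * x[i] + y[i];
    }
}

// Hypothetical host-side reference (not part of grain):
// writes `value` into the first n elements, leaving the rest untouched.
void fill_first_n(std::vector<float>& a, float value, std::size_t n) {
    assert(n <= a.size());  // cannot fill past the end of the array
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = value;
    }
}
```

With x = {1, 2, 3}, y = {10, 20, 30} and alpha = 2, axpy_reference leaves y as {12, 24, 36}. Filling "all the elements" or filling with zero is the same loop with n equal to the array length or value equal to 0.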