YoGA features » History » Version 2
Julien Brule, 18/06/2012 15:43
| 1 | 1 | Julien Brule | h1. YoGA features |
|---|---|---|---|
| 2 | |||
| 3 | The sections below list the available features and their corresponding wrappers. All of these operations can be performed on a YoGA object. This page may not be up to date; please refer to the file yoga.i for a complete list of available features. For a full description of the syntax, use the help function in a Yorick session. |
||
| 4 | |||
| 5 | h2. General utilities |
||
| 6 | |||
| 7 | * extern _GetMaxGflopsDeviceId gets the ID of the best CUDA-capable device on your system |
||
| 8 | * extern _setDeviceId sets the active device to the specified ID |
||
| 9 | * func setDeviceId sets the active device to the specified ID and returns its name |
||
| 10 | * extern _listDevice returns the list of CUDA-capable devices on your system |
||
| 11 | * extern _nbDevice returns the number of CUDA-capable devices on your system |
||
| 12 | * extern _yogaThreadExit exits threads on the active device |
||
| 13 | * extern _yogaThreadSync synchronizes threads on the active device |
||
| 14 | * extern _yogaInit initializes a YoGA session on the specified device |
||
| 15 | * extern _yogaInitCublas initializes a CUBLAS session on the active device |
||
| 16 | * extern _yogaShutdownCublas shuts down the CUBLAS session on the active device |
||
| 17 | |||
| 18 | |||
| 19 | h2. Array manipulation |
||
| 20 | |||
| 21 | 2 | Julien Brule | * extern yoga_obj creates an array on the GPU |
| 22 | * extern yoga_host2device fills an object on the GPU with data from the Yorick session |
||
| 23 | * extern yoga_device2host transfers data from an object on the GPU to the Yorick session |
||
| 24 | * extern yoga_setv creates a new cublasVector from input data |
||
| 25 | * extern yoga_setm creates a new cublasMatrix from input data |
||
| 26 | * extern yoga_getarray gets a sub-array of the input object specified by a range |
||
| 27 | * extern yoga_fillarray fills a sub-array of the input object specified by a range |
||
| 28 | * extern yoga_getarray gets the value of an array at the specified position |
||
| 29 | * extern yoga_plus add a scalar to all the elements of an array |
||
| 30 | * extern yoga_plusai add a scalar (an element of a source array) to all the elements of an array |
||
| 31 | |||
| 32 | |||
| 33 | 1 | Julien Brule | h2. Matrix Operations |
| 34 | |||
| 35 | 2 | Julien Brule | YoGA provides support for most of the CUBLAS functions as well as an autotuned custom transpose method from the NVIDIA SDK. |
| 36 | |||
| 37 | 1 | Julien Brule | h3. BLAS functions |
| 38 | |||
| 39 | 2 | Julien Brule | * extern yoga_imax returns the smallest index of the maximum magnitude element of obj |
| 40 | * extern yoga_imin returns the smallest index of the minimum magnitude element of obj |
||
| 41 | * extern yoga_asum returns the sum of the absolute values of obj |
||
| 42 | * extern yoga_nrm2 returns the Euclidean norm of obj |
||
| 43 | * extern yoga_scale scales vectx by an amount specified by the second argument |
||
| 44 | * extern yoga_swap swaps the content of the 2 arguments |
||
| 45 | * extern yoga_axpy multiplies vectx by alpha and adds it to vecty |
||
| 46 | * extern yoga_dot computes the dot product of the 2 arguments |
||
| 47 | * extern yoga_mv multiplies matrix A by vectx (optionally scaled by alpha) and adds the result to vecty (optionally scaled by beta) |
||
| 48 | * extern yoga_rank1 general rank-1 operation: adds to matrix dest the product of transpose(vectx) and vecty |
||
| 49 | * extern yoga_mm multiplies matrix A by matrix B and stores it in matrix C |
||
| 50 | |||
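The arithmetic behind several of the BLAS wrappers listed above can be sketched on the CPU. The snippet below is a plain-Python reference, not YoGA code: the function names, argument order and 0-based indexing are assumptions made for illustration only. The actual YoGA call syntax and index convention should be checked with the help function in a Yorick session.

```python
import math

# CPU reference sketch of the BLAS-style operations described above.
# These helpers are hypothetical and are NOT the YoGA API.

def imax(x):
    """Smallest index (0-based here) of the element with the largest magnitude."""
    return max(range(len(x)), key=lambda i: abs(x[i]))

def imin(x):
    """Smallest index (0-based here) of the element with the smallest magnitude."""
    return min(range(len(x)), key=lambda i: abs(x[i]))

def asum(x):
    """Sum of the absolute values."""
    return sum(abs(v) for v in x)

def nrm2(x):
    """Euclidean norm."""
    return math.sqrt(sum(v * v for v in x))

def axpy(alpha, x, y):
    """Return alpha * x + y, element by element."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def dot(x, y):
    """Dot product of two vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))

def mv(A, x, y, alpha=1.0, beta=0.0):
    """Return alpha * (A @ x) + beta * y for a row-major list-of-lists A."""
    return [alpha * dot(row, x) + beta * yi for row, yi in zip(A, y)]

x = [3.0, -4.0, 0.0]
y = [1.0, 1.0, 1.0]
A = [[1.0, 0.0, 0.0],
     [0.0, 2.0, 0.0],
     [1.0, 1.0, 1.0]]

print(imax(x), imin(x))        # index of largest / smallest magnitude
print(asum(x), nrm2(x))        # 7.0 and 5.0
print(axpy(2.0, x, y))         # [7.0, -7.0, 1.0]
print(mv(A, x, y, 1.0, 1.0))   # [4.0, -7.0, 0.0]
```

On the GPU these operations run in place on device arrays. The sketch returns new lists only to keep the arithmetic easy to read.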
| 51 | |||
| 52 | |||
| 53 | 1 | Julien Brule | h3. Transpose |
| 54 | |||
| 55 | 2 | Julien Brule | * extern yoga_transpose transposes the matrix src and places the result in matrix dest |
| 56 | |||
| 57 | 1 | Julien Brule | h2. Random number generation |
| 58 | |||
| 59 | 2 | Julien Brule | YoGA provides support for the CURAND library. Calls to CURAND are made through custom kernels. Two types of noise distribution are provided: uniform and normal. |
| 60 | |||
| 61 | * extern yoga_random generates a uniform distribution of random numbers |
||
| 62 | * extern yoga_random_n generates a normal distribution of random numbers |
||
| 63 | |||
| 64 | |||
| 65 | 1 | Julien Brule | h2. Fast Fourier Transform |
| 66 | |||
| 67 | 2 | Julien Brule | YoGA provides support for the CUFFT library. |
| 68 | |||
| 69 | * extern yoga_fft computes the Fast Fourier Transform of input array |
||
| 70 | * extern yoga_fftconv_init initializes the FFT convolution workspace |
||
| 71 | * extern yoga_fftconv computes the FFT convolution of two arrays |
||
| 72 | |||
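As background for the FFT convolution wrappers above, the following pure-Python sketch shows the general technique: zero-pad both inputs, transform them, multiply the spectra element by element, and transform back. It assumes a linear (non-circular) convolution and a power-of-two padded length. Whether YoGA pads in the same way is not stated on this page, and the function names are hypothetical.

```python
import cmath

# CPU sketch of FFT-based convolution; NOT the YoGA implementation.

def fft(a, inverse=False):
    """Recursive radix-2 FFT; len(a) must be a power of two. The inverse is unscaled."""
    n = len(a)
    if n == 1:
        return list(a)
    sign = 1 if inverse else -1
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft_convolve(x, y):
    """Linear convolution of two real sequences via the convolution theorem."""
    m = len(x) + len(y) - 1          # length of the linear convolution
    n = 1
    while n < m:                     # pad to the next power of two
        n *= 2
    fx = fft([complex(v) for v in x] + [0j] * (n - len(x)))
    fy = fft([complex(v) for v in y] + [0j] * (n - len(y)))
    spectrum = [p * q for p, q in zip(fx, fy)]
    z = fft(spectrum, inverse=True)
    return [v.real / n for v in z[:m]]   # apply the 1/n scaling of the inverse

def direct_convolve(x, y):
    """Reference O(len(x) * len(y)) convolution used to check the FFT version."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out

print(fft_convolve([1.0, 2.0, 3.0], [0.0, 1.0, 0.5]))
```

A separate initialization step such as yoga_fftconv_init presumably lets the padded buffers and FFT plans be reused across calls. That reading is an inference from the function name, not something this page states.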
| 73 | |||
| 74 | |||
| 75 | 1 | Julien Brule | h2. Scan, sort, compact |
| 76 | 2 | Julien Brule | |
| 77 | YoGA provides support for the CUDPP library. Users can scan arrays (for the minimum, maximum, sum, etc.), sort arrays, or compact arrays. |
||
| 78 | |||
| 79 | * extern yoga_min returns the minimum value of an array |
||
| 80 | * extern yoga_max returns the maximum value of an array |
||
| 81 | * extern yoga_add returns the sum of the values of an array |
||
| 82 | * extern yoga_mult returns the product of the values of an array |
||
| 83 | * extern yoga_sort sorts the values of an array and optionally returns an array of sorted indexes |
||
| 84 | * extern yoga_compact compacts an array using a vector of valid elements |
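
The scan, sort and compact operations listed above can be illustrated with a short CPU sketch. It is a plain-Python reference for the semantics only. The helper names, the inclusive-scan convention, the 0-based indexes and the 0/1 validity flags are assumptions made for illustration, and none of this is the YoGA or CUDPP API.

```python
from itertools import accumulate
import operator

# CPU reference sketch of scan / sort / compact; NOT the YoGA API.

def scan(values, op=operator.add):
    """Inclusive scan: element i combines values[0..i] with op."""
    return list(accumulate(values, op))

def reduce_with(values, op):
    """A reduction is the last element of the inclusive scan."""
    return scan(values, op)[-1]

def sort_with_indexes(values):
    """Return the sorted values and the (0-based) indexes that sort them."""
    order = sorted(range(len(values)), key=values.__getitem__)
    return [values[i] for i in order], order

def compact(values, valid):
    """Keep only the elements whose validity flag is non-zero."""
    return [v for v, keep in zip(values, valid) if keep]

data = [4, 1, 3, 1]

print(scan(data))                       # running sum
print(reduce_with(data, min))           # minimum
print(reduce_with(data, max))           # maximum
print(reduce_with(data, operator.add))  # sum
print(reduce_with(data, operator.mul))  # product
print(sort_with_indexes(data))
print(compact(data, [1, 0, 1, 0]))
```

Expressing the minimum, maximum, sum and product as the final element of a scan mirrors how one scan primitive can serve all four reductions.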