Julien Brule, 18/06/2012 15:43
h1. YoGA features

A list of features, with the corresponding wrappers, is given below. All of these operations can be performed on a YoGA object. This page may not be up to date: please refer to the file yoga.i for the complete list of available features. For a full description of the syntax, use the help function in a Yorick session.

h2. General utilities

* extern _GetMaxGflopsDeviceId returns the ID of the best (highest GFLOPS) CUDA-capable device on your system
* extern _setDeviceId sets the active device to the specified ID
* func setDeviceId sets the active device to the specified ID and returns its name
* extern _listDevice returns the list of CUDA-capable devices on your system
* extern _nbDevice returns the number of CUDA-capable devices on your system
* extern _yogaThreadExit exits threads on the active device
* extern _yogaThreadSync synchronizes threads on the active device
* extern _yogaInit initializes a YoGA session on the specified device
* extern _yogaInitCublas initializes a CUBLAS session on the active device
* extern _yogaShutdownCublas shuts down the CUBLAS session on the active device

h2. Array manipulation

* extern yoga_obj creates an array on the GPU
* extern yoga_host2device fills an object on the GPU with data from the Yorick session
* extern yoga_device2host transfers data from an object on the GPU to the Yorick session
* extern yoga_setv creates a new cublasVector from input data
* extern yoga_setm creates a new cublasMatrix from input data
* extern yoga_getarray gets a sub-array of the input object, specified by a range
* extern yoga_fillarray fills a sub-array of the input object, specified by a range
* extern yoga_getarray gets the value of an array at the specified position
* extern yoga_plus adds a scalar to all the elements of an array
* extern yoga_plusai adds a scalar (an element of a source array) to all the elements of an array

h2. Matrix Operations

YoGA provides support for most of the CUBLAS functions, as well as an autotuned custom transpose method from the NVIDIA SDK.

h3. BLAS functions

* extern yoga_imax returns the smallest index of the maximum magnitude element of obj
* extern yoga_imin returns the smallest index of the minimum magnitude element of obj
* extern yoga_asum returns the sum of the absolute values of obj
* extern yoga_nrm2 returns the Euclidean norm of obj
* extern yoga_scale scales vectx by an amount specified by the second argument
* extern yoga_swap swaps the contents of the 2 arguments
* extern yoga_axpy multiplies vectx by alpha and adds it to vecty
* extern yoga_dot computes the dot product of the 2 arguments
* extern yoga_mv multiplies matrix A by vectx, scaled by an optional alpha, and adds the result to vecty scaled by an optional beta
* extern yoga_rank1 general rank-1 update: adds the outer product of vectx and vecty to matrix dest
* extern yoga_mm multiplies matrix A by matrix B and stores the result in matrix C

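As a reading aid for the list above, the following plain-Python sketch spells out the arithmetic that the corresponding BLAS routines perform on real-valued data. It is not YoGA code and does not call the GPU: the function names, the list-of-rows matrix layout and the defaults alpha=1, beta=0 are assumptions made for illustration. Python indexes from 0, whereas CUBLAS reports the imax/imin index 1-based.

```python
import math

def imax(x):
    """Smallest index of the largest-magnitude element (0-based here)."""
    mags = [abs(v) for v in x]
    return mags.index(max(mags))  # list.index returns the first match

def imin(x):
    """Smallest index of the smallest-magnitude element (0-based here)."""
    mags = [abs(v) for v in x]
    return mags.index(min(mags))

def asum(x):
    """Sum of the absolute values."""
    return sum(abs(v) for v in x)

def nrm2(x):
    """Euclidean norm."""
    return math.sqrt(sum(v * v for v in x))

def dot(x, y):
    """Dot product of two vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))

def axpy(alpha, x, y):
    """alpha * x + y, element by element."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def mv(A, x, y, alpha=1.0, beta=0.0):
    """alpha * A x + beta * y, with A given as a list of rows (assumed layout)."""
    return [alpha * dot(row, x) + beta * yi for row, yi in zip(A, y)]

def rank1(A, x, y, alpha=1.0):
    """Rank-1 update: A + alpha * outer(x, y)."""
    return [[aij + alpha * xi * yj for aij, yj in zip(row, y)]
            for row, xi in zip(A, x)]

def mm(A, B):
    """Matrix product A B, both given as lists of rows."""
    cols = list(zip(*B))
    return [[dot(row, col) for col in cols] for row in A]
```

These helpers return new lists instead of updating an argument in place; the GPU wrappers listed above write into an existing object (vecty, dest or C), as their descriptions state.
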
h3. Transpose

* extern yoga_transpose transposes the matrix src and places the result in matrix dest

h2. Random number generation

YoGA provides support for the CURAND library. Calls to CURAND are made through custom kernels. Two noise distributions are provided: uniform and normal.

* extern yoga_random generates a uniform distribution of random numbers
* extern yoga_random_n generates a normal distribution of random numbers

h2. Fast Fourier Transform

YoGA provides support for the CUFFT library.

* extern yoga_fft computes the Fast Fourier Transform of the input array
* extern yoga_fftconv_init initializes the FFT convolution workspace
* extern yoga_fftconv computes the FFT convolution of two arrays

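FFT convolution relies on the convolution theorem: transform both inputs, multiply the spectra element by element, and transform back. The plain-Python sketch below illustrates that idea only. It is not YoGA code, the function names are hypothetical, and a naive O(n^2) DFT stands in for a real FFT. It assumes a linear (zero-padded) convolution; this page does not say whether yoga_fftconv pads its inputs or convolves circularly.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(x):
            acc += v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)  # 1/n scaling on the inverse
    return out

def fft_convolve(a, b):
    """Linear convolution of two real sequences via the convolution theorem."""
    n = len(a) + len(b) - 1                      # length of the full result
    fa = dft(list(a) + [0.0] * (n - len(a)))     # zero-pad, then transform
    fb = dft(list(b) + [0.0] * (n - len(b)))
    prod = [u * v for u, v in zip(fa, fb)]       # multiply the spectra
    return [c.real for c in dft(prod, inverse=True)]

def direct_convolve(a, b):
    """Reference result computed straight from the definition."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out
```

Both functions agree up to floating-point rounding. The padded length depends only on the input sizes, which is the kind of setup a separate initialization step can prepare once and reuse.
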
h2. Scan, sort, compact

YoGA provides support for the CUDPP library. Users can scan arrays (min, max, sum, etc.), sort arrays, or compact arrays.

* extern yoga_min returns the minimum value of an array
* extern yoga_max returns the maximum value of an array
* extern yoga_add returns the sum of the values of an array
* extern yoga_mult returns the product of the values of an array
* extern yoga_sort sorts the values of an array and optionally returns an array of the sorted indexes
* extern yoga_compact compacts an array using a vector of valid elements
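
For readers unfamiliar with the three primitives, the plain-Python sketch below shows what a scan, an index sort and a compaction compute. It is not YoGA code and the function names are hypothetical. It assumes that the "vector of valid elements" is a vector of flags with one entry per element; the page does not specify this.

```python
import operator
from itertools import accumulate

def scan(values, op=operator.add):
    """Inclusive scan: element i holds the reduction of values[0..i]."""
    return list(accumulate(values, op))

def reduce_all(values, op):
    """Full reduction, taken as the last element of the inclusive scan."""
    return scan(values, op)[-1]

def argsort(values):
    """Indexes that put values in ascending order (ties keep their order)."""
    return sorted(range(len(values)), key=values.__getitem__)

def compact(values, valid):
    """Keep only the elements whose flag in valid is true."""
    return [v for v, keep in zip(values, valid) if keep]

data = [3, 1, 4, 1, 5]
order = argsort(data)
sorted_data = [data[i] for i in order]
```

With these definitions, `reduce_all(data, min)`, `reduce_all(data, max)`, `reduce_all(data, operator.add)` and `reduce_all(data, operator.mul)` mirror the four reductions listed above.
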