Sharing a CUDA texture among multiple streams
I have a texture that I am using to read an image. The texture is defined as:
    texture<uchar4, 2, cudaReadModeNormalizedFloat> text;

I have a CUDA kernel that uses the texture to read the image pixel values:
    __global__ void resample_2d(float4* result, int width, int height, float* x, float* y)
    {
        // Pixel coordinates handled by this thread.
        const int _x = blockDim.x * blockIdx.x + threadIdx.x;
        const int _y = blockDim.y * blockIdx.y + threadIdx.y;
        if (_x < width && _y < height) {
            const int i = _y * width + _x;
            // Sample at the texel center; the normalized-float read mode returns a float4.
            result[i] = tex2D(text, x[i] + 0.5f, y[i] + 0.5f);
        }
    }

Now, I have 4 CUDA streams that all read from this texture, so they all access the same image that was bound to the texture. My question is: will this take a performance hit? In terms of performance, is it better to have 4 textures (one for each stream) rather than 1 texture used by all streams?
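Textures in CUDA work like cached memory. Having multiple streams on an SMX looking for memory in the same texture location should improve cache hits, so sharing one texture across the streams should not cost you performance, and there is no need for one texture per stream.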
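For illustration, here is a minimal sketch of four streams sampling one shared texture. It is not the asker's code: it assumes the texture object API (`cudaTextureObject_t`) instead of the legacy texture reference from the question, since texture references were removed in CUDA 12. The image size, the constant test image, the point filter mode, the helper name `run_shared_texture_demo`, and the `CHECK` macro are all assumptions made up for the example.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: bail out with -1 on any CUDA error (no cleanup, sketch only).
#define CHECK(call) do { cudaError_t e_ = (call); if (e_ != cudaSuccess) { \
    fprintf(stderr, "%s: %s\n", #call, cudaGetErrorString(e_)); return -1; } } while (0)

// Same sampling pattern as in the question, with the texture passed as an argument.
__global__ void resample_2d(cudaTextureObject_t tex, float4* result,
                            int width, int height, const float* x, const float* y)
{
    const int _x = blockDim.x * blockIdx.x + threadIdx.x;
    const int _y = blockDim.y * blockIdx.y + threadIdx.y;
    if (_x < width && _y < height) {
        const int i = _y * width + _x;
        result[i] = tex2D<float4>(tex, x[i] + 0.5f, y[i] + 0.5f);
    }
}

// Returns the number of mismatching samples, or -1 on a CUDA error.
int run_shared_texture_demo()
{
    const int width = 64, height = 64, n = width * height, kStreams = 4;

    // Assumed test image: every pixel is (255, 0, 0, 255), i.e. (1, 0, 0, 1) normalized.
    std::vector<uchar4> image(n, make_uchar4(255, 0, 0, 255));
    std::vector<float> hx(n), hy(n);
    for (int i = 0; i < n; ++i) { hx[i] = float(i % width); hy[i] = float(i / width); }

    // Upload the image once into a CUDA array.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<uchar4>();
    cudaArray_t array;
    CHECK(cudaMallocArray(&array, &desc, width, height));
    CHECK(cudaMemcpy2DToArray(array, 0, 0, image.data(), width * sizeof(uchar4),
                              width * sizeof(uchar4), height, cudaMemcpyHostToDevice));

    // One texture object, shared by every stream.
    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = array;
    cudaTextureDesc texDesc = {};
    texDesc.addressMode[0] = cudaAddressModeClamp;
    texDesc.addressMode[1] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModePoint;          // assumption; the question does not say
    texDesc.readMode = cudaReadModeNormalizedFloat;    // matches the question's declaration
    texDesc.normalizedCoords = 0;
    cudaTextureObject_t tex = 0;
    CHECK(cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr));

    float *dx = nullptr, *dy = nullptr;
    CHECK(cudaMalloc(&dx, n * sizeof(float)));
    CHECK(cudaMalloc(&dy, n * sizeof(float)));
    CHECK(cudaMemcpy(dx, hx.data(), n * sizeof(float), cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(dy, hy.data(), n * sizeof(float), cudaMemcpyHostToDevice));

    // Each stream gets its own output buffer but reads the same texture.
    cudaStream_t streams[kStreams];
    float4* results[kStreams];
    const dim3 block(16, 16);
    const dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    for (int s = 0; s < kStreams; ++s) {
        CHECK(cudaStreamCreate(&streams[s]));
        CHECK(cudaMalloc(&results[s], n * sizeof(float4)));
        resample_2d<<<grid, block, 0, streams[s]>>>(tex, results[s], width, height, dx, dy);
    }
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    // Every stream should have read the same constant color.
    int mismatches = 0;
    std::vector<float4> host(n);
    for (int s = 0; s < kStreams; ++s) {
        CHECK(cudaMemcpy(host.data(), results[s], n * sizeof(float4), cudaMemcpyDeviceToHost));
        for (int i = 0; i < n; ++i) {
            const float4 p = host[i];
            if (fabsf(p.x - 1.0f) > 1e-3f || fabsf(p.y) > 1e-3f ||
                fabsf(p.z) > 1e-3f || fabsf(p.w - 1.0f) > 1e-3f) ++mismatches;
        }
        CHECK(cudaFree(results[s]));
        CHECK(cudaStreamDestroy(streams[s]));
    }

    CHECK(cudaDestroyTextureObject(tex));
    CHECK(cudaFreeArray(array));
    CHECK(cudaFree(dx));
    CHECK(cudaFree(dy));
    return mismatches;
}

int main()
{
    const int mismatches = run_shared_texture_demo();
    printf("mismatching samples: %d\n", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

The image is uploaded and wrapped in a texture once, and only the output buffers are per stream. Whether this is measurably faster than four separate textures on a given GPU is something to confirm with a profiler rather than take from this sketch.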