c++ - OpenCL - need a recommended structure -


I have 2 files, each with 10000 points; each point has 2 double values, x and y. I need to perform an operation on every pair of these points, so I have 100000000 operations (10000 x 10000).

First question: what structure do you recommend? I mean, which variables should I pass to the kernel?

I have written a script and executed it on 1000-point files (1000000 operations). I put the points in one array (1000000 x 4), where the 4 comes from x, y of the first file and x, y of the second file, and passed it to the kernel, which ran 1000000 parallel threads.

local_item_size = 125 global_item_size = 1000000 

Second question: do you think I can improve this structure, and if so, how?

Third question: the script I have written works correctly for the 1000-point files, but when I run it on the 10000-point files I get a clCreateBuffer error (CL_INVALID_BUFFER_SIZE, for the 100000000 x 4 double input array). I think (but I am not sure) the reason is the huge number of generated threads (100000000).
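The size arithmetic behind that error can be checked with a small sketch. It compares the expanded layout described above (one row of 4 doubles per pair of points) with uploading each file once as plain (x, y) pairs. The helper names `expanded_bytes` and `compact_bytes` are hypothetical, and the code assumes an 8-byte `double`.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: double is 8 bytes, as on common desktop platforms.
static_assert(sizeof(double) == 8, "sketch assumes 8-byte doubles");

// Bytes for the expanded layout: n * n pairs, 4 doubles per pair.
constexpr std::uint64_t expanded_bytes(std::uint64_t n) {
    return n * n * 4 * sizeof(double);
}

// Bytes for the compact layout: two files, n points each, 2 doubles per point.
constexpr std::uint64_t compact_bytes(std::uint64_t n) {
    return 2 * n * 2 * sizeof(double);
}
```

For 1000 points the expanded array needs 32 MB, for 10000 points it needs 3.2 GB, while the compact layout needs 320 KB. The OpenCL specification allows clCreateBuffer to return CL_INVALID_BUFFER_SIZE when the requested size exceeds the device's CL_DEVICE_MAX_MEM_ALLOC_SIZE (queryable with clGetDeviceInfo), so the buffer size is a more likely cause than the thread count; the actual limit on this machine is not stated in the post.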

Update: - Hardware: Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz, NVIDIA Corporation GM204 [GeForce GTX 980]. - I have a loop of 1000 operations (with 3 ifs) for each point; these operations are done in the kernel, and the result for each point is independent of the other points.

Update 2: to simplify the problem: I need to multiply 2 matrices, A and B. A has 10000 rows and 2 columns, and B has 2 rows and 10000 columns. What is the best structure for this?

Thanks in advance,

Regarding your update 2: the best way of handling matrices is storing them in row-major order. You need 2 matrices with 20000 elements each. In matrix A the elements are stored as 10000 elements per row, with 2 rows altogether. In matrix B this gives 10000 rows, each row with 2 elements.
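A minimal sketch of that idea, written as plain C++ that emulates the kernel on the CPU rather than as real OpenCL code. It assumes the shapes from update 2 of the question (A is n x 2, B is 2 x n), both stored as flat row-major arrays, with one work-item per output element. The names `rm`, `work_item` and `multiply` are hypothetical; in an actual kernel, `i` and `j` would come from `get_global_id(0)` and `get_global_id(1)` on a 2D NDRange.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major offset of element (row, col) in a matrix with `cols` columns.
inline std::size_t rm(std::size_t row, std::size_t col, std::size_t cols) {
    return row * cols + col;
}

// CPU stand-in for one work-item: computes C(i, j) = sum over k of A(i, k) * B(k, j).
// a is n x 2 and b is 2 x n, both flat row-major arrays of 2 * n doubles.
inline double work_item(const std::vector<double>& a, const std::vector<double>& b,
                        std::size_t n, std::size_t i, std::size_t j) {
    return a[rm(i, 0, 2)] * b[rm(0, j, n)] + a[rm(i, 1, 2)] * b[rm(1, j, n)];
}

// Host-side loops emulating the n x n grid of work-items.
std::vector<double> multiply(const std::vector<double>& a, const std::vector<double>& b,
                             std::size_t n) {
    assert(a.size() == 2 * n && b.size() == 2 * n);
    std::vector<double> c(n * n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            c[rm(i, j, n)] = work_item(a, b, n, i, j);
    return c;
}
```

With this layout only the two 20000-element inputs are uploaded, and each work-item reads the two values it needs from each of them instead of receiving a pre-expanded row.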

Take a look at my profile for my blog. There is a (German) tutorial on OpenCL-based matrix multiplication.

