Hi, I didn't try 256x192x256.
What I did try is several experiments with different cube sizes (NxNxN), different CUDA versions and different NiftyRec versions. Here are my results:
NiftyRec 166, CUDA 4.2 64-bit, MATLAB 64-bit: N = 64, 128, 192 OK; N = 256, 448 constant output (same as in the previous picture); N = 512 error due to memory allocation (it seems the memory requirement was higher than the 4 GB on my card)
NiftyRec 168, CUDA 4 64-bit, MATLAB 64-bit: N = 64 artifacts in the image; N = 128 OK; N = 192 artifacts in the image; N = 256 MATLAB crash
NiftyRec 168, CUDA 5 64-bit, MATLAB 64-bit: N = 64, 128 OK; N = 192 artifacts in the image; N = 256 MATLAB crash
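For the memory error at N = 512, a rough estimate of what one cube costs may help. This is only a minimal sketch: it assumes single-precision voxels (4 bytes each) and treats the card as 4 GiB. How many volume-sized buffers NiftyRec allocates internally is not known here, and the helper names cube_bytes and max_buffers are made up for illustration.

```python
# Rough GPU memory estimate for an NxNxN volume.
# Assumption: voxels are stored as single-precision floats (4 bytes each).
BYTES_PER_VOXEL = 4
# Assumption: the "4GB" card is treated as 4 GiB of usable memory.
CARD_BYTES = 4 * 1024**3


def cube_bytes(n, bytes_per_voxel=BYTES_PER_VOXEL):
    """Bytes needed to hold one n x n x n volume."""
    return n ** 3 * bytes_per_voxel


def max_buffers(n, card_bytes=CARD_BYTES):
    """How many whole n x n x n volumes fit on the card."""
    return card_bytes // cube_bytes(n)


for n in (64, 128, 192, 256, 512):
    mib = cube_bytes(n) / 2**20
    print(f"N={n:3d}: {mib:6.1f} MiB per volume, "
          f"{max_buffers(n)} volumes fit on the card")
```

Under these assumptions one 512 cube takes 512 MiB, so only 8 such volumes fit on the card. A reconstruction that keeps several volume-sized buffers at once could plausibly run out of memory at that size, while N = 256 (64 MiB per volume) should fit comfortably, so the crashes at 256 probably have another cause.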
Do you still want the 256x192x256 experiment? My code does not work with non-cubic volumes. Do you have a working example?
I hope this will be useful for you.
