Tidy up CUDA implementation of IProductWRTBase and add CUDA kernels with additional parallelism (!73) · Merge requests · Nektar / redesign-prototype