Release of our R-per-block approach to histogram computation in CUDA.
The code has been used in "An optimized approach to histogram computation on GPU", published in Machine Vision and Applications, 2012.
DOI: 10.1007/s00138-012-0443-3
Download zip file (updated 2012-07-26)

Release of our performance model for CUDA streams.
The code has been used in "Performance models for asynchronous data transfers on consumer Graphics Processing Units", published in Journal of Parallel and Distributed Computing, 2012.
DOI: 10.1016/j.jpdc.2011.07.011
Download zip file (updated 2012-08-23)

Release of our implementation of the Generalized Hough Transform (GHT).
The code has been used in "Load Balancing Versus Occupancy Maximization On Graphics Processing Units: The Generalized Hough Transform as a Case Study", published in The International Journal of High Performance Computing Applications, 2011.
DOI: 10.1177/1094342010383998
Download zip file (updated 2014-03-01)