Implementation and comparison of the k-means clustering algorithm in OpenCV, pure C, C + OpenMP, and C + CUDA. Both the OpenMP and the CUDA versions show a measurable speedup over the serial C implementation. Includes demo code and benchmark results.
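
To illustrate why k-means parallelizes well, here is a minimal sketch of the assignment step (labeling each point with its nearest centroid) in C with an OpenMP pragma. This is not the repository's actual code: the function names `sq_dist` and `assign_labels`, the row-major `float` layout, and the choice to return the number of changed labels are all assumptions made for this example.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical helper: squared Euclidean distance between two
 * dim-dimensional points. */
static float sq_dist(const float *a, const float *b, int dim)
{
    float sum = 0.0f;
    for (int d = 0; d < dim; ++d) {
        float diff = a[d] - b[d];
        sum += diff * diff;
    }
    return sum;
}

/* Hypothetical assignment step: give each of the n points the index of
 * its nearest centroid. points is n x dim and centroids is k x dim, both
 * stored row-major (an assumed layout). Returns how many labels changed,
 * which a caller could use as a convergence check. */
static int assign_labels(const float *points, const float *centroids,
                         int *labels, int n, int k, int dim)
{
    assert(n >= 0 && k > 0 && dim > 0);
    int changed = 0;

    /* Iteration i only writes labels[i], so the loop can be split across
     * threads; the reduction combines the per-thread change counts.
     * Without -fopenmp the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for reduction(+:changed)
    for (int i = 0; i < n; ++i) {
        int best = 0;
        float best_dist = FLT_MAX;
        for (int c = 0; c < k; ++c) {
            float dist = sq_dist(&points[(size_t)i * dim],
                                 &centroids[(size_t)c * dim], dim);
            if (dist < best_dist) {
                best_dist = dist;
                best = c;
            }
        }
        if (labels[i] != best) {
            labels[i] = best;
            ++changed;
        }
    }
    return changed;
}
```

Compiled with `-fopenmp` (GCC or Clang), the outer loop is distributed across threads. A CUDA version would typically map one point to one GPU thread instead, which is again an assumption about the general approach rather than a description of this repository's kernels.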