Oddly enough, I went home that day, came back, and it ran just fine with no errors at all. That's a bit disconcerting, but oh well, I can't seem to reproduce it now.
#unrelated
On a side note, I finished the OpenCL program and I was blown away by the results! Previously we had an engineering calculation that took about 8 hours to run on the latest i7, and we had to run it 4-16 times for each data set. My first approach was to convert it to the Java Streams API and let it run concurrently, which cut the time per calculation to 2 hours. Way better, but still not good enough. Then I decided to use OpenCL, which is how I ended up on this forum, and the calculation now takes about 30 seconds to execute on a GTX 1080 Ti with roughly 3,500 cores. That's close to 1000x faster than the original 8 hours. Absolutely crazy! I was certainly not expecting that big of a jump. Thanks for the help, everyone!
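
For anyone wondering what the intermediate Java Streams step can look like, here is a minimal sketch. It is not the actual engineering code: `computeCell` is a hypothetical stand-in for one independent unit of work, and the problem size is made up. The only assumption that matters is that each unit can be computed without depending on the others, which is also what makes a calculation a good candidate for a GPU.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class ParallelCalcSketch {

    // Hypothetical placeholder for one independent piece of the calculation.
    // The real per-element work is not shown in the post.
    public static double computeCell(int i) {
        double acc = 0.0;
        for (int k = 1; k <= 1_000; k++) {
            acc += Math.sqrt((double) i * k);
        }
        return acc;
    }

    // Baseline: evaluate every cell on the calling thread.
    public static double[] runSequential(int n) {
        return IntStream.range(0, n)
                .mapToDouble(ParallelCalcSketch::computeCell)
                .toArray();
    }

    // Same pipeline, but parallel() lets the common fork/join pool
    // spread the cells across the available CPU cores.
    public static double[] runParallel(int n) {
        return IntStream.range(0, n)
                .parallel()
                .mapToDouble(ParallelCalcSketch::computeCell)
                .toArray();
    }

    public static void main(String[] args) {
        int n = 100_000; // made-up problem size

        long t0 = System.nanoTime();
        double[] seq = runSequential(n);
        long t1 = System.nanoTime();
        double[] par = runParallel(n);
        long t2 = System.nanoTime();

        System.out.printf("sequential: %d ms%n", (t1 - t0) / 1_000_000);
        System.out.printf("parallel:   %d ms%n", (t2 - t1) / 1_000_000);
        System.out.println("results match: " + Arrays.equals(seq, par));
    }
}
```

The speedup from `parallel()` is bounded by the number of CPU cores, which fits with the jump from 8 hours to 2 hours. A GPU runs the same kind of independent per-element work across thousands of cores at once, so a much larger gain is plausible when the work maps well to it.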