@kappa: Thanks for the 'repost'
@Spasi: Thanks for your reply.
It's not so much a bug, but I'm a huge advocate for fail-fast behaviour when something smells fishy.
Just look at how enhanced-for in Java throws a ConcurrentModificationException when you try to modify the collection you're iterating over. It doesn't mean the developer did something wrong, it means that the developer did something that will probably result in unexpected behaviour.
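To make that concrete, here's a minimal sketch of the fail-fast behaviour using a plain ArrayList. The class and method names (FailFastDemo, modifyWhileIterating) are made up for illustration. Note that the check is best-effort: the JDK docs don't guarantee the exception in every case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

public class FailFastDemo {

    // Returns true if the iterator noticed the modification and failed fast.
    static boolean modifyWhileIterating() {
        List<String> names = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String name : names) {
                if (name.equals("a")) {
                    // Structural modification while the enhanced-for iterator is live.
                    names.remove(name);
                }
            }
        } catch (ConcurrentModificationException e) {
            // The iterator refuses to continue instead of silently misbehaving.
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("failed fast: " + modifyWhileIterating());
    }
}
```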
"Garbage in, garbage out" (and the resulting silent failures) is IMHO among the worst ideas ever. In Java we should adhere to "Garbage in, raise exception" to prevent problems from escalating.
Say you're on a 64-bit system and the buffer's number of remaining bytes is a multiple of 4 but not of 8: you can be fairly sure the developer made a mistake, which is likely to cause a nasty access violation later on (we're dealing with passing pointers here). On both 32- and 64-bit systems, the developer must manually handle the width of the pointer, which usually ends in disaster with the 'works on my machine' mentality.
The same goes for any other remaining byte count that is not a multiple of 8... but 4 is especially worrying as it's almost right and hence very likely to be very wrong. I consider the clipping of remaining bytes to be a silent failure, because there was data there, and we decide to ignore it, without telling the developer.
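Roughly, the kind of check I have in mind looks like the sketch below. This is not the library's actual code: PointerBufferCheck and pointerCount are hypothetical names, and the pointer size is passed in as a parameter on the assumption that the native side reports it.

```java
import java.nio.ByteBuffer;

public class PointerBufferCheck {

    // Hypothetical helper: returns how many pointers fit in the buffer's
    // remaining bytes, and throws instead of silently clipping leftovers.
    static int pointerCount(ByteBuffer buffer, int pointerSize) {
        if (pointerSize != 4 && pointerSize != 8) {
            throw new IllegalArgumentException("Unsupported pointer size: " + pointerSize);
        }
        int remaining = buffer.remaining();
        if (remaining % pointerSize != 0) {
            // Garbage in, raise exception.
            throw new IllegalArgumentException(
                "Remaining bytes (" + remaining + ") is not a multiple of the pointer size ("
                    + pointerSize + ")");
        }
        return remaining / pointerSize;
    }

    public static void main(String[] args) {
        // 16 bytes hold exactly two 8-byte pointers.
        System.out.println(pointerCount(ByteBuffer.allocateDirect(16), 8));
        // 12 bytes would be fine on 32 bit, but throws for 8-byte pointers.
        System.out.println(pointerCount(ByteBuffer.allocateDirect(12), 8));
    }
}
```

With 12 remaining bytes this passes when the pointer size is 4 and throws when it is 8, which is exactly the 'works on my machine' case that would otherwise go unnoticed.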
I don't think 'suboptimal performance' is an innocent issue either, when you're dealing with OpenCL, which is all about cranking the most out of the available hardware. Unaligned memory access on non-x86/x64 CPUs is at least an order of magnitude slower, AFAIK.
Regarding the need for GetDirectBufferAddress, that's exactly what I meant.