Weird behaviors when using the JRE

Started by nbilyk, April 01, 2016, 15:40:02

nbilyk

I have yet to fully isolate my problem, but I was hoping somebody could point me in the right direction.

When run with the JRE, my LWJGL application renders incorrectly: it's as if only the first draw call of a frame works. Only part of the scene is rendered and the rest comes out as garbage.

It runs perfectly when using the JDK (1.7), but with any JRE (I tried 1.6 and 1.7, Oracle and OpenJDK) it feels as if my buffers aren't fully making it to the graphics card unless I use glFlush().

I'm creating direct buffers the way LWJGL recommends (BufferUtils.createFloatBuffer(capacity)).

My draw looks something like this:

// Remember how many indices were written; flip() resets the position to 0
val indicesCount = indices.position()
vertexComponents.flip()
GL15.glBufferData(GL15.GL_ARRAY_BUFFER, vertexComponents, GL15.GL_STATIC_DRAW)

indices.flip()
GL15.glBufferData(GL15.GL_ELEMENT_ARRAY_BUFFER, indices, GL15.GL_STATIC_DRAW)

// The last argument is the byte offset into the bound element array buffer
GL11.glDrawElements(GL11.GL_TRIANGLES, indicesCount, GL11.GL_UNSIGNED_SHORT, 0)


If I add glFlush() after this code, things work (albeit slowly).
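
The position()/flip() bookkeeping in the draw code above can be checked without a GL context. Here is a minimal sketch using only java.nio. The class FlipSketch and the helper directShorts are made up for illustration, and the helper is only assumed to behave like BufferUtils.createShortBuffer(capacity), i.e. a direct buffer in native byte order. It also assumes the glBufferData overload that takes a buffer uploads the elements between position and limit.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;

// Hypothetical helper class, not part of LWJGL.
class FlipSketch {
    // Allocates a direct short buffer in native byte order (assumed to be
    // roughly what BufferUtils.createShortBuffer(capacity) does).
    static ShortBuffer directShorts(int capacity) {
        return ByteBuffer.allocateDirect(capacity * 2) // 2 bytes per short
                .order(ByteOrder.nativeOrder())
                .asShortBuffer();
    }

    public static void main(String[] args) {
        ShortBuffer indices = directShorts(16);
        indices.put((short) 0).put((short) 1).put((short) 2);

        // Number of shorts written so far; must be read before flip().
        int indicesCount = indices.position();
        // flip() sets limit = old position and position = 0.
        indices.flip();

        System.out.println(indicesCount);        // prints 3
        System.out.println(indices.remaining()); // prints 3
    }
}
```

If remaining() after the flip does not match the count passed to glDrawElements, the upload and the draw call disagree about how many indices exist.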

Other things to mention:
Windows 7, happens both with an Intel integrated gfx and nvidia gpu
I'm using Kotlin and haven't yet checked whether the bug is reproducible with plain Java. I don't think Kotlin does anything fancy as far as threading or buffers go.
The problem exists with LWJGL 3 nightly, stable, and an older version of 3 from last year. Again, only on the JRE, not the JDK.



spasi

Could you share a minimum example that can be used to reproduce the issue?

nbilyk

Yeah, I'm working on isolating the issue. It's been tricky; my test beds don't reproduce the problem so far. I was wondering if anybody knew of something I could look for that would specifically differ between the JDK and the JRE.

Kai

Technically, there is not a single bit of difference between the launcher java/javaw in the jdk/bin folder and the launcher java/javaw in the jdk/jre/bin folder or the jre/bin folder. They are exactly identical except for the digital signature timestamp in the exe metadata.
They also load the same jvm.dll/libjvm.so, because the JDK does not ship its own runtime; it uses the JRE from ../jre.
So it must be something else. Different classpaths for different run configurations?

nbilyk

Quote from: Kai
Technically, there is not a single bit of difference between the launcher java/javaw in the jdk/bin folder

Yeah, I'm just as perplexed as you...
I can have exactly the same classpath and jvm arguments and the jdk exe works, where the jre exe doesn't...

Works:
C:\Projects\MatrixPrecise\HTML5\MpApp\out-win>"C:\Program Files\Java\jdk1.7.0_40\bin\java.exe" -ea -cp AcornUtils_jvm.jar;C:/Projects/kotlin/AcornUi/dist/AcornUiCore_jvm.jar;Common_jvm.jar;Ddc_jvm.jar;ThemeBuilder_jvm.jar;C:/Projects/MatrixPrecise/HTML5/MpApp/Shell/core/dist/MpAppCore_jvm.jar;C:/Projects/kotlin/AcornUi/dist/AcornUiLwjglBackend_jvm.jar;lwjgl.jar;C:/Projects/MatrixPrecise/HTML5/MpApp/Shell/jvm/dist/MpAppJvm_jvm.jar;kotlin-runtime.jar mpapp.jvm.MpAppJvmKt
[INFO] LWJGL Path: C:\Projects\MatrixPrecise\HTML5\MpApp\out-win\assets\native

Doesn't work:
C:\Projects\MatrixPrecise\HTML5\MpApp\out-win>"C:\Program Files\Java\jdk1.7.0_40\jre\bin\java.exe" -ea -cp AcornUtils_jvm.jar;C:/Projects/kotlin/AcornUi/dist/AcornUiCore_jvm.jar;Common_jvm.jar;Ddc_jvm.jar;ThemeBuilder_jvm.jar;C:/Projects/MatrixPrecise/HTML5/MpApp/Shell/core/dist/MpAppCore_jvm.jar;C:/Projects/kotlin/AcornUi/dist/AcornUiLwjglBackend_jvm.jar;lwjgl.jar;C:/Projects/MatrixPrecise/HTML5/MpApp/Shell/jvm/dist/MpAppJvm_jvm.jar;kotlin-runtime.jar mpapp.jvm.MpAppJvmKt
[INFO] LWJGL Path: C:\Projects\MatrixPrecise\HTML5\MpApp\out-win\assets\native

If the JDK exe literally uses the JRE, I wonder if maybe I have some weird settings in my graphics card options. I'll pore over those some more.

What makes this even weirder is that GLIntercept shows screenshots as if everything is working just fine...
(And yes, I considered that GLIntercept itself might be the problem, but I disabled it, so it doesn't seem like that's the culprit.)

I've tried making a testbed but can't seem to replicate the issue in an isolated environment.


Kai

If you want to check which DLLs, resources, and files get loaded by a process, to ensure that both processes are identical at the process level, I recommend Sysinternals' (now Microsoft's) Process Explorer:
https://technet.microsoft.com/en-us/sysinternals/processexplorer

This will show you under the DLLs Lower Panel that both processes use the DLLs of jre/bin and the Handles Tab will show you that both processes load the JAR files and other resources from jre/lib.

Maybe you will find some differences there.

nbilyk

I did find differences.

The jdk exe loaded these DLLs that the jre exe did not:

crypt32
d3dx10_40
igd10umd64 vs ig7icd64
msasn1
ntmarta
nvoglv64
winsta
wintrust
wldap32
wtsapi32

The ones that look interesting to me are d3dx10_40 and nvoglv64 (nvidia).  My nvidia drivers are up-to-date.

Kai

Can you verify that both programs run using the Nvidia GPU?
Maybe via glGetString GL_VENDOR.
Because currently it seems that the jdk/bin/java.exe uses the Nvidia GPU while the jre/bin/java.exe uses the Intel integrated GPU. And that makes Optimus seem to be the problem. You could try and disable it in the BIOS.
Maybe you configured the jdk/bin/java.exe in the Nvidia control panel to use the Nvidia GPU at some point in the past?

nbilyk

We're getting somewhere :)

Vendor: Intel -- Doesn't work
Vendor: NVIDIA Corporation -- Does work

I had tried switching before - it was one of the first things I thought to check, but apparently there are some weird rules for which one gets used, and the profile for java.exe doesn't seem to apply.
Now that I'm logging the vendor, I can tell which one I'm using.

So now that we know it's related to the Intel card, what now? :)

nbilyk

I should mention that I know for sure using the Intel card works with other LWJGL games. So either LWJGL 3 has something broken, or I've got a mistake somewhere in my code. I'll try fiddling with my shaders.

I can't seem to replicate the issue with the testbed using the Intel gfx card. I wonder if there is a limit for something I'm surpassing...

Kai

My experience with Nvidia drivers is that developing and testing a program on Nvidia and then trying to execute it on AMD or Intel is almost guaranteed to not work. Nvidia's drivers are far too lenient in what they allow, Intel's drivers are buggy as hell and AMD's drivers adhere very strictly to the spec and allow nothing else, even if they could.
One example: Using linear filtering on depth textures works on Nvidia, but does not work on Intel (no matter the feature level).

Another possible cause could be that you are using a feature level (OpenGL version, shader model) that is supported by your Nvidia card but not by your Intel card. Maybe you did not specify an explicit core profile in context setup, so you get the highest possible GL version for each card, which, for the Intel card, may not be what you programmed against.

So, always check for GL errors, either with GL11.glGetError() or even better with a debug context and a debug message callback. And always check for shader compilation warnings/errors and shader program linking warnings/errors.
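
To make glGetError() results readable in a log, the numeric codes can be mapped to their names. This is only a sketch: the class GlErrorNames and its name method are made up for illustration and are not part of LWJGL. The hex values are the standard OpenGL error codes, hard-coded so the snippet runs without a GL context.

```java
// Hypothetical helper class, not part of LWJGL.
class GlErrorNames {
    // Maps a value returned by glGetError() to its symbolic name.
    static String name(int code) {
        switch (code) {
            case 0x0000: return "GL_NO_ERROR";
            case 0x0500: return "GL_INVALID_ENUM";
            case 0x0501: return "GL_INVALID_VALUE";
            case 0x0502: return "GL_INVALID_OPERATION";
            case 0x0503: return "GL_STACK_OVERFLOW";
            case 0x0504: return "GL_STACK_UNDERFLOW";
            case 0x0505: return "GL_OUT_OF_MEMORY";
            case 0x0506: return "GL_INVALID_FRAMEBUFFER_OPERATION";
            default:     return "UNKNOWN(0x" + Integer.toHexString(code) + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(name(0x0502)); // prints GL_INVALID_OPERATION
        System.out.println(name(0));      // prints GL_NO_ERROR
    }
}
```

In an application you would pass it the result of GL11.glGetError() after suspicious calls, repeating until GL_NO_ERROR comes back, since several errors can be queued.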

nbilyk

Hurray, I finally have a test bed reproducing the issue!

The problem only occurs when using a shader, so I'm assuming I'm doing something really dumb with the shader.

https://gist.github.com/nbilyk/32a2598ccf262c7a4e663e6aae0224a6

Kai

Quote from: Kai on April 02, 2016, 11:04:54
So, always check for GL errors, either with GL11.glGetError() or even better with a debug context and a debug message callback. And always check for shader compilation warnings/errors and shader program linking warnings/errors.

// Add this to GLFW window initialization
glfwWindowHint(GLFW_OPENGL_DEBUG_CONTEXT, GLFW_TRUE);
...
// And do this after glfwMakeContextCurrent(window)
Closure debugProc = GLUtil.setupDebugMessageCallback();

Then watch stderr for the error messages you are likely getting.

nbilyk

Nice, I get a real error message now!

[LWJGL] OpenGL debug message
	ID: 0x20071
	Source: API
	Type: OTHER
	Severity: NOTIFICATION
	Message: Buffer detailed info: Buffer object 2 (bound to GL_ELEMENT_ARRAY_BUFFER_ARB, usage hint is GL_STREAM_COPY) will use VIDEO memory as the source for buffer object operations.
indicesL 0

Kai

That's not an error, it is a notification. So, your error must be somewhere else not related to GL errors.