[BUG] apitrace (Windows) stopped working with LWJGL 3

Started by CoDi, February 10, 2016, 00:36:46


CoDi

Hi,

I'm working on a libGDX/OpenGL 3.2 desktop project. With the LWJGL 2.9.x backend I've been using apitrace on OS X and Windows to trace down pesky GL issues.

I've switched to the libGDX LWJGL3 backend recently. While apitrace continues to work fine on OS X, the traces on Windows are now incomplete, missing most of the calls. I also tried a build of lwjgl3-demos, with the same result: broken/incomplete traces on Windows. A native GLFW application works just fine.

My guess is that LWJGL does *something* apitrace doesn't like, causing traces to fail. Any idea why it would do that on Windows only?

abcdef

Do you get the same issues when you try to apitrace the GLFW demos? (i.e. not LWJGL, but the actual demos from the GLFW repository)

CoDi

That's what I meant by "A native GLFW application works just fine." I didn't test one of the GLFW demos, but a few examples from another native application that uses GLFW (https://github.com/floooh/oryol). With them, apitrace still works as expected.

spasi

Tracing works fine for me (Windows 10, latest msvc apitrace build). No calls are missing afaict and replay produces the correct output. Let me know if you have an idea on how to reproduce your issue.

CoDi

Hi Spasi,

thanks for testing. That's odd indeed. I went back to square one, built LWJGL from sources, and tried to run one demo with apitrace:

apitrace-msvc\x64\bin\apitrace trace --output test.trace java -cp bin\Core;bin\Tests;libs;modules\core\src\test\resources org.lwjgl.demo.nanovg.ExampleGL3


As you said, this works perfectly. ???

Next I built lwjgl-demos again, with a slightly modified pom.xml to create a JAR with dependencies:

apitrace-msvc\x64\bin\apitrace trace --output test.trace java -cp target/lwjgl3-demos-0.0.1-SNAPSHOT-jar-with-dependencies.jar org.lwjgl.demo.opengl.fbo.MultisampledFboDemo


This runs, but creates broken traces. I also tried SpaceGame, which creates broken traces too, but also shows heavy delay during GLFW initialization.

CoDi

Update:

If I adjust the class path to grab LWJGL classes/libs from source, the lwjgl3-demos variant works, too:

apitrace-msvc\x64\bin\apitrace trace --output test.trace java -cp d:\projects\thirdparty\lwjgl3\bin\Core;d:\projects\thirdparty\lwjgl3\libs;target\lwjgl3-demos-0.0.1-SNAPSHOT.jar org.lwjgl.demo.opengl.fbo.MultisampledFboDemo


If I build with 'ant release' and adjust the classpath to use those folders, it's broken again:

apitrace-msvc\x64\bin\apitrace trace --output test.trace java -cp d:\projects\thirdparty\lwjgl3\bin\RELEASE\jar\lwjgl.jar;d:\projects\thirdparty\lwjgl3\bin\RELEASE\native;target\lwjgl3-demos-0.0.1-SNAPSHOT.jar org.lwjgl.demo.opengl.fbo.MultisampledFboDemo


Kai

UPX is to blame.

I get the exact same results on Windows 7 x64.
With non-UPX'ed lwjgl.dll it works perfectly (either DLLs in directory classpath or in JAR) and I am able to trace and replay every demo.
When I UPX it (like the ones in Maven) it ceases to work.

EDIT:
I think what apitrace does is read the PE/COFF file of a client program/DLL (which has the same layout in memory when loaded as on disk), try to identify the OpenGL calls, and patch the function entry points/addresses in the client to use its own wrapper. This might be because Windows does not have "next symbol" linking like Linux has.
With the packed layout of UPX, this might not work, or maybe UPX is unpacking _after_ apitrace reads the PE/COFF structure.
But this is just speculation. Have to see the sources to know for sure.

EDIT2: Yeah, like I thought, it does exactly that. Scan the PE/COFF structure of the DLL to patch certain calls: https://github.com/apitrace/apitrace/blob/master/inject/injectee.cpp
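
For anyone who wants to check whether a given lwjgl.dll has been packed, here is a minimal Python sketch that reads the PE/COFF section table and looks for UPX's default section names ("UPX0", "UPX1", ...). The function names `pe_section_names` and `looks_upx_packed` are made up for this example, and the check is only a heuristic: it assumes the packer kept its default section names.

```python
import struct


def pe_section_names(data: bytes) -> list:
    """Return the section names listed in the PE/COFF headers of an EXE/DLL image."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew (offset 0x3C in the DOS header) holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    coff = pe_offset + 4  # the 20-byte COFF file header follows the signature
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (optional_header_size,) = struct.unpack_from("<H", data, coff + 16)
    # The section table starts right after the optional header; entries are 40 bytes each.
    table = coff + 20 + optional_header_size
    names = []
    for i in range(num_sections):
        raw = data[table + i * 40: table + i * 40 + 8]  # name: 8 bytes, NUL-padded
        names.append(raw.rstrip(b"\0").decode("ascii", errors="replace"))
    return names


def looks_upx_packed(data: bytes) -> bool:
    """Heuristic: True if any section name starts with "UPX" (UPX's default naming)."""
    return any(name.startswith("UPX") for name in pe_section_names(data))
```

Usage would be something like `looks_upx_packed(open("lwjgl.dll", "rb").read())`, run once against the locally built DLL and once against the one from the Maven artifact.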

CoDi

Nice, thanks for pointing that out, Kai. I didn't realise UPX is used.

So, do I get it right that "ant release" doesn't use the locally compiled DLLs by default? I didn't find UPX anywhere in the repository.

May I ask if there's a practical reason for doing that? I'm asking not only because AV scanners, which are widespread among customers, tend to false-flag compressed binaries now and then. I'd personally stay away from using executable packers in libraries.

Kai

Have a look at the native platform ant build xml files (for Windows this is: https://github.com/LWJGL/lwjgl3/blob/master/config/windows/build.xml#L187).
UPX is not used by default, only when you specify the build/system property "org.lwjgl.upx", and of course have UPX in PATH.

I think the only reason for using UPX is to minimize the download size of the native libraries by about 50 percent.
I do agree that either UPX should not be used at all, or LWJGL 3 could provide both a "Debug/Profile/Analyze" build of the library as well as a "Release" build, where the former would not use UPX but be a bit bigger in file size, and the latter would use UPX.

I am certainly in favor of not using UPX as well. That's why with released applications/code I always use a custom-built lwjgl library.
But of course you can always argue, when a certain anti-virus scanner flags the upx'ed library, that this particular scanner is bad. And you can always try to argue, if a certain introspection application stops working with a upx'ed version, that this certain application is badly implemented.
But this won't stop those things from not working. :)

spasi

I wouldn't mind getting rid of UPX if it keeps causing issues. In this particular case though, I tried UPXing the lwjgl dll and saw no difference; apitrace continues to work. I also tried downloading the nightly build, with the same result. Are you sure you didn't change anything else? Keep in mind that with the way LWJGL works, there are no "patchable" entry points or function addresses. apitrace would never be able to find something like "glEnable" in there; the function names come from Java code (passed to GetProcAddress). If apitrace patches anything, I would assume that would be the OpenGL driver dll.

Kai

This is weird.

What I did concretely was to package the lwjgl3-demos demos in a JAR, explicitly without including any LWJGL/SWT or other dependencies, except for JOML.
Then I used the VS2013 x64 command line environment to compile and link a new fresh lwjgl.dll (64-bit) using simple "ant" inside LWJGL's cloned repository.
This correctly produced a 64-bit lwjgl.dll (checked with Dependency Walker).
Then I built myself an invocation of "java.exe" with "-cp" to include the previously built lwjgl3-demos jar and also the libs/ folder and bin/Core folder of the lwjgl repository clone.
This works perfectly with apitrace and I could replay the trace.

Next thing I did was to go into the libs/ folder of lwjgl3, put UPX on PATH and invoked it with "upx -9 lwjgl.dll".
After that I tried apitrace with the same "java.exe" call as above, and it produced a completely broken trace, which could not be replayed.
I also have the same issue with the Maven-downloaded lwjgl-platform jar and its included lwjgl.dll.

spasi


Kai

Almost all of them. But now I am using the simplest possible, which is `org.lwjgl.demo.opengl.shader.ImmediateModeDemo`.
I can dropbox you all of the files, if you want.
Also, with the upx'ed version of lwjgl.dll I get the following apitrace output:
d:\downloads\apitrace\build\MinSizeRel>apitrace trace --output my.trace java -jar lwjgl3-demos.jar
apitrace: loaded into c:\Programme\Java\jdk1.8.0_74\bin\java.exe
apitrace: warning: caught exception 0xc0000005
apitrace: tracing to my.trace
apitrace: unloaded from c:\Programme\Java\jdk1.8.0_74\bin\java.exe


Whereas with the non-upx'ed version (which works and can be replayed) I get the following output:
d:\downloads\apitrace\build\MinSizeRel>apitrace trace --output my2.trace java -jar lwjgl3-demos.jar
apitrace: loaded into c:\Programme\Java\jdk1.8.0_74\bin\java.exe
apitrace: warning: caught exception 0xc0000005
apitrace: tracing to my2.trace
apitrace: warning: unknown function "glMultiDrawArraysIndirectBindlessCountNV"
apitrace: warning: unknown function "glMultiDrawElementsIndirectBindlessCountNV"
apitrace: warning: unknown function "glCreateStatesNV"
apitrace: warning: unknown function "glDeleteStatesNV"
apitrace: warning: unknown function "glIsStateNV"
apitrace: warning: unknown function "glStateCaptureNV"
apitrace: warning: unknown function "glGetCommandHeaderNV"
apitrace: warning: unknown function "glGetStageIndexNV"
apitrace: warning: unknown function "glDrawCommandsNV"
apitrace: warning: unknown function "glDrawCommandsAddressNV"
apitrace: warning: unknown function "glDrawCommandsStatesNV"
apitrace: warning: unknown function "glDrawCommandsStatesAddressNV"
apitrace: warning: unknown function "glCreateCommandListsNV"
apitrace: warning: unknown function "glDeleteCommandListsNV"
apitrace: warning: unknown function "glIsCommandListNV"
apitrace: warning: unknown function "glListDrawCommandsStatesClientNV"
apitrace: warning: unknown function "glCommandListSegmentsNV"
apitrace: warning: unknown function "glCompileCommandListNV"
apitrace: warning: unknown function "glCallCommandListNV"
apitrace: warning: unknown function "glGetInternalformatSampleivNV"
apitrace: unloaded from c:\Programme\Java\jdk1.8.0_74\bin\java.exe


In this example, the lwjgl3-demos jar contains everything, including the DLL files, which I just swapped/exchanged between the two calls.
That access violation exception is a bit suspicious, but the non-upx'ed version still works. :)

Quote (spasi): Keep in mind that with the way LWJGL works, there are no "patchable" entry points or function addresses.
I know that LWJGL does not statically link against all possible OpenGL and extension functions. But there is GetProcAddress(). It is likely that this "injectee" code of apitrace fails to identify and intercept/delegate the call to GetProcAddress. That would also explain why apitrace reports some unknown NVIDIA functions in the working non-upx'ed version: LWJGL tries to look them up, and apitrace then does not intercept/trace them but just passes them through to the driver.
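
To make that hypothesis concrete, here is a toy Python model of a tracer that hooks a GetProcAddress-style lookup. It is only a sketch of the idea, not apitrace's actual code: the class `ToyTracer`, its fields, and the dictionary standing in for the GL driver are all invented for illustration. Known names get a recording wrapper; unknown names produce a warning and the real function is handed back untouched.

```python
class ToyTracer:
    """Toy model of a tracer that intercepts lookups of functions by name."""

    def __init__(self, driver, known):
        self.driver = driver      # name -> real function (stands in for the GL driver)
        self.known = set(known)   # names the tracer knows how to record
        self.calls = []           # recorded (name, args) tuples, i.e. the "trace"
        self.warnings = []        # messages for functions passed through untraced

    def get_proc_address(self, name):
        """Hooked lookup: wrap known functions, pass unknown ones straight through."""
        real = self.driver.get(name)
        if real is None:
            return None  # the driver does not export this function at all
        if name not in self.known:
            self.warnings.append(f'unknown function "{name}"')
            return real  # the caller talks to the driver directly; nothing is recorded

        def traced(*args):
            self.calls.append((name, args))  # record the call, then forward it
            return real(*args)

        return traced
```

If the lookup itself is never hooked, the application only ever receives the driver's own functions, so none of its calls end up in the trace. That would match the nearly empty traces seen with the upx'ed DLL, assuming the hypothesis holds.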

spasi

Kai: Does the x86 build behave similarly?

CoDi: What Windows version are you on?

CoDi

Windows 10 64 bit. I can pretty much confirm everything Kai wrote, though I didn't try as many steps.

@Kai: Some access violations seem rather normal for Java applications, sadly. I've also seen them before, when I tried to debug some issue with libGDX' packr.