SIGSEGV When Calling STBTruetype.stbtt_MakeCodepointBitmapSubpixel

Started by aaron.santos, May 28, 2016, 04:58:44

Previous topic - Next topic

aaron.santos

Hi, I'm using LWJGL 3 build 87 and calling STBTruetype.stbtt_MakeCodepointBitmapSubpixel to build a character atlas. I know stbtt_BakeFontBitmap exists, but I have positioning requirements it doesn't cover, and in any case I should be able to use stbtt_MakeCodepointBitmapSubpixel to draw glyphs to my atlas.

I try to draw a few dozen glyphs and encounter three problems.

First, after drawing 20 or so glyphs, stbtt_MakeCodepointBitmapSubpixel fails to write anything. I ran a test where I draw each glyph to a tiny image and use STBImageWrite.stbi_write_png to save the intermediate results. After a certain point, all subsequent glyphs render as black images.

Second, I notice that stbtt_GetGlyphBitmapBox returns a strange value for y0, starting with the glyphs that fail to draw. This results in incorrect atlas placement. I'm not sure whether this is related, but I felt I should include it.

Third, I randomly get SIGSEGV errors when calling stbtt_MakeCodepointBitmapSubpixel during this process:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007effcc06a6dd, pid=13748, tid=139638580479744
#
# JRE version: Java(TM) SE Runtime Environment (8.0_25-b17) (build 1.8.0_25-b17)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.25-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libjemalloc.so+0xb6dd]


The stack looks like this:
Stack: [0x00007f0023ec8000,0x00007f0023fc9000],  sp=0x00007f0023fc5270,  free space=1012k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libjemalloc.so+0xb6dd]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  org.lwjgl.stb.STBTruetype.nstbtt_MakeCodepointBitmapSubpixel(JJIIIFFFFI)V+0
j  org.lwjgl.stb.STBTruetype.stbtt_MakeCodepointBitmapSubpixel(Lorg/lwjgl/stb/STBTTFontinfo;Ljava/nio/ByteBuffer;IIIFFFFI)V+35
j  zaffre.font$char_image.invoke(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;+358


If I use the JVM arg "-Dorg.lwjgl.system.allocator=system", then I get this error (again randomly) instead:
*** Error in `java': double free or corruption (out): 0x00007f29659cb580 ***


Is there any more information that I can provide that would help in finding what I could be doing wrong? Any caveats on calling these methods?

spasi

I would be able to debug this with a simple code sample that reproduces the issue. If that's not possible, please share more information on how your code is set up.

Going through the code, it does look like calling stbtt_MakeCodepointBitmapSubpixel with an unsupported codepoint could lead to freeing a NULL pointer. Could you please verify that stbtt_FindGlyphIndex returns a valid glyph index for all codepoints you try to render?

aaron.santos

I'll work on getting a minimal code sample. My project is in Clojure, so I'll have to port to Java for easy reading.

Right now, the code loops through 0x0000 to 0xFFFF calling stbtt_FindGlyphIndex and checking whether the result is > 0, in order to establish which codepoints are present in the font before rendering them. I realized this morning that I'm essentially throwing away the glyph index. It would be more efficient to keep it around and use the stbtt Glyph functions instead of the Codepoint ones. I made the change to use glyph indices everywhere, but the issues remain.
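For reference, the cached-index approach can be sketched without any LWJGL dependency. Here findGlyphIndex is a stand-in for stbtt_FindGlyphIndex, which returns 0 for codepoints the font does not cover; everything else is hypothetical scaffolding:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

// Sketch: walk the BMP once, keep (codepoint, glyphIndex) pairs for covered
// codepoints, and reuse the cached index with the stbtt Glyph functions later.
// `findGlyphIndex` stands in for stbtt_FindGlyphIndex (0 = not in the font).
final class GlyphIndexCache {
    static List<int[]> coveredGlyphs(IntUnaryOperator findGlyphIndex) {
        List<int[]> covered = new ArrayList<>();
        for (int cp = 0x0000; cp <= 0xFFFF; cp++) {
            int glyph = findGlyphIndex.applyAsInt(cp);
            if (glyph > 0) {
                covered.add(new int[] { cp, glyph }); // keep the index, don't re-look it up
            }
        }
        return covered;
    }
}
```

Skipping any codepoint whose glyph index is 0 also avoids the freeing-a-NULL-pointer path spasi mentioned above.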

aaron.santos

What I've done is logged the calls to STBTruetype in my code and used that log to generate Java code that makes the same calls with the same arguments. I wish I had more to add, but the test succeeded, so there's some difference between the test and the original that I've failed to capture.

For completeness, here is the test code: https://gist.github.com/aaron-santos/66288491afaa2727265639b2d4f70403.

spasi

You could share the original Clojure code if you want.

Btw, that way of allocating buffers using BufferUtils in a loop is very expensive. You should reuse buffers or have a look at the MemoryStack class.

aaron.santos

Thanks for the BufferUtils tip. You're right, it is pretty slow.

The code is admittedly a work in progress. If you want to take a look at it before I make those changes, you can see it here: https://github.com/aaron-santos/zaffre/blob/master/src/zaffre/font.cljc#L363. The intent is that the glyph-graphics method renders all of the glyphs in the font into a texture that can later be passed on to the GL side of things. It assumes a fixed-width font (which is fine for my use case) and that there won't be a million glyphs (also fine for my use case). There's also a LOT of unnecessary copying going on because I'm still in the process of working out glyph positioning and want to examine the intermediate results.

The steps to run the example I'm testing with look like this:
git clone https://github.com/aaron-santos/zaffre.git
lein run -m examples.ttf


The project uses Leiningen as the build tool (available at http://leiningen.org/).

aaron.santos

Just to give a little bit of an update on the progress here, I'm starting to believe that the root cause lies somewhere in code unrelated to rendering glyphs. I took the time to create a test app by stripping everything from my project except the STBTruetype-related calls. And...it works. I'm at a loss as to how unrelated code could impact this but stranger things have happened.

This is my test code, based on my original project: https://gist.github.com/aaron-santos/f4266f3e825a2f680fe02bd60d16e209

spasi

I encountered a similar issue while working on the Nuklear bindings and might have a solution for you:

Keep a strong reference to the DirectBuffer that you create in make-font and pass to stbtt_InitFont.

I have had the same issue with NanoVG (see the javadoc on nvgCreateFontMem), but I thought it was an internal issue of NanoVG. It turns out that it's an stb_truetype issue (NanoVG uses it for its font rendering). The stbtt_fontinfo struct is very simple and does not copy any data from the font buffer, it only has pointers to it. So when the buffer is garbage collected, the pointers point to invalid memory. The reason your normal program crashes and your stripped test does not is likely because the former triggers GCs, but the latter is too simple to need one.
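A minimal sketch of that rule: since stbtt_fontinfo only stores pointers into the font buffer, the buffer must stay strongly reachable for as long as the fontinfo is used. One way is to bundle the two in a single object; this class is hypothetical, and `Object info` is just a placeholder for LWJGL's STBTTFontinfo:

```java
import java.nio.ByteBuffer;

// Sketch: tie the font data's lifetime to the font handle. stb_truetype keeps
// raw pointers into `data`, so it must not become garbage-collectable while
// `info` is still in use. `Object info` stands in for STBTTFontinfo here.
final class LoadedFont {
    final ByteBuffer data; // strong reference keeps the buffer alive
    final Object info;

    LoadedFont(ByteBuffer data, Object info) {
        this.data = data;
        this.info = info;
    }
}
```

As long as the LoadedFont instance is reachable, so is the buffer, and the native pointers inside the fontinfo remain valid.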

aaron.santos

Ahhh spasi, you've got it! I separated out the font data buffer creation from the creation of the fontinfo like so:

(defn font-data
  [x]
  (if-let [font-path (as-font-path x)]
    ;; Load font from file
    (let [buffer        (-> font-path
                          jio/input-stream
                          IOUtils/toByteArray
                          ByteBuffer/wrap)
          direct-buffer (BufferUtils/createByteBuffer (.limit buffer))]
      (doto direct-buffer
        (.put buffer)
        (.flip)))
    (throw (RuntimeException. (str "Font " x " does not exist")))))

(defn make-font
  [font-data]
  (let [info (STBTTFontinfo/calloc)]
    (if (zero? (STBTruetype/stbtt_InitFont info font-data))
      (throw (RuntimeException. "Error loading font"))
      info)))


And then I ensured that they have the same lifetimes, like this:

    (let [font-data                  (font-data name-or-path)
          ^STBTTFontinfo font-info   (make-font font-data)


And my crashes went away.

Thank you for all your help and your patience. This was a real head-scratcher for me until you brought up the fact that the fontinfo just keeps pointers to the buffer; then it clicked.