I can't give specific advice without seeing a real application and what it does. That said, depth testing can work nicely, even with 2D rendering. Sort opaque or alpha-tested sprites front-to-back and draw them with both depth writes and the depth test enabled. For blended sprites, disable depth writes but keep the depth test enabled. You'll save bandwidth wherever sprites overlap, because fragments hidden behind something already drawn fail the depth test and are never written.
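To make the ordering concrete, here is a rough CPU-side sketch of that scheme, not real GPU code. It builds a submission order and counts how many pixels would pass a LESS depth test in a toy depth buffer. The `Sprite` type, the function names, and the "smaller depth is closer" convention are all made up for illustration. Drawing the blended sprites back-to-front is also my assumption (it is the usual convention for correct blending), not something the scheme above requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sprite:
    # Hypothetical sprite record: an axis-aligned rectangle in pixels.
    name: str
    x: int
    y: int
    w: int
    h: int
    depth: float           # assumed convention: smaller = closer to the viewer
    blended: bool = False  # True for alpha-blended sprites


def build_draw_list(sprites):
    """Return (sprite, depth_write) pairs in submission order.

    The depth test is assumed to be enabled for every draw.
    """
    # Opaque / alpha-tested sprites: nearest first, depth writes on.
    opaque = sorted((s for s in sprites if not s.blended), key=lambda s: s.depth)
    # Blended sprites: depth writes off. Farthest-first order is an assumption.
    blended = sorted((s for s in sprites if s.blended),
                     key=lambda s: s.depth, reverse=True)
    return [(s, True) for s in opaque] + [(s, False) for s in blended]


def count_color_writes(draw_list, width, height):
    """Rasterize the rectangles into a toy depth buffer.

    Returns the number of pixels that pass the depth test, i.e. the
    number of color writes a GPU would have to perform.
    """
    depth_buffer = [[float("inf")] * width for _ in range(height)]
    writes = 0
    for sprite, depth_write in draw_list:
        # Clip the sprite rectangle to the framebuffer.
        for py in range(max(sprite.y, 0), min(sprite.y + sprite.h, height)):
            for px in range(max(sprite.x, 0), min(sprite.x + sprite.w, width)):
                if sprite.depth < depth_buffer[py][px]:  # depth test: LESS
                    writes += 1
                    if depth_write:
                        depth_buffer[py][px] = sprite.depth
    return writes
```

Passing the same opaque sprites to `count_color_writes` in back-to-front order (each paired with `True`) yields a higher count whenever they overlap, which is the overdraw the front-to-back sort avoids.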
But I really think you shouldn't make any assumptions about performance based on this test. A real 2D application with normal sprites and moderate overdraw (i.e. smaller sprites, distributed evenly across the screen, with enough overlap to make it interesting) would not have this performance profile. The bottleneck would be elsewhere, usually in render command submission.