On SPARC with cbcond we can fail with:
# Internal Error (output.cpp:1576), pid=3601, tid=15
# guarantee((int)(blk_starts[i+1] - blk_starts[i]) >= (current_offset - blk_offset)) failed: shouldn't increase block size
The customer hit the bug with the 32-bit version in a virtual machine running on a T-series system. That machine was not identified as Niagara, so we used the default loop alignment (16 bytes instead of 4). I cannot reproduce the crash on 64-bit, but the problem exists there as well.
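To make the alignment difference concrete, here is a minimal sketch of the arithmetic involved. It is not the HotSpot code: padding_for and block_fits are hypothetical helpers, and only the names blk_starts, blk_offset and current_offset are taken from the guarantee message above. It assumes the fixed 4-byte SPARC instruction size and illustrates why 16-byte alignment can require padding while 4-byte alignment never does; it does not claim to reproduce the exact failure path.

```cpp
#include <cstdint>

// Hypothetical helper: bytes of padding needed to round `offset` up to a
// multiple of `alignment` (alignment is assumed to be a power of two).
constexpr uint32_t padding_for(uint32_t offset, uint32_t alignment) {
  return (alignment - (offset & (alignment - 1))) & (alignment - 1);
}

// Hypothetical helper with the same shape as the failing guarantee: the bytes
// actually emitted for a block (current_offset - blk_offset) must not exceed
// the space reserved for it (blk_starts[i+1] - blk_starts[i]).
constexpr bool block_fits(uint32_t blk_start, uint32_t next_blk_start,
                          uint32_t blk_offset, uint32_t current_offset) {
  return static_cast<int>(next_blk_start - blk_start) >=
         static_cast<int>(current_offset - blk_offset);
}

// SPARC instructions are 4 bytes, so every instruction offset is already
// 4-byte aligned and a 4-byte loop alignment never inserts padding.
static_assert(padding_for(20, 4) == 0, "no padding with 4-byte alignment");

// With 16-byte alignment, a loop header at offset 20 needs 12 bytes of
// padding (three 4-byte nops) to reach offset 32.
static_assert(padding_for(20, 16) == 12, "up to 12 bytes with 16-byte alignment");

// A block that was reserved 16 bytes but ends up emitting 20 bytes is the
// kind of growth the guarantee rejects.
static_assert(!block_fits(32, 48, 32, 52), "block grew past its reserved size");
static_assert(block_fits(32, 48, 32, 48), "block exactly fills its reserved size");
```

The static_assert lines are checked at compile time, so compiling the file (for example with -std=c++11 -c) is enough to verify the arithmetic.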
It is very hard to write a unit test for the fix (at least I wasn't able to), since the code must specifically trigger block rotation in a loop so that the loop header starts with a compare.