Oh hey, just found this by searching for the tag "2k", so this is a pretty late comment.
Did *not* expect this from a 2k. Nice work.
munch.py brings this down to 1724 bytes. If you reduce the volume of the noise (and I guess the square, too), then the IT214/215 packing should work well. Another trick is to make the noise, say, 4 bytes long, munch it (using a random pattern rather than that "MUNCH*PY|" loop), and then change the start pointer + length in a hex editor. "the long and the short of .it" uses this trick, although it's pre-munch.py so I skipped that step and just manhandled the file in a hex editor (I even overlapped the sample headers by 16 bytes - you don't have to worry about this though).
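If you'd rather script the hex-editor step, here's a rough sketch. It assumes the standard IT sample header layout ("IMPS" magic, length at +0x30, sample pointer at +0x48, both little-endian 32-bit). The function name and arguments are made up for illustration and this isn't part of munch.py. Length is counted in samples, so for 8-bit data that's the same as bytes.

```python
import struct

def repoint_sample(data, header_offset, new_pointer, new_length):
    """Return a copy of the module with one sample header re-pointed.

    Hypothetical helper: header_offset is where the "IMPS" header starts,
    new_pointer is a file offset to use as sample data, and new_length is
    the sample length (in samples, not bytes).
    """
    buf = bytearray(data)
    if buf[header_offset:header_offset + 4] != b"IMPS":
        raise ValueError("no IMPS sample header at that offset")
    # Length field sits 0x30 bytes into the header.
    struct.pack_into("<I", buf, header_offset + 0x30, new_length)
    # Sample data pointer sits 0x48 bytes into the header.
    struct.pack_into("<I", buf, header_offset + 0x48, new_pointer)
    return bytes(buf)
```

Point the sample at whatever stretch of the file sounds noisy enough, and check the result in a player, since loop points are left alone here.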
I'm pretty sure you can squeeze a few extra bytes out of this by packing the patterns a bit better - if you know how IT's pattern packing works, this really helps.
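For reference, here's roughly how the row packing goes, as I understand it from ITTECH.TXT: each cell starts with a channel byte, bit 7 says a new mask byte follows, mask bits 1/2/4/8 mean note/instrument/volume/effect data follows, and bits 16/32/64/128 mean "reuse the last value on this channel" at no data cost. The sketch below is a toy packer under those assumptions (pack_rows and its input layout are invented for illustration, and it leaves out the 8-byte pattern header).

```python
def pack_rows(rows):
    """Pack rows into IT-style pattern data (sketch, no pattern header).

    rows: list of dicts {channel: (note, ins, vol, cmd, param)},
    with None for an empty field.
    """
    out = bytearray()
    last_mask = {}  # previous mask byte per channel
    last_val = {}   # previous [note, ins, vol, (cmd, param)] per channel
    for row in rows:
        for ch in sorted(row):
            note, ins, vol, cmd, param = row[ch]
            fx = None if cmd is None else (cmd, param)
            prev = last_val.setdefault(ch, [None] * 4)
            mask = 0
            data = bytearray()
            for bit, value in enumerate([note, ins, vol, fx]):
                if value is None:
                    continue
                if value == prev[bit]:
                    mask |= 16 << bit  # "use last value": costs no data bytes
                else:
                    mask |= 1 << bit   # new value: data bytes follow
                    prev[bit] = value
                    data += bytes(value) if bit == 3 else bytes([value])
            if mask == 0:
                continue  # nothing in this cell
            if last_mask.get(ch) == mask:
                out.append(ch + 1)            # same mask as before: skip it
            else:
                out.append((ch + 1) | 0x80)   # bit 7: mask byte follows
                out.append(mask)
                last_mask[ch] = mask
            out += data
        out.append(0)  # end of row
    return bytes(out)

# Three identical rows (note 60, instrument 1) on channel 0.
rows = [{0: (60, 1, None, None, None)}] * 3
packed = pack_rows(rows)
print(packed.hex(" "), len(packed))
```

The thing to notice is that repeating the same note/instrument/effect on a channel, and keeping the same combination of columns filled from row to row, is what makes rows cheap.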
Also, Axx isn't the only way to slow stuff down: the SEx/S6x effects can help too, especially if you take advantage of the pattern packing.
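To put numbers on that, here's a quick sketch of how long a row lasts under each effect, assuming the usual tracker timing (one tick is 2.5/tempo seconds, Axx sets ticks per row, SEx repeats the row x extra times, S6x adds x ticks). The function names are just for illustration, and I've left out what happens when SEx and S6x land on the same row.

```python
def ticks_plain(speed):
    """Ticks in a normal row at speed Axx."""
    return speed

def ticks_with_SEx(speed, x):
    """Ticks in a row carrying SEx (pattern delay for x rows)."""
    return speed * (1 + x)

def ticks_with_S6x(speed, x):
    """Ticks in a row carrying S6x (pattern delay for x ticks)."""
    return speed + x

def seconds(ticks, tempo):
    """Convert ticks to seconds at the given tempo (Txx)."""
    return ticks * 2.5 / tempo

# At speed 6, tempo 125: a plain row, then SE3, then S62.
for t in (ticks_plain(6), ticks_with_SEx(6, 3), ticks_with_S6x(6, 2)):
    print(t, seconds(t, 125))
```

So a single SE3 stretches one row to the length of four without touching the global speed.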
Finally, I managed to reduce THAT entry to 2425 bytes by utilising munch.py properly. Which is still larger than 2k, but oh well. Maybe one day I'll get it below that barrier. One day.