Compressing some js for the js1k comp

Lately there have been both a 1k and a 10k JavaScript competition. The 10k one is way too fancy to appeal to me, but I thought I might have a stab at the 1k.

In doing so, I made a few discoveries about methods to get your js code really small.

I saw this tetris entry which achieves ALL of tetris in 1024 bytes of JS. Goddamn! At first glance, the source is bizarre gobbledygook, but once you study it, it’s actually doing “real” compression: putting recurring bits of text into an array and substituting them into the main code (by searching for and replacing chars which were not otherwise used in the code). This was obviously done algorithmically, but it occurred to me you could do it manually with slightly meaningful unicode chars, and end up with dense, quick-to-write yet readable code. You wouldn’t have to compress it afterwards either; it would be ready to go. (I did, in fact, write mine with linefeeds in, but I didn’t have to do much to it.)

e.g. you could replace the text “function(a,b,c)” with “ƒ”, and then just have myFunc=ƒ{alert('blah!')}. There are enough wacky symbols out there in unicode to cover a lot of normal long words – I used ® for “return”, ß for “button”, £ for “EventListener” etc. A quick sample:

d=ƒ{if(e.ß!=undefined){stkTo(X,Y);nuPth(X,Y)}else{up}};\
↓=ƒ{nuPth(X,Y);D.add£('ⓜmove',d,0)};\
up=ƒ{D.remove£('ⓜmove',d,0)};\
t.lineCap='round';P();\
c.add£('ⓜ↓',↓,0);\
D.add£('ⓜup',up,0);\
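
A minimal sketch of how such hand-written shorthand could be expanded back into plain JavaScript before running it. The token table is an assumption for illustration: ƒ, ®, ß and £ follow the meanings given above, while “mouse” for ⓜ is a guess, and `expand` is a hypothetical helper, not code from the entry.

```javascript
// Hypothetical token table. The expansions for ƒ, ®, ß and £ follow the text;
// the one for ⓜ is an assumption.
const tokens = {
  'ƒ': 'function(e,x,y)',
  '®': 'return',
  'ß': 'button',
  '£': 'EventListener',
  'ⓜ': 'mouse',
};

// Replace every occurrence of each token with its expansion.
function expand(source, table) {
  let out = source;
  for (const [token, text] of Object.entries(table)) {
    out = out.split(token).join(text);
  }
  return out;
}

const compact = "up=ƒ{D.remove£('ⓜmove',d,0)}";
const expanded = expand(compact, tokens);
// expanded: up=function(e,x,y){D.removeEventListener('mousemove',d,0)}
```

Using split/join rather than a regular expression avoids having to escape any symbol that happens to be a regex metacharacter.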

I managed to fit the first version of my sketchy app into <1k this way. Check out the demo source – it's at least partly readable. Then, I managed to trade off a little more readability and get a version with undo.

But THEN, @getify submitted an entry in the comp which actually did @aivopaas’ compression for you. As I still had a feature or two I’d really like to include, I decided to end my readability experiment (but write it up somewhere – i.e. here) and go for maximum smallness, and here are my notes on achieving that.

The idea behind @getify’s compressor is that you first minify your code using some other tool (the compressor itself doesn’t strip whitespace, shorten tokens or any of that jazz) and then run the result through it.

My code was already at a slight advantage because it contains repetitions that add no functionality but do lend themselves to getting substituted out.

Starting with my “decompressed” code (i.e. code in the form I never actually typed it) I first tried Google’s closure compiler (just the web service, I didn’t build it myself or anything), but that had three problems.

First, it converted all my quotes to doubles (see footnote), which @getify’s code just passed thru, and it all broke – but I could fix that manually.

Second, it still put two linefeeds in my code – again, once I worked that out, manually fixable.

Lastly, however, the Closure compiler “fixed” my function declarations to have the right number of parameters – a trick I copied from a few js1k entries is to re-use the same function dec regardless of how many parameters the function might use, thus being able to use the one compressed token for all of them (the ƒ above).
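To make that trick concrete, here is a small sketch. The function names and bodies are made up for illustration; only the shared `(e,x,y)` parameter list comes from the text.

```javascript
// Illustrative functions (names are made up): each one is declared with the
// same (e,x,y) parameter list, whether or not it needs all three.
const area  = function(e,x,y){ return x * y; };  // uses x and y only
const twice = function(e,x,y){ return e * 2; };  // uses e only
const zero  = function(e,x,y){ return 0; };      // uses none of them

// Calling with fewer arguments is fine; the missing ones are undefined.
const results = [area(null, 3, 4), twice(5), zero()];

// In source form, the shared declaration is a single repeated substring
// that can be swapped for a one-character token (ƒ here).
const src = 'a=function(e,x,y){return x*y};b=function(e,x,y){return e*2};c=function(e,x,y){return 0}';
const packed = src.split('function(e,x,y)').join('ƒ');
```

A tool that trims each declaration down to the parameters it actually uses would turn that one repeated substring into several different ones, which is why the “fix” hurts here.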

Then I tried jsmin (again via a web service). While its compressed code was maybe 20 bytes (a whole 2%) larger than Google’s, the result out of @getify’s was smaller. My hand-squeezed 1022 bytes was now 954.

But I could see that there were still improvements to be had: @getify’s had separately tokenized “=function(e,x,y){” and “=function(e,x,y){t.” – where the former could be substituted into the latter to save 15-odd bytes! Also, the compressor, very cleverly, uses the ASCII characters below 32 as tokens so as not to have to use unicode chars (which are at least two bytes). The downside is that it can only have 31 tokens, so it picks the 31 best, which leaves room for improvement.

Now, there are plenty of unused roman letters and symbols (Q and Z don’t appear, for instance), so as I added my remaining features, I added a few more hand substitutions to try to keep me below 1024. There are still a couple of repetitions (the word “button” being the most obvious) and a couple of other tricks I used when I did the hand compression, so I know @getify’s could be improved, and I wonder whether, once improved, it could get itself back under 1k again!?
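The nested-token idea can be sketched roughly as follows. This is not @getify’s actual code: `pack`, `unpack`, the sample source and the choice of `\x01` and `\x02` as tokens are all assumptions for illustration.

```javascript
const DECL = '=function(e,x,y){';

// Each pair is [token, text]. Tokens are control characters, assumed not to
// occur in the source. The first text refers to the second token.
const nested = [
  ['\x01', '\x02t.'],
  ['\x02', DECL],
];

// Pack: substitute in reverse order, so DECL is replaced before "\x02t.".
function pack(source, pairs) {
  return [...pairs].reverse().reduce(
    (out, [token, text]) => out.split(text).join(token), source);
}

// Unpack: expand in forward order, so \x01 becomes "\x02t." and then
// \x02 becomes the full declaration.
function unpack(packed, pairs) {
  return pairs.reduce(
    (out, [token, text]) => out.split(token).join(text), packed);
}

const source = 'd=function(e,x,y){t.stroke()};u=function(e,x,y){x++}';
const packed = pack(source, nested);
const roundTrip = unpack(packed, nested);

// Dictionary cost with and without nesting, counted in characters.
const flatCost = (DECL + 't.').length + DECL.length;
const nestedCost = '\x02t.'.length + DECL.length;
```

Storing the longer entry in terms of the shorter one shrinks the dictionary without changing what the packed code expands to, which is the kind of saving described above.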

But there’s another point here, which is that, while it’s a very neat trick, there’s been just a little more development put into gzip, et al. I vote the next js1k comp limits you to 1k after gzipping. It would still reward tricks like repetitive function decs, but free us up from trying to reinvent the wheel, and from including that wheel in our 1024 bytes.
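For a rough feel of what a gzip-based limit would measure, here is a sketch that assumes Node.js and its built-in `zlib` module. The sample line is made up, and real entries would not be this repetitive.

```javascript
// Assumes Node.js; zlib is part of its standard library.
const zlib = require('zlib');

// A deliberately repetitive, made-up sample: the same declaration 20 times.
const line = "c.addEventListener('mousedown',function(e,x,y){},0);";
const source = line.repeat(20);

// Size in bytes before and after gzipping.
const rawBytes = Buffer.byteLength(source, 'utf8');
const gzBytes = zlib.gzipSync(Buffer.from(source, 'utf8')).length;

console.log(rawBytes + ' bytes raw, ' + gzBytes + ' bytes gzipped');
```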

I ended up fitting 2398 bytes into 1022 – admittedly the starting figure includes comments. It broke down like this (sizes in bytes):

Original code   JSMin’ed   @getify compressor   hand tweaks
2398            1442       1044                 1022

The entry is now up, and here it is in action (again, owing a little something to those who came before it)

skech app doing a van gogh

Update: Aivopaas releases his compressor, found a good minifier comparison page

@aivopaas has now put his JavaScript compressor up – it does a better job than @getify’s, as expected, and would reduce my entry to 988 Bytes.

I also found CompressorRater, which runs your code thru 4 different compressors and lets you choose the one that does the best. (However, my code, which has raw html output by document.write() calls, got mangled; at least I learned there was only a byte or two between the main ones.)

Footnote on why you want only one kind of quote:

The way the compression works is to declare one big string that gets substituted into, so you want to use only one kind of quote in your code so you can quote it with the other – e.g. this is one string

"alert('dog'+'cat'+'ferret')"

which you can then eval(). If the quotes were all the same, you’d get a parse error.
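A small sketch of both cases, using plain string concatenation instead of alert() so it runs outside a browser. The variable names and the `say` call are illustrative only.

```javascript
// Single quotes inside, double quotes outside: one valid string literal.
const packed = "'dog'+'cat'+'ferret'";
const value = eval(packed);

// If the inner code used double quotes too, the literal would end early.
// Here the broken source text is built with single quotes so it can be
// handed to eval(), which then fails to parse it.
let parseFailed = false;
try {
  eval('"say("dog")"');
} catch (err) {
  parseFailed = err instanceof SyntaxError;
}
```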

    • Aivo Paas
    • November 18th, 2010

    Hey, please fix my name, it’s Paas, not Pass.
    Thank you.

    I don’t get why it’s so common mistake.

    Btw, I have updated my crusher so that it is itself under 1k and may save even more bytes (if it wasn’t updated already when you tried it).

    And of course my final version of tetris had even more features in it – most noticeable, localstorage top list 🙂

    • Sorry about that, fixed.

      I think the fact that the brain tokenises things into “a P, an A, an S and one letter’s doubled” and that “pass” is an English word may have something to do with it …

      Awesome compressor, deadly tetris, BTW
