Duff Light

July 19, 2011

I recently installed MinGW and immediately regretted not having done so earlier. It was probably my confusing it with Cygwin. Or perhaps my bad experiences with GNU things ported to Windows. They tend to be bloated out the wazoo, and require a bizarre number of auxiliary DLLs or other stuff.

Not so this time. Another pleasant surprise is that the code it generates is not too bloated either. My (extremely minimalist) programs came out less than half the size from when I used Borland’s compiler.

The whole reason I installed this at all was that I couldn’t find a tool that did what I wanted. My Google-fu is weak so I no doubt missed a great many programs but eh. I considered making a PowerShell script or something but had no idea exactly how to go about it (not that I gave it a whole lot of thought) and it would have been slo-o-o-w as PowerShell scripts are wont to be.

The functionality I was looking for was to make a tree list of a directory. The listing would use indentations to convey level and would after each file or folder name print the size of that item. If it was a folder, it would show the total size of all files and folders contained in it. The program would of course therefore recurse through all subfolders until every file was found and listed.
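As a rough illustration of that listing, here is a minimal sketch of the recursion. It walks a hypothetical in-memory tree (`struct node` and `list_tree` are made-up names for this example) rather than the real file system, so it says nothing about how the actual program reads directories. Note that a folder's line can only be printed after its contents, because its total is not known until then.

```c
#include <stdio.h>

/* Hypothetical in-memory stand-in for a file or folder; a real tool
   would fill this in from the file system instead. */
struct node {
    const char *name;
    long long size;              /* file size; ignored for folders */
    const struct node *children; /* NULL for a plain file */
    int nchildren;
};

/* Print n and everything below it to out, indented two spaces per
   level, and return the total size in bytes. A folder's line comes
   after its contents because the total is only known at that point. */
long long list_tree(const struct node *n, int depth, FILE *out)
{
    long long total = n->size;
    int i;

    if (n->children != NULL) {
        total = 0;
        for (i = 0; i < n->nchildren; i++)
            total += list_tree(&n->children[i], depth + 1, out);
    }
    fprintf(out, "%*s%s %lld\n", depth * 2, "", n->name, total);
    return total;
}
```

Calling `list_tree(&root, 0, stdout)` on a small tree prints the files first and each folder's total underneath them.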

I found a program that almost did what I wanted. It didn’t list files, though, only folders.

So I wrote up a treewalker function (with a little copy-pasting from previous, similar endeavours) and everything turned out really neat. A nice bonus was that MinGW provides you with a whole bunch of different findfirst()s and related. Which means that to get at the 64-bit file length I wouldn’t need to call any special WinAPI stuff, at least not explicitly (and that’s what counts, really).

Man, 64-bit integers are suh-weet. Yeah, Borland’s compiler handled that too, I know, but when I started programming I used stuff like Turbo C under DOS. Pain. Ful. And since then I’ve mostly dabbled in assembler so it’s still very much an untapped luxury for me.

Anyways, so the program was quickly written and then tweaked for inaccessible directories (I had to change the error code from -1 to 0 so as not to upset the total, having first not considered things like user permissions; thanks a lot, operating system from this millennium). It worked like a charm but there was the problem of reading the file sizes. Stuff in the kilo- or megabyte range is easy enough but when you get to giga it becomes hard to take in at a glance.

So I decided some digit grouping was in order. I read around and saw that some compilers had incorporated that function into printf(). Alas, MinGW didn’t seem to be one of them. Seemed I had to do it myself. Sadly I’m not very good at seeing an elegant solution at first, and I tend to stumble around until I find a way to brute-force and only then can it approach being non-horrible.

For some reason I started to think about Duff’s Device. If you don’t know C, you’ll shrug nonchalantly and go “So?” but if you do know C and have never heard of the Device before, you’ll probably find it profoundly weird. Seeing it for the first time, most people won’t believe it would run, and if it did run, that it wouldn’t be portable, that it would be some compiler-specific quirk. However, it isn’t; it’s a perfectly legal and proper piece of code that any compliant compiler must accept without grumble.

After some trial-and-error I came up with this isotope of Duff:

char temp[64], temp2[64];
char *p = temp, *q = temp2;
int i;

_i64toa(size, temp, 10);

/* Jump into the middle of the loop so the first group may be short. */
switch(strlen(temp) % 3)
{
    do {
        *q++ = ' ';
        case 0: *q++ = *p++;
        case 2: *q++ = *p++;
        case 1: *q++ = *p++;
    } while (*p);
}

*q = '\0';

Me being me, I had to do some testing to find out where the space insertion was supposed to go.
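For anyone who wants to repeat that testing, here is a sketch of the same idea wrapped in a function. The name `group_digits` and the empty-string guard are additions for this example and are not from the original program. The function takes a string of digits instead of calling `_i64toa`, so it does not depend on a Windows runtime.

```c
#include <string.h>

/* Copy the decimal digits in src to dst, putting a space between
   groups of three counted from the right. src is assumed to hold
   only digits (no sign). dst needs room for
   strlen(src) + strlen(src) / 3 + 1 bytes. */
void group_digits(const char *src, char *dst)
{
    const char *p = src;
    char *q = dst;

    if (*src == '\0') {          /* guard: nothing to copy */
        *dst = '\0';
        return;
    }

    /* The switch jumps into the loop body so that the first pass
       copies only the leading 1, 2 or 3 digits. Every later pass
       starts at the top, writes the space, then copies three. */
    switch (strlen(src) % 3) {
        do {
                *q++ = ' ';
        case 0: *q++ = *p++;
        case 2: *q++ = *p++;
        case 1: *q++ = *p++;
        } while (*p);
    }
    *q = '\0';
}
```

With the space at the top of the loop body, the first pass skips it and every later pass writes it before its three digits, so "1234567" becomes "1 234 567" and "999" is left alone.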

What’s interesting is that I actually had some old code for this very problem in some other program but I decided to find something more elegant. And I do think it’s elegant, no matter how abominable Duff’s Device may seem.

The only alternative I could come up with involved some sort of two-part deal where you first dealt with the part at the beginning which would or would not be an incomplete 3-digit group and then dealt with the rest in a second phase. And the old code involved all kinds of silliness like string reversals and such.
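For comparison, the two-part approach might look something like the sketch below. This is one reading of the description above, not the code that was actually considered, and `group_digits_two_phase` is a made-up name.

```c
#include <string.h>

/* Two-phase digit grouping: first the possibly incomplete leading
   group, then the remaining complete groups of three. src is assumed
   to hold only digits. dst needs room for
   strlen(src) + strlen(src) / 3 + 1 bytes. */
void group_digits_two_phase(const char *src, char *dst)
{
    size_t len = strlen(src);
    size_t head = len % 3;   /* length of the leading short group */
    size_t i;
    char *q = dst;

    /* Phase one: copy the leading 0, 1 or 2 digits. */
    for (i = 0; i < head; i++)
        *q++ = src[i];

    /* Phase two: what is left is a multiple of three digits long.
       Each group gets a space in front unless it comes first. */
    while (i < len) {
        if (q != dst)
            *q++ = ' ';
        *q++ = src[i++];
        *q++ = src[i++];
        *q++ = src[i++];
    }
    *q = '\0';
}
```

It gives the same result as the switch version but needs an extra test inside the loop to decide whether a space is due, which is the part the Device gets for free.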

I was quite pleased with my solution, trivial though it may be. I was even more pleased when I improved an old program of mine for fixing an artifact of directory recursion, namely that the listing comes out “upside down.” I had toyed with the idea of my tree size program building a list in memory before outputting it, but instead I decided to make the old program more robust.

That program reads lines of text from either a file or stdin and outputs those same lines to stdout in reverse order. It was actually a remnant from when I did the programming exercises in the book The C Programming Language and as such had an arbitrary limit of 5000 on how many lines it could handle. This proved far too small for the 3+ megabyte text files I’d generated.

I decided to make it more dynamic in terms of what it could handle so I made it read lines into a linked list, adding new lines to the head instead of the tail so that the printing function could unroll the list like normal. This too felt very elegant.
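A minimal sketch of that head-insertion trick follows. The names `struct line`, `read_reversed` and `print_and_free` and the fixed 4096-byte read buffer are assumptions made for this example, not details of the original program.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct line {
    char *text;
    struct line *next;
};

/* Read every line from in, pushing each one onto the head of the
   list. Returns the head (the last line read) or NULL if the input
   is empty. Lines longer than the buffer get split in this sketch,
   and reading stops early if memory runs out. */
struct line *read_reversed(FILE *in)
{
    char buf[4096];
    struct line *head = NULL;

    while (fgets(buf, sizeof buf, in) != NULL) {
        struct line *n = malloc(sizeof *n);
        if (n == NULL)
            break;
        n->text = malloc(strlen(buf) + 1);
        if (n->text == NULL) {
            free(n);
            break;
        }
        strcpy(n->text, buf);
        n->next = head;      /* new line goes in front of the old head */
        head = n;
    }
    return head;
}

/* Walk the list front to back, printing and freeing each node. */
void print_and_free(struct line *head, FILE *out)
{
    while (head != NULL) {
        struct line *next = head->next;
        fputs(head->text, out);
        free(head->text);
        free(head);
        head = next;
    }
}
```

Because each new line is linked in ahead of the previous ones, an ordinary front-to-back walk prints the input backwards, and there is no line limit other than available memory.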

All in all, it was a good day for programming.

