Yep, everybody does it: repetition. In the grand old days we all rebuilt the wheel, which led to the establishment of patterns. For me, patterns often help me achieve my goals faster than using Google. Using Google to find simple tools normally means understanding the search engine, browsing the results and evaluating each alternative in turn. Invariably this means slightly altering someone else’s code, compromising objectives or installing a bulky tool that is bloated by a thousand functions you don’t need or want. Naturally this list is not exhaustive; just consider the dependencies that an external solution often requires. If it’s simple and quick, it’s often better to reinvent your wheel.
A perennial favourite of mine is base translation; specifically hexadecimal to binary and binary to hexadecimal. This is so simple and so useful it’s the subject of thousands of Google results. Yet the useful bits are rarely on the first page of results. So I coded it in about 30 minutes, and debugged it in about 30 minutes – because I was tired at the time. Interestingly this is not the first time that I’ve written this code. My earliest recollection of having written it was about 1986, over 22 years ago. Of course it was coded in BASIC at the time on an Atari; however, it wasn’t too long before I started translating it into C.
Patterns have long been a formal word in Computer Science. In 1996 a very famous book, the name of which escapes me now and I’m too lazy to get it off my shelf, was dedicated to the development of patterns in software development. Interestingly the way that the patterns were described was almost a deterrent to the patterns themselves. Being of an academic nature, the usefulness of the book was somewhat tarnished by its lack of pragmatic style and appeal to the great unwashed. It’s a fantastic book though, which every software engineer should read.
For me patterns are best expressed in real world examples and in the languages that I understand. It makes them instantly available to me and much more likely to improve my work. Yes, they do need some formal definition most of the time. However, if it’s so simple as to be obvious then don’t clutter the example with an explanation other than to say what you used it for. If it’s really useful enough to be made into a pattern it’ll resurface again in the form of ‘now what did I do with that code I wrote 22 years ago that solved this problem?’
Well, here it is. I used it to translate a hexadecimal network packet lifted from an application log back into binary so that it could be replayed over a network for testing the application time and again. Note the use of redirected standard input and output; it’s clean – no Swiss army knife in this one, what would I need a spoon on it for anyway?
hex2bin.c
#include <stdio.h>
#include <stdlib.h>
int main( int argc, char ** argv ) {
    int h; /* int rather than char, so that EOF cannot be mistaken for data */
    FILE * fi = stdin;
    FILE * fo = stdout;
    if ( argc > 1 ) {
        printf( "Usage: redirect input and output to stdin and stdout respectively.\n" );
        return 0;
    }
    h = getc( fi );
    while ( h != EOF ) {
        int b = 0;
        /* The unsigned casts reject characters below each range as well as above it.
           Anything that is not a hex digit is read as a zero nibble. */
        if ( (unsigned)( h - '0' ) < 10u ) b = h - '0';
        else if ( (unsigned)( h - 'a' ) < 6u ) b = ( h - 'a' ) + 10;
        else if ( (unsigned)( h - 'A' ) < 6u ) b = ( h - 'A' ) + 10;
        b = b << 4; /* first digit is the high nibble */
        h = getc( fi );
        if ( h == EOF ) return 0; /* odd number of digits: drop the half byte */
        if ( (unsigned)( h - '0' ) < 10u ) b += h - '0';
        else if ( (unsigned)( h - 'a' ) < 6u ) b += ( h - 'a' ) + 10;
        else if ( (unsigned)( h - 'A' ) < 6u ) b += ( h - 'A' ) + 10;
        if ( putc( b, fo ) == EOF ) return 0;
        h = getc( fi );
    }
    return 0;
}
Of course in this case it was just too tempting to write its sister as well: binary to hex. If you combine these two tools on the command line with netcat and some other favourite scripts you can make quite a useful test tool.
bin2hex.c
#include <stdio.h>
#include <stdlib.h>
int main( int argc, char ** argv ) {
    int b; /* int rather than char, so that a 0xff byte is not mistaken for EOF */
    FILE * fi = stdin;
    FILE * fo = stdout;
    if ( argc > 1 ) {
        printf( "Usage: redirect input and output to stdin and stdout respectively.\n" );
        return 0;
    }
    b = getc( fi );
    while ( b != EOF ) {
        int h = 0;
        h = b >> 4; /* high nibble; b is 0..255 here, so h is 0..15 */
        if ( h < 10 ) h = h + '0';
        else if ( h < 16 ) h = h - 10 + 'a'; /* 10..15 map to 'a'..'f' */
        if ( putc( h, fo ) == EOF ) return 0;
        h = b & 0x0f; /* low nibble */
        if ( h < 10 ) h = h + '0';
        else if ( h < 16 ) h = h - 10 + 'a';
        if ( putc( h, fo ) == EOF ) return 0;
        b = getc( fi );
    }
    return 0;
}
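Since the two tools are inverses of each other, a round-trip check is a cheap way to gain confidence in them. The sketch below is not the listings above: it re-implements the two nibble conversions as hypothetical helper functions (`hex_digit_value` and `nibble_to_hex` are names made up for illustration) and, like the listings, assumes a character set in which the letters a to f are contiguous, as they are in ASCII.

```c
#include <stdio.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit_value(int c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: lower-case hex digit for a nibble in the range 0..15. */
static char nibble_to_hex(unsigned nibble) {
    return (char)(nibble < 10 ? '0' + nibble : 'a' + (nibble - 10));
}

/* Encode every byte value as two hex digits, decode it again,
   and count how many values fail to survive the round trip. */
static int round_trip_failures(void) {
    int failures = 0;
    for (unsigned byte = 0; byte < 256; byte++) {
        char hi = nibble_to_hex(byte >> 4);
        char lo = nibble_to_hex(byte & 0x0f);
        unsigned decoded =
            (unsigned)((hex_digit_value(hi) << 4) | hex_digit_value(lo));
        if (decoded != byte) {
            fprintf(stderr, "round trip failed for byte %u\n", byte);
            failures++;
        }
    }
    return failures;
}
```

Calling `round_trip_failures()` from a small `main` and checking for zero covers all 256 byte values, including the ones above 0x7f that are easy to get wrong with a signed `char`.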
Lastly there is a point that is quite good to note about such simple patterns: they make a really good basis for recruitment tests. You can always tell how good an organisation is in recruitment by the quality of this process. If they complain about spelling it generally means that the organisation is focussed on minutiae and micro-management is probably the order of the day. Normally that’s a warning sign. So I always include spelling mistakes in my submissions. However, if you are asked about the various ways of implementing the illustrated code (once of course you’ve rewritten it for them from requirements!) and integration strategies, impact on speed, memory usage along with testing and comparison of the requirements provided, you’re probably on to a good thing.
So just for fun let’s look at one aspect of this. If you go and do that dreaded Google search and sift through the cruft out there you will find a very specific approach often used that is quite different to the one I’ve used above to determine binary or hex value in translation; it normally involves using a switch statement like this:
char h;
int b;
switch ( h ) {
    case '0' : b = 0;
        break;
    case '1' : b = 1;
        break;
    case '2' : b = 2;
        break;
    case '3' : b = 3;
        break;
    case '4' : b = 4;
        break;
    case '5' : b = 5;
        break;
    case '6' : b = 6;
        break;
    case '7' : b = 7;
        break;
    case '8' : b = 8;
        break;
    case '9' : b = 9;
        break;
    case 'A' :
    case 'a' : b = 10;
        break;
    case 'B' :
    case 'b' : b = 11;
        break;
    case 'C' :
    case 'c' : b = 12;
        break;
    case 'D' :
    case 'd' : b = 13;
        break;
    case 'E' :
    case 'e' : b = 14;
        break;
    case 'F' :
    case 'f' : b = 15;
        break;
    default : return -1;
}
All this seems reasonable. Yes, it’s going to be bigger than my code, but it’s going to be quicker, right? Actually, that’s not a given. You see, the code is larger, so more of it has to be fetched from memory. My conversion logic is only a handful of instructions with its working values held in registers, so it takes up far less of the processor’s instruction cache than the switch example just given. Next, if the compiler lowers the switch to a chain of comparisons rather than a jump table, it can end up making more comparisons on average than my version does; a jump table avoids those comparisons but costs a table lookup and an indirect branch instead. All of this is compiler and processor dependent. Coding simply in this case makes the optimiser’s job easier, but an optimiser normally doesn’t have the nous to translate the code to another approach; it just applies obvious short cuts to the code already presented. Naturally this could lead into discussions about optimisers and their abilities and limitations.
In fact there are a large number of optimisations that could be made to my code that would improve its performance too. But I’ll stop here because I could write a book on this one piece of code. So you see, a particularly innocuous, been-done-before, simple piece of code can be interesting and be used to draw out the depth of a person’s knowledge. Plus there’s a certain amount of satisfaction in being able to go back and polish old code; it’s like visiting an old friend.
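A third approach that tends to come up in this sort of discussion is a lookup table, which swaps the comparisons for 256 bytes of data. The sketch below is only an illustration: `hex_table` and `hex_value_from_table` are names invented here, it relies on C99 designated initialisers, and whether it beats either of the other two versions is again down to the compiler and processor.

```c
#include <limits.h>

/* Hypothetical table-driven decoder: one entry per possible unsigned char value.
   Each entry stores the digit's value plus one, so that the zero-initialised
   default (0) can stand for "not a hex digit". */
static const unsigned char hex_table[UCHAR_MAX + 1] = {
    ['0'] = 1,  ['1'] = 2,  ['2'] = 3,  ['3'] = 4,  ['4'] = 5,
    ['5'] = 6,  ['6'] = 7,  ['7'] = 8,  ['8'] = 9,  ['9'] = 10,
    ['a'] = 11, ['b'] = 12, ['c'] = 13, ['d'] = 14, ['e'] = 15, ['f'] = 16,
    ['A'] = 11, ['B'] = 12, ['C'] = 13, ['D'] = 14, ['E'] = 15, ['F'] = 16,
};

/* Returns 0..15 for a hex digit, or -1 for anything else, including EOF. */
static int hex_value_from_table(int c) {
    if (c < 0 || c > UCHAR_MAX) return -1;
    return (int)hex_table[c] - 1;
}
```

The range check at the top is there so that the `int` returned by `getc` can be passed straight in without indexing outside the table.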
November 11th, 2008 on 8:30 pm
Oh yeah, for those of you who think that this is the best way of doing things, here’s a hint:
printf("0x%02x",c);
Not every solution fits the same problem.
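For what it’s worth, the hint can be sketched as below. This is an assumption-laden illustration rather than a drop-in replacement for bin2hex.c: `byte_to_hex` is a made-up helper name, it uses `snprintf` so the result lands in a buffer, and it leaves off the `0x` prefix so the output matches the two bare digits the tool writes.

```c
#include <stdio.h>

/* Hypothetical helper: write the two lower-case hex digits of one byte into out,
   which must have room for the digits plus the terminating NUL. */
static void byte_to_hex(unsigned char byte, char out[3]) {
    /* Taking an unsigned char avoids sign extension: a negative plain char
       handed straight to %02x typically prints as ffffffXX. */
    snprintf(out, 3, "%02x", (unsigned)byte);
}
```

The trade-off is the one the post is about: one line of code, but every byte now goes through the format-string parser.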