Blinded by the Wobbly Wobbly Web

It’s really hard to make a great, aesthetically pleasing web site, which is probably why this one is so dull, mostly because of your monitor. You might think it’s rude of me to blame your monitor; perhaps it is. Or is it? Actually it looks really good on my 1080p widescreen monitor with its thousands of shades of black. It even looks really good on my wife’s cool little MacBook. But it really sucks on the screens at work.

What I’m getting at here is not really how the monitor makes it look but the challenge that we as designers face in conveying our creations to the outside world through this uncooperative medium of the web. Today we constantly fight with:

  • Browser reproduction – is the box model in IE really broken, if so which one in which version of IE? Is Safari the benchmark or Opera? What about Firefox and Chrome, surely they could get it right?
  • Display characteristics – how do you know what black really looks like on my screen, or someone else’s? Will your carefully shaded images show the lovely gradients of a baby’s skin or will they be uglified by the coarse gradations available? Are the dimensions just right? How about the resolution?
  • Plugin availability – all my browsers have flash block on them to stop Internet bandwidth thieves stealing what I’ve paid for with their extravagant advertising. Are all the fonts that you need there? JavaScript anyone? Will that DHTML find everything that it needs?

Of course there are lots more things that can seemingly go astray that aren’t mentioned here, so how do we solve it? Some hail standards as the way to go. Perhaps they are right; I certainly like standards. But the problem with standards is that they need to cater for every situation, and they need to be created, grow and die gracefully. Very hard to do well. Others take the minimalist approach, claiming that it’s easier to be consistent across diverse transmission and presentation media by making it all too simple to stuff up. But as a friend has a habit of saying, this is proved wrong by the all too true adage ‘when one makes a thing idiot proof, one will only find a deeper class of idiot’. It is just as easy to disprove the idea that a central control system or monopoly can render things perfect through ultimate control, with one word: Microsoft.

Ultimately there’s one thing that we can’t protect ourselves from, idiots who think that this problem can be universally solved.  The problem is not that the situation need be solved but that we need to recognise that it doesn’t need to be solved. You really only want to talk to those of a similar mind or a specific group so just design for them.  People who want to talk to the world fail because the world quite frankly isn’t interested and isn’t listening.  So don’t solve that problem.

Solve the problem of communicating with your target market. Our only concern should be with our target audience and communication with them. A smaller number of people means a smaller number of problems, and perhaps we can manage that. So take this web site: its target market is me. I blog mostly to see my thoughts on the screen because it helps me think about them. Sometimes it helps me understand the rubbish that I think. Other times it helps me develop some really neat ideas and concepts. If other people read it that’s fine; I might even refer people to it to understand something that I don’t have time to explain but that I’ve written about. But at the end of the day it’s just a place for my thoughts and ramblings. Do the same with your market: are they old people with simple set-ups they bought from Walmart? Are they techno-junkies who are likely to have the latest in colour reproducing screens? Will they actually consume your content how you want them to?

At this point I need to wrap up with a tie to the opening about aesthetics and the variances that make it hard, with one last and most significant point: the people factor. Every now and then someone says to me, ‘it shouldn’t be black with white ’cause it’s hard to read’ or ‘if you don’t make it colourful it won’t be a site that people like to spend time on’ or my favourite, ‘I’m an artist / designer / architect and I know what looks good and that doesn’t’. But even the so-called professionals are really just espousing their own opinions and preferences unless they can come up with concrete reasons that demonstrate that your target market will be affected by what they have observed. So take the hints from above and the comments of others into consideration when designing your site, but most importantly survey your target market to make sure your web site works for your audience.

All good points, however there is a point that needs to be understood; I’m sad if you don’t like my web site but that’s OK because it works for my audience, me.  Hopefully everybody else has just as much success designing for their audience too.

Best of luck to all of you, M.C.


Do you have a Spring in your Wheel?

Spring is one of the frameworks of the moment, but is it really a great framework or something entirely different? Most of the frameworks that I know of are fairly targeted and perform a particular function. All of the good ones that I know of have a particular place in life; a framework for doing something. Spring does not. But there’s gold in that there code…

Probably the most disappointing thing about Spring is its documentation. The entry barrier for integrating Spring with the rest of your tool set is made quite high by its limited documentation. Yes, there is a great volume of the stuff, but it appears to be limited to those who are already in ‘the know’. In other words, if you really want to get a handle on Spring from scratch you have two choices: either buy a book or surf the web and hope.

But this is because we’ve been misled. Struts is a framework, Spring is not. So a conclusive introduction to Struts is easy to write and easy to understand because it frames an idea. However, as Spring does not frame a central idea, writing an introduction to Spring is almost a lost cause from a framework point of view.

Spring is an SDK. A jolly brilliant SDK. When you look at the Spring development kit you find that it will work with what you have, doesn’t force a great monster of a framework on you and is riddled with simply brilliant concepts and patterns. As you start to delve into it you realise that the guys who put it together have brought together a great collection of conceptual gems. If you keep watching the SDK grow you will see more and more of them being added to the SDK as components for use in groups or by themselves.

Underlying all of these gems is a strong theme of patterns and approaches to design that pervades all corners of the SDK, making it feel and look like a slick whole. These concepts range from the pervasive use of IoC across the board to the modular adoption of specific patterns like MVC, which IS a framework, included in such a way as to let the designer include or exclude them in accordance with usefulness. In most ‘frameworks’ that I’ve seen it’s either all features or a difficult configuration to remove unwanted bloat.

So is Spring a case against bloat ware? No. Bloat ware is a term given to things that produce copious quantities of bulk which could easily be replaced by more stream lined solutions. When you see a product that makes you think ‘surely that could be done with less’ you’re probably looking at bloat ware. Spring is not there for that purpose. You see almost everything it does belongs to the school of computing that is termed ‘middle ware’. Middle ware is almost by definition bloat.

However tempting it is to label Spring as bloat ware, it does perform a very critical task often overlooked by those bloat ware vigilantes out there: it makes your software manageable. Everything that Spring does could be pared back to make it faster, more efficient etc., but you would barely dent it, and at the cost of making your software less easy to design, implement, install and operate. Carefully add Spring to your project and you are adding real value in terms of future proofing your product and saving yourself a ton of grief.

So if this is an SDK, how does it compare to other SDKs? To answer this question you need to think about perspective. If you look at the JDK, the standard C/C++ libraries, X Windows Xt or even Microsoft’s extensive .NET SDK you will find products that are built bottom up. That is, products that have been built from a wide array of small useful components into a general smorgasbord of fancy components and knick-knacks. Spring is built the other way around; it has been built from concepts to components. So when you look at Spring you see a large number of cohesive modules rather than the gravel and rocks of most other SDKs. Effectively Spring takes the concepts and makes them real, relying on other foundational tool kits to get the simple things in life done. Top down.

So Spring is not a framework, it is not a way of protecting you from bloat ware, it is an implementation of concepts, patterns and design built on the basics for you to cherry pick as you please and improve your product and its operation. It’s an SDK like so many others; hardly a reinvention of an old wheel, but a new sporty set of spokes to use in building your own.


Simple PHP MVC

One of the much talked about patterns of the moment is the model-view-controller or model-view-presenter or any of the undoubtedly thousands of variants thereof. Here’s a deliberately really simple one to illustrate the pattern. If you can read it and understand it I hope it’s useful to you.

Interestingly a vague attempt at inversion of control, another design pattern which IMHO is more of a guideline, is in play here. It is done specifically to allow good component separation and useful test patterns. Hence as long as you have a class that fits the dependency obligations you can use it with the ones given here.

Now for the seriously crazy bit: the code on this page has not been tested to work. It’s just here to give you some idea of what the model-view-controller pattern is and how it could be implemented.

mvc.controller.php

Overview:

At the heart of each variant implementation of the model view controller pattern is the controller. A controller basically directs requests to the appropriate model or view processing functionality. In the case below it only makes the distinction between the two types and not the instances of functionality.

Some controllers combine a lot more but it’s not actually necessary and generally ends up making the controller very configuration and platform specific. By keeping configuration out of the controller we make it as general as it can be.

Notes:

Note that there is no exception handling here either. It is not wanted. Consider for a moment that if an exception was not caught, there would be no response to the client, which is a security requirement in many instances. However there would be a log to trace to the point of failure because of the log entry call.

If exception handling were introduced it would compromise the generic nature of the controller, as the controller would need to know what type of response to send to the client. For example, if the error was thrown before the content of the request could be analysed, what data would you respond to the request with? HTML? XML? A media stream? A GIF? A JPG? The rule of thumb here is only respond if you know what language to talk in; if you don’t know the requested format you can’t even use clever configuration to determine how to respond to a request.

Documented dependencies:

  • class ‘Model’ : method ‘process_request’ : parameters ‘request to process’ : returns ‘new request for either model processing or view generation’
  • class ‘View’ : method ‘process_request’ : parameters ‘request to process’ : returns ‘a view generated from the accumulated data in the request’

Undocumented dependencies:

  • class ‘Log’ : method ‘writestr’ : parameters ‘string to log’, ‘name of log level to record string at’ : returns ‘nothing’
  • class ‘Log’ : method ‘writereq’ : parameters ‘request to log’, ‘name of log level to record string at’ : returns ‘nothing’
  • class ‘Request’ : method ‘is_a_model_request’ : parameters ‘none’ : returns ‘true if the request is to be processed by the model’

Listing:


<?php

Log::getInstance()->writestr( "file:".__FILE__.":loaded", "debug" );

class Controller {

  private $model;
  private $view;

 /*
  * build this controller object
  */
  public function __construct ( $model, $view ) {
    $this->model = $model;
    $this->view = $view;
  }

 /*
  * process a request
  */
  public function process_request ( $request ) {
    Log::getInstance()->writestr(
      "class:".__CLASS__.":process_request", "debug" );
    Log::getInstance()->writereq( $request, "debug" );

   /*
    * first process the model request
    */
    $new_request = $this->model->process_request( $request );

   /*
    * if this request results in another model request
    */
    if ( $new_request->is_a_model_request( ) )
      return $this->process_request( $new_request );

   /*
    * otherwise generate view
    */
    return $this->view->process_request( $new_request );
  }

}

?>

mvc.model.php

Overview:

A model is responsible for the production of data for the view. Essentially it contains access points to the various methods of the underlying application, called actions. In this case it simply holds a list of actions, locates the one requested and executes it, sending the results back to the caller.

Notes:

As with the controller (see above) the model does little more than the basics leaving other classes that know what they are doing to handle the specifics of content. In line with this the model simply logs progress and fails silently.

Undocumented dependencies:

  • class ‘Log’ : method ‘writestr’ : parameters ‘string to log’, ‘name of log level to record string at’ : returns ‘nothing’
  • class ‘Log’ : method ‘writereq’ : parameters ‘request to log’, ‘name of log level to record string at’ : returns ‘nothing’
  • class ‘Actions’ : method ‘find_requested_action’ : parameters ‘request to locate action for’ : returns ‘an Action object’
  • class ‘Action’ : method ‘process_request’ : parameters ‘request to process’ : returns ‘new request for either model processing or view generation’

Listing:


<?php

Log::getInstance()->writestr( "file:".__FILE__.":loaded", "debug" );

class Model {

  private $actions;

 /*
  * build this model object
  */
  public function __construct ( $actions ) {
    $this->actions = $actions;
  }

 /*
  * process a request
  */
  public function process_request ( $request ) {
    Log::getInstance()->writestr(
      "class:".__CLASS__.":process_request", "debug" );
    Log::getInstance()->writereq( $request, "debug" );

   /*
    * find the action from the list of actions
    */
    $action = $this->actions->find_requested_action( $request );

   /*
    * run the action and return the resulting request
    */
    return $action->process_request( $request );
  }

}

?>

mvc.view.php

Overview:

A view is called to format the data retrieved from the model into the format that the original client requested. Essentially it contains access points to the various view methods of the underlying application. In this case it simply has a list of views that it locates and then executes sending the results back to the caller.

Notes:

Now those sharp of eye will realise something; this looks pretty much like the model class above, and you’d be right. Apart from a simple renaming of a few items the code is the same. In practice both the above model class and this view class would be implemented as a single class and simply configured differently. They are only provided in their separate formats here to highlight the model-view-controller form.

Undocumented dependencies:

  • class ‘Log’ : method ‘writestr’ : parameters ‘string to log’, ‘name of log level to record string at’ : returns ‘nothing’
  • class ‘Log’ : method ‘writereq’ : parameters ‘request to log’, ‘name of log level to record string at’ : returns ‘nothing’
  • class ‘Views’ : method ‘find_requested_view’ : parameters ‘request to locate view for’ : returns ‘a View object’
  • class ‘View’ : method ‘process_request’ : parameters ‘request to process’ : returns ‘the final generated view’

Listing:


<?php

Log::getInstance()->writestr( "file:".__FILE__.":loaded", "debug" );

class View {

  private $views;

 /*
  * build this view object
  */
  public function __construct ( $views ) {
    $this->views = $views;
  }

 /*
  * process a request
  */
  public function process_request ( $request ) {
    Log::getInstance()->writestr(
      "class:".__CLASS__.":process_request", "debug" );
    Log::getInstance()->writereq( $request, "debug" );

   /*
    * find the view from the list of views
    */
    $view = $this->views->find_requested_view( $request );

   /*
    * process the request and return the resulting view
    */
    return $view->process_request( $request );
  }

}

?>

LDAP vs RDBMS is the war over?

Perhaps the question is better put, ‘Did the war even happen?’ What war, you might well ask; well, when there are different ways of looking at potentially the same data there are bound to be zealots on either side fanning the flames of FUD on the opposition and extolling the virtues of their chosen creed. However, although both RDBMS and LDAP are commonly used to host the same data for applications, there appears to be little controversy.

Mostly, I think, there is no conflict because in general LDAP is not even an option for the vast masses of application developers out there. Especially as the bulk of application development appears to be done facing the Internet. Normally hosting environments provide a choice of RDBMS or RDBMS. LDAP is just too difficult a concept, and that’s probably the first reason that it has fallen out of contention for storing the masses of organisational data out there. Yet it is probably the best candidate that I know of for holding operational control of that data.

To illustrate this conceptual difficulty let’s consider the contention between software developer and database architect perspectives. Typically the developer does not really understand the various tools that the RDBMS offers to manage data. The database expert loathes the way that developers start implementing relational constraints in code instead of applying the appropriate design concepts of foreign keys, triggers, stored procedures etc. I’ve seen this many times. Often I’ve seen the opposite, where database folk have decided to embed code into their domain. To really get this right each party needs to understand the value that the other brings to the table. But I continually read laments from either side criticising the way that this or that should really be done over here and not there.

Which brings us to LDAP. This underused gem is typically used by the development community simply as a convenient pre-built authentication tool. Even when LDAP is used, a lot of information is then stored in an RDBMS that it really does not do well; or at least not as well as LDAP. I’ll spare you the history of why this is so and how it came to be, but the fundamental difference between the two is that an RDBMS generally is a collection of flat file tables that are related by loose rules, whereas the LDAP server is a tightly coupled hierarchy of objects (called the Directory Information Tree – DIT) similar in nature to that of other concepts like XML.

LDAP servers are generally built for high speed retrieval and mass replication focussed on the enterprise, where RDBMS is a general ledger store house, and pretty primitive too. The kind of data then that you should store in the LDAP directory is structural data. Like say configuration data for a telephone system or contact information for a customer relationship management (CRM) type of application. Even the configuration and preferences information of WordPress for this blog should really be stored in an LDAP server. But it’s not.

If you want to know if the data that you are looking at is suited to LDAP then consider:

  • Is the data dynamic or relatively static?
  • Does the data need to be distributed?
  • Can the data be used by more than one application?
  • Is the data multi-valued?
  • Can your data or application take advantage of a hierarchical relationship?
  • Do you need flexible security options?
  • Do you need single sign-on?
  • Do you need distributed or delegated administration capabilities?

If you can answer yes to some or all of these questions, then directories and directory-based applications would likely be useful to your application or project.

So why did I write this post? Basically it is a pointer into the world of LDAP for those who’ve not thought of it. A start. If you really want a better introduction go here. Have fun.


Driven to Reinvent

It’s nice to have a standards based platform and in that vein C certainly has progressed a long way. But you still get the odd system that isn’t set up correctly by default with my favourite extensions; namely all things GNU! Yes, I admit to liking non-standard well coded extensions; in fact who would work without them these days except under a very specific set of requirements or academic interest? Things just would not get done so quickly without them.

So I got a Mac. Yep, it’s sooo cool that when my 5 year old daughter walked in the room and saw it for the first time she exclaimed ‘Daddy! That’s sooo cool!’ Kind of an ego boost for someone who gave up on cool and cool things 20 years ago. However cool comes at a price: it’s got great development tools but I want cross platform development to run on my Linux boxen too. Long story short, it doesn’t have the standard GNU extensions so I had two choices, fart around with the environment or rewrite one or two functions myself.

So here’s the goods. Now because I don’t want people to just copy and paste my material without thinking about it (I don’t mind the copying, I just like people to put in some effort and think) I’ve copied an earlier version of the functions which works for some situations but not for others. It’s close to the real deal but there are about 5 significant issues which would make it dangerous to use this code without doing something about them.

Here it is, have fun, have a play! (Note: the whole thing is under GPL 3, comments from GNU)


/*
ssize_t getdelim (char **lineptr, size_t *n, int delimiter, FILE *stream)

This function is like getline except that the character which tells it to stop
reading is not necessarily newline. The argument delimiter specifies the
delimiter character; getdelim keeps reading until it sees that character (or end
of file).

The text is stored in lineptr, including the delimiter character and a
terminating null. Like getline, getdelim makes lineptr bigger if it isn't big
enough.

getline is in fact implemented in terms of getdelim
*/
ssize_t getdelim( char **lineptr, size_t *n, int delimiter, FILE *stream )
{
	/* setup the environment */
	int c = getc( stream );
	long i = 0;
	char *a = NULL;

	/* count the characters needed for the buffer */
	while ( c != delimiter && c != EOF ) {
		c = getc( stream );
		i++;
	}

	/* test for and arrange the buffer size */
	if ( i == 0 )
		return -1;
	if ( fseek( stream, -i, SEEK_CUR ) )
		return -2;
	if ( *n < i + 1 ) {
		a = ( char * )realloc( *lineptr, i + 1 );
		if ( a == NULL )
			return -3;
		*lineptr = a;
		*n = i;
	}

	/* read the data into the buffer */
	if ( fread( *lineptr, sizeof( char ), i, stream ) != i )
		return -4;
	( *lineptr )[i] = 0;

	/* return the number of chars read */
	return i;
}

/*
ssize_t getline (char **lineptr, size_t *n, FILE *stream)

This function reads an entire line from stream, storing the text (including the
newline and a terminating null character) in a buffer and storing the buffer
address in *lineptr.

Before calling getline, you should place in *lineptr the address of a buffer *n
bytes long, allocated with malloc. If this buffer is long enough to hold the
line, getline stores the line in this buffer. Otherwise, getline makes the
buffer bigger using realloc, storing the new buffer address back in *lineptr and
the increased size back in *n. See Unconstrained Allocation.

If you set *lineptr to a null pointer, and *n to zero, before the call, then
getline allocates the initial buffer for you by calling malloc.

In either case, when getline returns, *lineptr is a char * which points to the
text of the line.

When getline is successful, it returns the number of characters read (including
the newline, but not including the terminating null). This value enables you to
distinguish null characters that are part of the line from the null character
inserted as a terminator.

This function is a GNU extension, but it is the recommended way to read lines
from a stream. The alternative standard functions are unreliable.

If an error occurs or end of file is reached without any bytes read, getline
returns -1.
*/
ssize_t getline (char **lineptr, size_t *n, FILE *stream)
{
	return getdelim ( lineptr, n, '\n', stream );
}

Size Matters – the Quest for the ideal MTU

One of the fundamental aspects of communication is the amount of information that can be communicated in one chunk before allowing others to communicate through the same channel; the maximum transmission unit (MTU).

In computer communications the MTU is the amount of data transportable across the lowest layer of the communication stack; Ethernet, FDDI, etc. From the base MTU each layer above in the networking stack breaks the data that it has to send into packets that will fit in the next layer down. For TCP/IP over Ethernet the base Ethernet specification includes an MTU of 1500 bytes (octets in the IETF RFC) not including the Ethernet header, and the TCP/IP layers expand their default packet size from 536 and 576 bytes respectively to 1460 and 1500. Effectively the MTU sets the packet size of higher protocols in the OSI model.

Networks are often unable to transport a maximum sized packet. In which case the packets are fragmented either by the sender and receiver or some device(s) in the network in between. There are many reasons for this, for example:

  • Change in base network media i.e. Ethernet to FDDI conversion.
  • Poor line quality.
  • Protocol tunnelling consuming part of the transmission with embedded headers etc.
  • Network device capacity.
  • Data transmission streams with conflicting efficient packet sizes i.e. FTP and VoIP.

Using the standard MTU dynamic shaping protocols normally gets around these transmission problems. If a device can’t handle a delivered MTU it responds with an ICMP message that informs the sender and advertises its acceptable MTU. However it doesn’t always work and can produce various effects from black hole routing to delayed responses and network application transmission jitter (very annoying in media streaming applications). Some of the reasons for this are hardened or unsophisticated systems that do not support dynamic MTU discovery, or IPsec bridges or other tunnels that are improperly configured.

To remedy issues it is best to replace the networking infrastructure that is causing the issue with components that are well behaved and can support dynamic MTU. Sometimes however this is not possible, so you have two options: configure a gateway to the device to clamp the MTU to an acceptable size (which you have already found out) or manually configure the sending devices to have an appropriate MTU for the affected route. If you have Windows devices then your best option is to use an existing Unix/Linux based gateway, or introduce a new one, to clamp for you.

Linux (specifically Red Hat/Fedora distributions) provides many facilities for the manipulation of MTU. The tools include the ‘/proc’ system components and various applications from the net-tools package as well as our old friend ifconfig. So the facilities that I generally use at the start are:

  • ifconfig – to view the MTU and other network information.
  • tcpdump – to look for MTU dynamic shaping ICMP responses.
  • ethtool – to view and set network card options such as speed and duplex (the MTU itself is set with ifconfig or ip).
  • route – for fixed MTU setting based on route.
  • ‘/proc/sys/net/ipv4/ip_no_pmtu_disc’ – for fixing the MTU for testing.

ifconfig is the first place to start, it will tell you without any fuss what your network card is set to. A simple ifconfig on the command line will give you something that looks like this:

[root@bluetop ipv4]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0B:DB:19:46:4D
          inet addr:10.0.0.176  Bcast:10.0.0.255  Mask:255.255.255.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:10
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:6178 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6178 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:309060 (301.8 KiB)  TX bytes:309060 (301.8 KiB)
wlan0     Link encap:Ethernet  HWaddr 00:11:50:FD:7B:65
          inet addr:192.168.2.98  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::211:50ff:fefd:7b65/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:4126 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2447 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:4505043 (4.2 MiB)  TX bytes:393742 (384.5 KiB)

From this you can see that I’ve three interfaces running and eth0 is not being used. However note that the MTU value of each interface is shown here. This is the MTU for the device but not the path. Note also that the MTU for a particular path may be much smaller. In Fedora this value can be set a number of ways but the two simplest are the GUI network tool, which is simple enough that it doesn’t need to be discussed here, and once again ifconfig:

[root@bluetop ipv4]# ifconfig lo
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16430  Metric:1
          RX packets:6417 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6417 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:336104 (328.2 KiB)  TX bytes:336104 (328.2 KiB)
[root@bluetop ipv4]# ifconfig lo mtu 16436
[root@bluetop ipv4]# ifconfig lo
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:6417 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6417 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:336104 (328.2 KiB)  TX bytes:336104 (328.2 KiB)

In the example above the loopback device has had its MTU set to 16436 from 16430. If we echo 1 to ‘/proc/sys/net/ipv4/ip_no_pmtu_disc’ then MTU dynamic shaping will be turned off and the machine will not send data packets less than the maximum if it has data to fill them. This is sometimes useful when trying to simulate devices on a network that do not have dynamic MTU built in or have it disabled.

ip is another tool that can be used for manipulating the MTU. But instead of altering the MTU for the whole device it can be targeted to a specific network. This is especially useful when a network device is shared between different networks. This tool can do all sorts of things so it’s a good place to start when considering things like tunnelling and complex routing. But here we’ll stick to setting MTUs on specific routes. For example let’s say that we’ve discovered that the network 10.0.0.0 really needs to have an MTU of 1400 but it’s behind our default gateway and it’s just not going to work out well for all of the other networks if we clamp to an MTU of 1400 for everyone. So here’s the answer:

[root@bluetop ipv4]# route -ee
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface    MSS   Window irtt
192.168.2.0     *               255.255.255.0   U     0      0        0 wlan0    0     0      0
default         192.168.2.1     0.0.0.0         UG    0      0        0 wlan0    0     0      0
[root@bluetop ipv4]# ip route add 10.0.0.0/24 via 192.168.2.1 mtu 1400
[root@bluetop ipv4]# ip route show
10.0.0.0/24 via 192.168.2.1 dev wlan0  mtu 1400
192.168.2.0/24 dev wlan0  proto kernel  scope link  src 192.168.2.98
default via 192.168.2.1 dev wlan0  proto static
[root@bluetop ipv4]# route -ee
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface    MSS   Window irtt
10.0.0.0        192.168.2.1     255.255.255.0   UG    0      0        0 wlan0    0     0      0
192.168.2.0     *               255.255.255.0   U     0      0        0 wlan0    0     0      0
default         192.168.2.1     0.0.0.0         UG    0      0        0 wlan0    0     0      0
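
In the listing above, route -ee still shows 0 in its MSS column for the new route; only ip route show reports the mtu. A small filter, sketched here with a made-up function name, pulls the per-route MTUs out of that output. It assumes the "mtu <n>" layout shown above, and also allows for a "lock" keyword between the two, which iproute2 is believed to print for locked MTUs.

```shell
#!/bin/sh
# routes_with_mtu: read `ip route show` style lines on stdin and print
# "destination mtu" for every route that carries an explicit mtu.
routes_with_mtu() {
  awk '{
    for (i = 1; i < NF; i++)
      if ($i == "mtu") {
        v = $(i + 1)
        if (v == "lock") v = $(i + 2)   # assumed form: "mtu lock 1400"
        print $1, v
      }
  }'
}

# Typical use on a live system:
#   ip route show | routes_with_mtu
```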

Neat? Well this can be surprisingly useful in many places. In fact the inverse of the previous example is uncommonly useful when dealing with an artificially clamped default path. Consider your ADSL connection, which probably uses PPPoE (Point-to-Point Protocol over Ethernet) tunnelling. Your local machine undoubtedly uses an Ethernet MTU of 1500, and the ADSL device probably does too on the connection to your network provider (which is invariably different from your ISP). However PPPoE consumes a little of that 1500 bytes to pack in its tunnelling protocol. This means that on large data transfers your machine is pumping out 1500-byte TCP/IP packets to your ADSL connection, and that device is quite likely fragmenting each of them into two so that it can fit its tunnelling headers in alongside. Effectively you are adding a small but significant overhead to the whole data transfer process, increasing the consumption of your ISP data cap and causing your ADSL device to work harder than required. Yes, you are in fact solely responsible for global warming. You crim!
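
To put numbers on "a little of that 1500 bytes": PPPoE adds 8 bytes to every frame (a 6-byte PPPoE header plus a 2-byte PPP protocol field), so the largest IP packet that passes through the tunnel unfragmented is 1492 bytes. The arithmetic below is only a worked example; the function names are invented for it, and the 40-byte figure assumes plain IPv4 and TCP headers with no options.

```shell
#!/bin/sh
# Worked arithmetic for the PPPoE case.
ETH_MTU=1500
PPPOE_OVERHEAD=8    # 6-byte PPPoE header + 2-byte PPP protocol field
IP_TCP_HEADERS=40   # 20-byte IPv4 header + 20-byte TCP header, no options

tunnel_mtu() { echo $(( $1 - $PPPOE_OVERHEAD )); }
tcp_mss()    { echo $(( $1 - $IP_TCP_HEADERS )); }

mtu=$(tunnel_mtu "$ETH_MTU")
echo "largest packet that fits the tunnel: $mtu"               # 1492
echo "matching TCP payload per packet:     $(tcp_mss "$mtu")"  # 1452
```

Any route MTU at or below 1492 avoids the split at the tunnel; a lower value simply leaves more headroom.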

Often in professional networks where PPPoE tunnels are used you will see various countermeasures to avoid the side effects of the tunnelling overhead. One way is to configure jumbo (oversized MTU) packets on the tunnelling devices. However that really requires you to configure both ends, and it’s not really normal for you to be able to configure your network supplier’s routers. At least not legally. However you can clamp the default routes on your devices and set your local LAN MTU to, well, whatever you want. That way you don’t affect your local LAN performance and you can improve your gateway throughput.

Naturally this is reasonably dangerous so I’m just going to give you the command and let you break your network yourself. I guarantee that you will stuff this up so have a play when you don’t mind spending the time on it; and by the way this sort of stupidity can raise the temperature on your physical devices: so you can actually physically damage your equipment or alter its life expectancy this way. BE WARNED you own ALL of the risk on this one. But do have fun!

ip route add default via 10.0.0.1 mtu 1470

Rewriting the Wheel: bin2hex – hex2bin

Yep everybody does it; repetition. In the grand old days of old we all rebuilt the wheel, leading to the establishment of patterns. For me patterns often get me to my goals faster than using Google. Using Google to find simple tools normally means understanding the search engine, browsing the results and evaluating each alternative in turn. Invariably this means slightly altering someone else’s code, compromising objectives or installing a bulky tool that is bloated by a thousand functions you don’t need or want. Naturally this last list is not exhaustive; just consider the dependencies that an external solution often requires. If it’s simple and quick it’s often better to reinvent your wheel.

A perennial favourite of mine is base translation; specifically hexadecimal to binary and binary to hexadecimal. This is so simple and so useful it’s the subject of thousands of Google results. Yet the useful bits are rarely on the first page of results. So I coded it in about 30 minutes, and debugged it in about 30 minutes, because I was tired at the time. Interestingly this is not the first time that I’ve written this code. My earliest recollection of having written it was about 1986, over 22 years ago. Of course it was coded in BASIC at the time on an Atari; however it wasn’t too long before I started translating it into C.

Patterns have long been a formal word in Computer Science. In 1996 a very famous book, the name of which escapes me now and I’m too lazy to get it off my shelf, was dedicated to the development of patterns in software development. Interestingly the way that the patterns were described was almost a deterrent to the patterns themselves. Being of an academic nature, the usefulness of the book was somewhat tarnished by its lack of pragmatic style and appeal to the great unwashed. It’s a fantastic book though, which every software engineer should read.

For me patterns are best expressed in real world examples and in the languages that I understand. It makes them instantly available to me and much more likely to improve my work. Yes they do need some formal definition most of the time. However if it’s so simple as to be obvious then don’t clutter the example with an explanation other than to say what you used it for. If it’s really useful enough to be made into a pattern it’ll resurface again in the form of ‘now what did I do with that code I wrote 22 years ago that solved this problem?’

Well here it is. I used it to translate a hexadecimal network packet lifted from an application log back into binary so that it could be replayed over a network for testing the application time and again. Note the use of redirected standard input and output; it’s clean. No Swiss army knife in this one; what would I need a spoon on it for anyway?

hex2bin.c

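#include <stdio.h>
#include <stdlib.h>

int main( int argc, char ** argv ) {

  int h = 0;   /* int rather than char, so EOF can be told apart from data */
  FILE * fi = stdin;
  FILE * fo = stdout;

  (void) argv;

  if ( argc > 1 ) {

    fprintf( stderr, "Usage: redirect input and output to stdin and stdout respectively.\n" );
    return 1;
  }

  h = getc( fi );

  while ( h != EOF ) {

    int b = 0;   /* a character that is not a hex digit counts as zero */

    /* High nibble. The unsigned cast makes anything below the base character
       wrap round to a huge value, so one comparison checks both ends of the range. */
    if ( (unsigned)( h - '0' ) < 10u ) b = h - '0';
    else if ( (unsigned)( h - 'a' ) < (unsigned)( 'g' - 'a' ) ) b = ( h - 'a' ) + 10;
    else if ( (unsigned)( h - 'A' ) < (unsigned)( 'G' - 'A' ) ) b = ( h - 'A' ) + 10;

    b = b << 4;

    /* Low nibble; an unpaired trailing digit is dropped. */
    h = getc( fi );
    if ( h == EOF ) return 0;

    if ( (unsigned)( h - '0' ) < 10u ) b += h - '0';
    else if ( (unsigned)( h - 'a' ) < (unsigned)( 'g' - 'a' ) ) b += ( h - 'a' ) + 10;
    else if ( (unsigned)( h - 'A' ) < (unsigned)( 'G' - 'A' ) ) b += ( h - 'A' ) + 10;

    if ( putc( b, fo ) == EOF ) return 0;

    h = getc( fi );
  }

  return 0;
}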

Of course in this case it was just too tempting to write its sister as well: binary to hex. If you combine these two tools on the command line with netcat and some other script favourites you can make quite a useful test tool.

bin2hex.c

#include <stdio.h>
#include <stdlib.h>

int main( int argc, char ** argv ) {

  int b = 0;   /* int rather than char, so EOF can be told apart from a 0xff byte */
  FILE * fi = stdin;
  FILE * fo = stdout;

  (void) argv;

  if ( argc > 1 ) {

    fprintf( stderr, "Usage: redirect input and output to stdin and stdout respectively.\n" );
    return 1;
  }

  b = getc( fi );

  while ( b != EOF ) {

    int h = 0;

    /* High nibble: 0-9 map to '0'-'9', 10-15 map to 'a'-'f'. */
    h = ( b >> 4 ) & 0x0f;
    if ( h < 10 ) h = h + '0';
    else if ( h < 16 ) h = ( h - 10 ) + 'a';
    if ( putc( h, fo ) == EOF ) return 0;

    /* Low nibble, same mapping. */
    h = b & 0x0f;
    if ( h < 10 ) h = h + '0';
    else if ( h < 16 ) h = ( h - 10 ) + 'a';
    if ( putc( h, fo ) == EOF ) return 0;

    b = getc( fi );
  }

  return 0;
}
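
To give a feel for how the two tools chain together on the command line, here is a rough sketch. So that it runs without compiling anything, it uses shell stand-ins (`bin2hex_sh` and `hex2bin_sh`, names invented here) built from od, tr, fold and printf; they are not the C programs above, just something with the same stdin-to-stdout shape, and they assume clean hex input. The file name, host and port in the netcat line are hypothetical.

```shell
#!/bin/sh
# Shell stand-ins for the two C tools, for trying the pipeline shape only.

bin2hex_sh() {
  # od prints each input byte as two hex digits; tr strips the spacing.
  od -An -v -tx1 | tr -d ' \n'
}

hex2bin_sh() {
  # Split the digit stream into pairs and emit each pair as one byte.
  tr -d ' \n' | fold -w2 | while read -r pair || [ -n "$pair" ]; do
    case $pair in
      ??) printf "\\$(printf '%03o' "0x$pair")" ;;  # complete pair -> one byte
      *)  ;;                                         # unpaired trailing digit is dropped
    esac
  done
}

# Round trip: prints "Hello"
printf 'Hello' | bin2hex_sh | hex2bin_sh
echo

# Replaying a captured packet (packet.hex, testhost and 9000 are hypothetical):
#   hex2bin_sh < packet.hex | nc testhost 9000
```

With the compiled programs in place, ./hex2bin and ./bin2hex drop into the same positions in the pipeline.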

Lastly there is a point that is quite good to note about such simple patterns: they make a really good basis for recruitment tests. You can always tell how good an organisation is at recruitment by the quality of this process. If they complain about spelling it generally means that the organisation is focussed on minutiae, and micro-management is probably the order of the day. Normally that’s a warning sign. So I always include spelling mistakes in my submissions. However if you are asked about the various ways of implementing the illustrated code (once of course you’ve rewritten it for them from requirements!), along with integration strategies, impact on speed and memory usage, testing, and comparison against the requirements provided, you’re probably on to a good thing.

So just for fun let’s look at one aspect of this. If you go and do that dreaded Google search and sift through the cruft out there you will find a very specific approach often used, quite different to the one I’ve used above, to determine the binary or hex value in translation; it normally involves using a switch statement like this:

char h;
int b;

switch ( h ) {
  case '0'  :  b = 0;
               break;
  case '1'  :  b = 1;
               break;
  case '2'  :  b = 2;
               break;
  case '3'  :  b = 3;
               break;
  case '4'  :  b = 4;
               break;
  case '5'  :  b = 5;
               break;
  case '6'  :  b = 6;
               break;
  case '7'  :  b = 7;
               break;
  case '8'  :  b = 8;
               break;
  case '9'  :  b = 9;
               break;
  case 'A'  :
  case 'a'  :  b = 10;
               break;
  case 'B'  :
  case 'b'  :  b = 11;
               break;
  case 'C'  :
  case 'c'  :  b = 12;
               break;
  case 'D'  :
  case 'd'  :  b = 13;
               break;
  case 'E'  :
  case 'e'  :  b = 14;
               break;
  case 'F'  :
  case 'f'  :  b = 15;
               break;
  default   :  return -1;
}

All this seems reasonable. Yes it’s going to be bigger than my code, but it’s going to be quicker, right? Actually that’s not necessarily right. You see the code is larger, so it’s going to take a larger number of fetches from memory. My code is small enough that its working values stay in the processor’s registers and its instructions take up very little of the instruction cache, which the switch example just given needs rather more of. Next, unless the compiler turns the switch into a jump table, it will on average make around twice as many comparisons to jump into the right case as my version needs operations. But this is also compiler and processor dependent. Coding simply in this case makes the optimiser’s job easier, but an optimiser normally doesn’t have the nous to translate the code to another approach; it just applies obvious short cuts to the code already presented. Naturally this could lead into discussions about optimisers and their abilities and limitations.

In fact there are a large number of optimisations that could be made to my code that would improve its performance too. But I’ll stop here because I could write a book on this one piece of code. So you see a particularly innocuous, been-done-before, simple piece of code can be interesting and be used to draw out the depth of a person’s knowledge. Plus there’s a certain amount of satisfaction in being able to go back and polish old code; it’s like visiting an old friend.


Random Passwords

Most of the time the best way of generating passwords is by using ‘passphrases’ which you can remember. But in this day and age of having a unique password for everything this approach is not always practical. For most users the temptation to generate a really secure password and then use it everywhere is just too much. It’s not necessarily the remembering aspect of the equation either; it can take a lot just to generate a good secure password in the first place, especially when each site that you visit or application that you use has similar but crucially differing requirements. For me the best way to store multiple passwords is in a secured store using a regularly rolled key, in my case PGP encrypted files.

To easily generate a random password in Linux all you need to do is:

  dd if=/dev/urandom count=1 2> /dev/null | uuencode -m - \
  | sed -ne 2p | cut -c-16

Note that this does rely on you having /dev/urandom, uuencode, sed and cut available on your system. In my case it generated the following output:

  [user@blackbox ~]$ dd if=/dev/urandom count=1 2> /dev/null \
  | uuencode -m - | sed -ne 2p | cut -c-16
  fQ6XFnhsNWbeMtph

The password generated in this case is of course ‘fQ6XFnhsNWbeMtph’, a pain to remember for sure but very secure. A slight alteration allows for a number of passwords to be generated:

  for ((n=0;n<10;n++)); do head -c16 /dev/urandom | uuencode -m - \
  | sed -ne 2p | cut -c-8; done

Lists of passwords are good in the odd case where the generated password may not contain quite the flavour of included characters that you were looking for. They’re often good too when you need to roll a bunch of passwords at once or if you’re setting up a number of systems at the same time. Note the alteration in password length as set by the cut tool. There’s also a change to using head to read the random data too. Here’s an example of a set of passwords generated by the command above:

  [user@blackbox ~]$ for ((n=0;n<10;n++)); do head -c16 /dev/urandom \
  | uuencode -m -| sed -ne 2p | cut -c-8; done
  x41jVgKc
  vC+IxVLU
  xkzOfyVu
  5WwEWEat
  Ymw4C52m
  BQ5Gtcjj
  ByRqTEY
  CO79z599
  VJlcIzU7
  3mJ1F3b8

Apart from this there are a number of other things that you can do with the passwords, but these have mostly to do with the characters generated. So for example if you wanted to generate 16-digit PINs you could use some variant of the following:

  head -c16 /dev/urandom | od -t u8 | awk '{ print $2 }' | cut -c-16

Naturally you could add a grep statement to the earlier commands to do something similar by capturing only numeric characters. But such a method is statistically inefficient due to the small number of digits in the earlier streams. Other cases, for example removing the non-alphanumeric characters, could be better suited to grep filtering. Quite naturally I’ve left this as an exercise for the reader as I don’t need that right now. Here’s an example execution of the PIN generation command:
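
One way to tackle that exercise, sketched here with tr rather than grep (a swapped-in technique, not the commands above): tr -dc throws away every byte that is not in the wanted set, and head stops once enough characters have survived. Because every surviving byte is equally likely, this sidesteps the shortage of digits in the base64 stream. The function name `gen_password` and its defaults are invented for the example, and it assumes a head that supports -c.

```shell
#!/bin/sh
# gen_password LENGTH [CHARSET]: print one random string built only from CHARSET.
gen_password() {
  length=${1:-16}
  charset=${2:-A-Za-z0-9}
  # tr -dc deletes every byte NOT in the set; head stops after LENGTH survivors.
  LC_ALL=C tr -dc "$charset" < /dev/urandom | head -c "$length"
  echo
}

gen_password 16        # 16 letters and digits
gen_password 16 0-9    # a 16-digit PIN
```

The second argument is handed straight to tr, so any set tr understands will do.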

  [user@blackbox ~]$ head -c16 /dev/urandom | od -t u8 \
  | awk '{ print $2 }' | cut -c-16
  9815394141245590

One significant observation is that there are many statements to be found that are critical of using a random number generator to generate password data. Normally the arguments are based around the fact that automata are easily replicated and hence the passwords generated may be weak. The flaw in this argument is that the methods shown here are not intended to be infallible, just accountably strong, and in general passwords are always fallible. It’s just a question of educated guessing. One last word, however: always be sure that a sample of your passwords is vetted against some standard of strength, and protect your generation mechanism.
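
What "some standard of strength" means is up to you. Purely as an illustration, the check below applies a made-up policy (at least 12 characters, with at least one upper case letter, one lower case letter and one digit) to a couple of the passwords generated earlier. The policy and the function name `meets_policy` are assumptions for the example, not a recommendation.

```shell
#!/bin/sh
# meets_policy PASSWORD: succeed only if PASSWORD satisfies the example policy.
meets_policy() {
  pw=$1
  [ "${#pw}" -ge 12 ] || return 1                       # hypothetical minimum length
  case $pw in *[[:upper:]]*) ;; *) return 1 ;; esac     # needs an upper case letter
  case $pw in *[[:lower:]]*) ;; *) return 1 ;; esac     # needs a lower case letter
  case $pw in *[[:digit:]]*) ;; *) return 1 ;; esac     # needs a digit
  return 0
}

# Vet a sample of generated passwords.
for candidate in fQ6XFnhsNWbeMtph x41jVgKc; do
  if meets_policy "$candidate"; then
    echo "$candidate ok"
  else
    echo "$candidate rejected"
  fi
done
```

Under this policy the 16-character example passes, while the 8-character one is rejected on length alone.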

It’s a dangerous world out there; take care of your passwords with good policy and procedure.


Squid Grep

Just a wee note to remind myself of the best way to view your squid.conf file if you’re in a hurry:

cat squid.conf | grep -v '^#' | grep -v '^$'

It’s interesting to note that many people recommend stripping the comments from the active squid.conf with a command similar to this:

cat /etc/squid/squid.conf | tee /etc/squid/squid.conf.commented \
 | grep -v '^#' | grep -v '^$' > /etc/squid/squid.conf.nocomment

Naturally you would normally not have ‘.nocomment’ on the active file.  You need to be root to do this too.
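
The same idea can be wrapped up as a pair of small functions, sketched below. The names `active_lines` and `strip_config` are invented here. The single grep -E pattern goes slightly further than the commands above in that it also drops indented comments and whitespace-only lines, which is an assumption about what you want. Nothing is overwritten in place.

```shell
#!/bin/sh
# active_lines FILE: print FILE without comment lines and blank lines.
active_lines() {
  grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# strip_config FILE: keep a copy of the commented original as FILE.commented,
# then write the stripped version to FILE.nocomment.
strip_config() {
  cp -p "$1" "$1.commented" && active_lines "$1" > "$1.nocomment"
}

# Typical use, as root:
#   active_lines /etc/squid/squid.conf
#   strip_config /etc/squid/squid.conf
```

Note that grep exits non-zero when a file holds nothing but comments, so `strip_config` reports failure in that case.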


A Few Ajax Links

AJAX is now a well-established technology. It has been around for a long time, but only recently (2005) was the term ‘AJAX’ coined to mean Asynchronous JavaScript And XML and to unify the previously disjointed technologies it represents. Essentially it is an amalgamation of technologies to provide good user interface tools for Web browser applications.

Now of course there are a myriad of these toolkits, most born soon after the publication of the AJAX article in 2005, and most not really suitable IMHO for real applications. But they are all useful to look at. For me the most interesting ones are the following:

There is a pretty good AJAX web site here if you need a place to start looking for AJAX resources.


Copyright © 1996-2010 Code Snips. All rights reserved.