The handling of numbers and the continuous advancements in software engineering


When I started programming in 1986, storage wasn’t a problem for me. I had a computer with a huge amount of memory – 1 MB – and two 720 KB floppy-disk drives. The real problem was the programming languages. The BASIC interpreter that shipped with my Atari ST was buggy, and the Megamax C compiler would have cost a little over 1,000 D-Mark. So I bought an assembler for a tenth of that price.

I never got into assembly language deeply, but at least I learned a few things about the representation of numbers.

For quite a while it was pretty clear that a number was a number was a number, and that the true nature of this thing wouldn’t change. Of course assembly language was somewhat demanding, but soon I shifted to Forth-83, which was much more convenient. There was no question about numbers: a number was a number, and a character was an interpretation of a byte, but it was still a number.

This situation lasted for many years. Later I installed Minix on that Atari ST, which finally provided me with a C compiler. When I got my first UNIX workstation (a NeXTstation) in 1991, it came with the GNU C compiler. And there I had my first contact with sockets, when I dug into an example that connected to Mathematica through sockets.

It took a long time until I really started messing around with the related problems. During that time I programmed because it was fun: I had fun learning these things and making them work. In a way, I still have fun with exactly this.

For the time being I didn’t do a lot with sockets. I moved into databases, and to me it was clear that I would declare an integer value as an integer, and that I wouldn’t use a float to represent money values and things like that. Well, in the early nineties it wasn’t clear to everybody that money needs a representation and handling different from floating-point values.

At this time I worked on a single-CPU Motorola MVME188, an 88k SVR3.2 UNIX machine with 32 MB RAM and 6 GB disk space. The whole department accessed this server through X terminals. I thought twice before I wasted a byte; I even computed the amount of data I shoveled through the buses of this computer system. Hence I never had the idea to store a number as a character string. Handling a 32-bit number as a number consumed 32 bits – handling it as a character string would have consumed, depending on how generous you are, up to 88 bits.

The storage overhead of the string representation varies with the integer size: a factor of 3 for a short int, down to about 2.62 for an unsigned 64-bit integer.

But this is not the only bill to pay when choosing a character representation for numbers. You’ll end up continuously converting numbers back and forth. We use numbers to compute with them, so whenever we fetch numbers from storage, we not only need to squeeze them through the cables, we also need to convert them before we can perform any computations. And this costs a lot of time! Time I didn’t have with my small amounts of data on that small UNIX server, and time you don’t have with your huge amounts of data on your somewhat bigger servers.

But many developers had not only this convenience problem. While I was converting between host and network byte order and started to use XDR, many more talented developers simply transferred their numbers as character strings, thus saving themselves from considering byte-order issues.

Well, that was a really great advancement: now they not only needed “a little” more space and the computing resources for converting the stuff; often enough they also needed more IP packets to transport their data over the network, and this caused more workload for the operating systems, because the payload of a single standard-MTU packet no longer was enough. So the operating system now had more work to untangle the numerous IP packets and to reassemble a consistent piece of data from the chunks it received in its network stack.

I once saw a piece of software (it was in the telecoms branch) where they even went so far as to do many of the necessary computations with a “SELECT … FROM DUAL” in their Oracle database, while coding the rest in C.

You don’t need to go far to meet programmers doing their level best to make their computers slow.

At the same time, I developed a piece of software that squeezed several data records into one single IP packet. I simply built an array that kept the data in its natural binary form and copied a 1024-byte chunk of it into one transfer buffer. This was horribly fast compared to what otherwise happened in this telecoms company.

But this was only the very beginning. Already at that time I complained about the waste of computing resources, but the worst was still to come: SOA.

While I was creating several services in ONC RPC, along with libraries that hid the networking details from the application programmers, XML and SOA were mounting a big assault on computing resources. Nobody was interested in fast and efficient implementations – they just bought bigger computers, faster networks and bigger storage installations.

With XML and SOA, a number was no longer merely represented as a character string; now a complete XML file was used to store a number, e.g. a patient’s weight or temperature. Now you not only needed a strtol() conversion to make it a number, or a sprintf() to make the number a character string. No, now you needed an XSLT transformation to get the number out of that XML file, and only then could you convert it into an integer to be able to do some computations.

More talented software engineers just store the XML file in the database as is, which they deem much more efficient.