Monthly Archives: December 2006

How null breaks polymorphism: or the problem with null: part 2

If you haven’t read part one you really need to do that first. It’s here.

I got a lot of interesting feedback on part 1 of this topic and found I needed to further explain myself in certain areas.

Two initial responses to issues brought to me from part 1:

1. Typed languages

In languages that aren’t typed at all, null is no more a problem than any other reassignment as they never care about types. They all can result in similar issues. Which in my mind is in fact a problem. I am definitely a huge proponent compile time type checking. This is why perhaps I seem so hard on it in part 1. Compile time checks should be able to help out.

2. When functions might not return a value

There are plenty of times when a function may or may not return a value. If this is the case the return type should reflect that. Having null used as a “magic number” is not the ideal solution in my mind. I’d much rather see a return type that forces the issue to be very clear. It may seem non-standard but a type masquerading as a collection that potentially has one item seems ideal. Most any programmer will realize he needs to check the size of a collection before trying to access its first element. This is no more cumbersome than checking for null – but seems logically intuitive. This can be easily optimized so that there is no performance hit in a language like C++ using templates. A container is intuitive as it can be potentially empty – this very straightforward.

Making the problem even more clear:

I often find it easier to express programming concepts in real world terms. This helps to reduce them to the absurd when appropriate. This doesn’t always work but a lot of times can help look at the problem from a different perspective.

Let’s take the concept of typical null check behavior and attempt to map it to a real world procedure.

We are going to explore John teaching Norm to drive a car. The following may sound a little familiar.

John: “First you are going to need to make sure you have a car. Do you have a car? If you don’t have a car just return to what you were doing. I won’t be able to teach you to drive a car.

“Norm: “I have a car.”

John: “Let’s call that car normsCar. Check to see if your car has a door. We’ll call it normsCarDoor. Does normsCar have a normsCarDoor?”

Norm: “Yes”

John: “Great, but if you don’t just skip the rest of this – I won’t be able to tell you how to drive a car.”

Norm: “I have a car door.”

John: “Once you open the door check and see if it has a seat we’ll call normsComfySeat. If you don’t have a seat skip the rest of this – I won’t be able to tell you how to drive a car.”

Norm: “I have a seat”

John: “Things have changed scope a bit – can you check whether you have a door again for me – we called it normsCarDoor?”………..

I think you can see where this is going. Classes have contracts. It should be reasonable to talk about a car that always has a door and a seat without having to neurotically check at all times.

Unlike the real world, when we code we make new things up all the time. So even though seats and doors can be reasonably inferred on cars the things that a UserAgentControl may or may not have probably won’t be obvious. Does the UserAgentControl have a ThrottleManager all the time? If I have a non-nullable class type I can check by looking at the class. If I get it wrong for some reason maybe I can have the compiler issue a warning. Maybe I can be forced to “conjugate” it with a special syntax every time I use it to help me remember (“.” vs. “->” in C++). Or a naming convention.

Why is this such an overarching problem?

It may seem like griping over a simple check for a things existence. Except this check can conceivably apply to each and every object I can have (or don’t have). This makes the issue enormous. A lot of the time programmers put in checks for null when it seems appropriate or when unit testing throws an exception at run-time. This sounds a whole lot like something statically typed languages are supposed to help us protect ourselves against. The unchecked null is by far the most common runtime error and the “overchecked” null one of the most prevalent unintentional code obfuscation techniques.

Solutions that don’t involve changing existing languages:

  1. If your language supports the concept out of the box (C++, Spec#, F#, OCaml, Nice, etc.) or it can be built with templates or other mechanisms use not-nullable types. Use these types whenever possible. If you’re language doesn’t support it then use number 2 alone.
  2. Create a simple naming convention (don’t go hungarian on me) to discern what is nullable and what isn’t. This is a fundamental concept and it should be obvious every time a value, instance or object is accessed. Use this to compensate for your languages deficiency in a similar manner to prefixing private variables in languages that don’t have mechanisms for hiding access to variables and member functions.
  3. Check for null as early as possible in the food chain and prefer to call methods that are granular and don’t have to check themselves. This means make every parameter passed to a routine get used. If a parameter is optional write a new routine. Make the types passed to these routines be not-nullable if the language supports it.
  4. If it makes sense prefer empty collections to collections containing nothing but null for obvious reasons.



There is no perfect solution. But so many times in code I see null being checked or not checked and am left wondering. Is the check gratuitous? Is this a runtime error waiting in the wings? And unless it’s commented it or I check it’s usage – I get that typically uneasy feeling. Bound to catch it at runtime with unit testing. This is not the answer I want to hear.

I think a few simple practices and conventions could get this off our plate so we can get back to the problem at hand. Solving actual problems.


Some other articles that beat me to the punch on some of these concepts:

  1. Let’s Reconsider That Offensive Coding from Michael Feathers.
  2. Null:The Runtime Error Generator by Blaine Buxton

Empty containers solve this problem nicely for return types. Also as many readers have mentioned, monads are extremely useful in solving this problem as well. But once again, stop propagating unneeded nulls and similar “safer null-like” structures as soon as possible. As Michael Feathers states – it’s just offensive code.

24 years of game programming: thrills, chills and spills: part 1

Originally I was thinking about calling this article 24 years of game programming: observations and lessons learned. Somehow the title seemed much too stuffy for an entry about something I’ve loved so much over the years.

In this three part series I trace my 24 years of game programming. From machine language in 1982, Pascal in 1987, C in 1992, C++ in 1995, using “high level API’s” like OpenGL in 97, catching the pattern craze in ’98, the component mantra in 99, coding for 3 consoles at once in 2000, moving into Java, C#, and javascript on the top of C++ in the 21st century. And more!

Let’s go back to 1982:

This is where it all began for me. I had received my first computer a Vic-20 and dreamed of everything I could do with it. William Shatner had advertised it as a gaming system and a computer too (and we all know he has integrity from the whole priceline fiasco). I remember writing my first basic program for it from the thin manual that came with it. An animated bird that used character graphics to flap it’s wings and fly around the screen. I was mesmerized!

I quickly learned the machine inside and out, bought the Vic-20 Programmer’s Reference Guide, hung the schematic of the machine on my wall and went to work. Very quickly I realized that a 1 mhz machine with 5k of ram was going to take some heavy tricks to create games for. So I learned 6502 machine language to really pull the stops out. An 8 bit processor with an X, Y and Accumulator that could add and subtract any 8 bit value! X and Y could only increment and decrement. And of course when the numbers were larger than 255 a little juggling was required.

Very soon the Commodore 64 came out and I immediately upgraded. Programmable sprites, a SID chip for sound, 64k of ram (with 16k bank switched under rom that held Bill’s basic and the “OS”). Now we were playing with power. I wrote more than a few games for the Commodore 64 as a hobbiest in straight machine code. An Olympic games type clone, a chess game that worked with the 300 baud modem, a utility to dump on screen graphics to the printer, some arcade variations, a sprite editor.
These were the truly brass tacks years. 6502 machine language mixed with little bits of bootstrap basic was hardcore by any standards. Jumping to subroutines was something you couldn’t take lightly and the “hardware” stack was only 512 bytes long. It was strictly for keeping the return address from subroutine calls. I was firmly convinced games required maximum power and could only be written in machine language. I was pretty convinced this fact would never change.

I was of course wrong.

Fast forward to 1987:

The IBM PC/XT was starting to show promise and a friend introduced me to Turbo Pascal on it. 4.77 megahertz! 640k of ram. Sure my machine was an XT and only capable of CGA graphics (and character mode graphics – not even programmable at that). But Turbo Pascal was my first true compiler. It was actually possible to write a significant chunk of many games in high level code. Sure low-level interrupt stuff was often in assembly. We spent many, many all nighters coding games and often ignoring our girlfriends. But they couldn’t code:) I slowly got used to the fact that passing variables on the stack was a useful enough concept that it was worth the performance hit it sometimes had on games. We are talking about functions passing parameters on the stack! But this was “non-trivial” and took cycles. Things were still so tight that every cycle mattered.

Fast forward to 1992:

I got my first 386 20 mhz clone with 4 meg of memory and an 80 meg hard drive. I quickly bought a soundblaster card for it. Before long I was coding games in MCGA 320×200 256 color graphics mode. I learned everything I could about palette manipulation and palette color theory. I was coding in C by this time as well 8086 assembly of course. You could actually inline your assembly right there with the C code. It was awesome. I wrote a sprite compiler that would literally turn all the data into straight machine code so it didn’t have to loop and check for key colors. It used a triple buffer, dirty rect system to update the screen as efficiently as possible.

I learned Autodesk animator inside and out and spent hundreds of hours working on a sprite editor with bucket fills, circles, lines and palette manipulation functions as well as animation capability. I was in heaven. I recruited some good friends to do graphics for me and created a number of demos to show off the library.

I still hand-optimized everything and didn’t trust the compilers at all. A C compiler may not use a bit shift for an integer multiplication. The horror!

I made sure and learned all I could about fixed point numbers. At this point floats were still done in software and slower than hell. Since C didn’t support operator overloading the code looked pretty clunky. But it didn’t matter. 320×200 256 color mode was fast – really fast. And I was learning all I could from Abrash books about mode 13 hex that allowed the full 256k of graphics memory to be utilized. At the time the PC was still back switching 64k into it’s memory map. And don’t even get me started on 80×86 segment offset addressing!

I ended up releasing all my code as shareware and learning the ins and outs of shareware distribution mechanisms. I remember printing labels for 5 1/4 and 3 1/2 disks and mailing them off to shareware distribution hubs. This was before the internet took off of course and everything was BBS based.

1995 was right around the corner and soon I would find myself working at Legend Entertainment on Star Control 3 – my first commercial game product.

Part 2 cracks the lid open on my game industry years.

How null breaks polymorphism; or the problem with null: part 1

Preface: After talking to a number of people I realize that somehow I managed to misrepresent myself with respect to type systems in this article. This article is an attack on null, and to point out that null is still problematic in many (if not all) strongly typed languages. Many who prefer strong types and static types feel they are more immune to certain runtime type related inappropriate behaviors. I feel the ability to use null in most languages in place of an object breaks polymorphism in the most extreme possible way. I am in no way implying that dynamic or weak type systems are better at handling these issues. As for me – I prefer languages with stronger compile time type checking.

This is a difficult concept for a lot of died in the wool strong statically typed OO programmers to fully digest and accept. There is an immense sense of pride in the strong statically typed community about the fact that unlike untyped languages, strong statically typed languages protect them from run-time errors related to type mismatches and unavailable methods. Unless you do a dynamic type cast (frowned upon heavily) you should be safe from at least this broad class of error. But they are wrong. Type mismatches and unavailable methods occur all the time in strong statically typed languages. And it is a common form on runtime surprise. What causes this common problem: the null which can be used with any type yet breaks polymorphism with every single one.

Unlike types in loosely typed languages the null is guaranteed not work polymorphicly thus requiring a specific type check. Did I say type check? But I have no dynamic casts, I’m following all the rules. Why should I have any type checks? Checking for null is a type check. It’s the mother of all type checks. Instead of having code littered with conditional checks for types and branches based on those types (an OO worst practice) you have code littered with conditional checks for null having branches based on whether it is null or not.

Now granted, life in a world without nulls isn’t easy and I use null often myself. It’s too tempting to use this magic value instead of writing code more appropriately. Some will mention the null object design pattern that “does nothing” with pride as a solution to this problem. These are in fact polymorphic. The only issue is that null objects only work in special circumstances. If you really don’t have a thing you shouldn’t be pretending you do and having it do nothing. You should have a separate chain of logic that doesn’t use the thing you don’t have.

I have talked to a number of coders that think that removing null from a majority of their code would be difficult to impossible. A difficult to grok kind of problem perhaps but intractable, no. Consider the following function:

int DoSomethingSpecific( int x, int y, int z);

Now I will asked the magic question. Do you check z for null in case you don’t have it? (or x or y for that matter). In C++ that isn’t even possible because it’s passed by value. If an appropriate default exists for z you may set z to that default before it is called. However plenty of times that concept isn’t the one you are looking for. What do you do? You simply write another function that doesn’t take z.

int DoSomethingSpecific(int x, int y);

Now let’s use generic objects:

int DoSomethingSpecific(object x, object y, object z);

int DoSomethingSpecific(object x, object y);

Using this approach doesn’t break polymorphism. You only call the appropriate function when you actual have the parameters in question.

Of course this brings us back to a more fundamental problem. The concept of null is so burned in to most OO languages that visual inspection of code reveals that most any object should be nullable and thus checked for null. C++ has a way around this with references that can not be null or checked for null (yes I know many compilers will let you assign null but you make clear your intent in using a reference:it should not be null). The C++ reference being used this way is at best an afterthought in the language. These references can’t be reassigned and thus are limited to incoming parameters on function calls in many cases.

Even if you create a class which prohibits non-null assignment casual readers of your code in many languages will miss this fact and do gratuitous checks for null anyways; defeating much of the purpose. The key is supporting syntax that makes clear the fact that an object can not be null. But that discussion is for another day.

In part two of this article I will explain many of the misconceptions and supposedly intractable issues related to removing null. It’s not as hard as you might at first think. I will also further explore the syntax issue, or without language support at least a possible naming convention.

The D Programming Language

The D Programming Language – a Pleasant Surprise

From the D website,

“D is a systems programming language. Its focus is on combining the power and high performance of C and C++ with the programmer productivity of modern languages like Ruby and Python. Special attention is given to the needs of quality assurance, documentation, management, portability and reliability.

D is statically typed, and compiles direct to native code. It’s multi-paradigm: supporting imperative, object-oriented, and template meta-programming styles. It’s a member of the C syntax family, and its look and feel is very close to C++’s.”

This language first piqued my interest about 2 years ago. At the time I saw it as a great idea but wanted to wait to see if it would catch on at all. Since then I have been hearing about it from time to time and finally decided it was time to take a close look. And I’m really glad I did.

This article mainly compares C++ to D from the perspective of what C++ is missing that D has. If you haven’t read my last article about what I love about C++, you’ll probably want to do that first.

Favorite additions C++ is missing:

  1. Garbage Collection – this is almost always faster and more efficient use of runtime cycles and the developer’s development cycles; a total no-brainer in most cases. You can do manual memory allocation when you think you know better.
  2. Nested functions – The nested function has full access to the local variables of the function it is nested in. Being able to break up large functions in the scope of the calling function makes things sensible and clean.
  3. Inner (adaptor) classes – Nested classes support access to the variable scope of the calling class. Use a static nested class to prohibit access.
  4. Properties – Being able to use the syntactic sugar of values with underlying functions makes good sense. Why shouldn’t value be able to be abstracted out and have underlying functions?

5. foreach – cycle through all the elements and let the compiler decide the most efficient way.

  1. Implicit Type Inference – this is coming to C# soon as well. It’s nice not to have to specify the type when it is already specified by what it it’s being set equal to.
  • Strong typedefs – This one is especially nice. You want to create a typedef that isn’t considered the same as the original as function signatures are concerned. C++ forces you to create a class to do this thing that should be simple.
  • Contract Programming – puts your in and out contract constraints into the code in a clean, consistent way. The compiler can now better optimize and inherit contract constraints.

  • Unit testing – makes it simple to include unit tests directly within each class to validate it.

  • Static construction order – Being able to explicitly define in what order static objects are created in shouldn’t be too much to ask. I’ve seen more than a few projects bit by this in C++. In C++ you have no guarantee when statics will be initialized. Interdependencies in C++ can leave you shaking your head as you unravel the evil.

  • Guaranteed initialization – you have to explicitly say that you want no initialization for performance reasons. 95% of the time not initializing is a mistake – let the compiler do it for you.

  • No Macro text preprocessor – the source of so much potential ugliness – gone!

  • Built-in strings – sure the STL has them but being built-in to the language certainly seems like a reasonable way to go.

  • Array bounds checking – built-in support for checking bounds on all arrays. Turn it on or off – very nice. How many times in C++ have you wished you could flip a switch and make sure nothing was going out of range at runtime. Maybe you should use the STL all the time, but I don’t know anyone who doesn’t use built-in arrays at least some of the time.

  • Nice to haves C++ is missing:

    1. Function delegates – a convenient way to point at member functions.
  • Resizable arrays – being built-in to the language is the way to go for such a basic operation.
  • Array slicing – another minor convenience.

  • Associative arrays – an STL plus that is built-in to the language.

  • String switches – its definitely convenient at times.

  • Thread synchronization primitives – with the prevalence of threads having basic sync support in the language is a timesaver.

  • Type-safe variadic arguments – gets around C++ clunky access to an unknown number of arguments.

  • Documentation comments – a consistent way of documenting code.

  • Try-catch-finally blocks – I’m still not a big fan of exceptions. I blame it on my background in-game programming.

  • There are more advantages over C++, but I just wanted to mention a few of the highlights.
    For a complete comparison of D vs. C, C++, C# and Java go to the D website’s comparison

    Doing a little deeper digging under the covers I found a few things I didn’t like – and one was a deal-breaker.

    First of all – all classes in each module have access to each others private data. There is no way to turn this off. This forces a loss of encapsulation – a concept i don’t agree with.

    Unfortunately the deal breaker that would make me prefer Managed C++ from Microsoft over D is the lack of interop with C++. D is very proud of its simple interop with C. But unlike Microsoft’s Managed C++, D requires cumbersome mechanisms to interop with C++. This is extremely unfortunate and leaves me wishing the C++ standards committee would make C++ over time more like D.

    All in all, D appears to be a very promising language with much to offer. With more straightforward support for coexisting with C++, I would think it would be a shoe-in to eventually gain mainstream popularity. It seems very suitable for performance based coding across multiple platforms. It certainly might be possible to use an auto-wrapper generator like SWIG to bridge C++ libraries to D.

    I will admit that I have only given D a laymen’s overview. I haven’t coded at length in it yet. However I think I’ll end up coding lower level constructs in C++ and using C# for higher level garbage collected code.

    In my mind D is a pleasant surprise but doesn’t quite fit the bill. I would love to see more convenient interfacing with C++ to the extent that such seamless interfacing is feasible.

    Why I Still Love C++

    This article is actually a precursor to my article on the D programming language. I lay the foundations for my interest in D by talking about what I love about C++.

    I have been coding in C++ for a long time now. Even though from the start the language had certain warts I didn’t like, certain aspects of it were irresistible to me and made it top dog.

    1. An expressive language – Once learned, few languages are intrinsically as expressive as C++. It has a million nuances (like the English language) that somehow make expressing exactly what you want to say seem easier than other languages. Having true const support, multiple inheritance (in those rare circumstances where it makes sense) and references that can never be null are just a few of the unique ways that it supports saying exactly what you mean. Even if saying it sometimes may seem to the novice a little verbose and complicated. How many times have you checked for null in a Java or C# program and gritted your teeth knowing that the value should in fact NEVER be null. In C++ once you define a reference you have made a contract with the compiler expressing your intent. The language spec will not let you check for null.
    2. Multi-paradigm – A Even though 10 years ago I jumped on the OO paradigm in a big way, single paradigm language never could quite cut it. Even though generics exist in other languages, few are as strong as C++. Simpler, yes. Powerful, no. Support for objects and generics is a must in my book. Supporting a simple functional style is important as well. Some concepts in the real world are both functional and freestanding. Granted – it is rare that I use freestanding functions, but there are times when they can reduce coupling in a big way. In object insistent languages you are forced to wrap these guys in a class to “objectify” them. This is silly at best.

    3. Supporting low-level constructs – someday this will go away. But doing a good amount of 3D coding I still like to be able to optimise things pretty tightly sometimes. The day has come and gone when assembly or flat C was a must in this category. But C++ is still very useful in this way.

    4. Huge open-source support – let’s face it C  and C++ are still the reigning kings here.

    5. Ability to interface with C and C++ effortlessly – given this “isn’t quite fair”. Of course C++ works well with C and C++. This is important because of #4. Many languages support clumsy bindings to C++ and C and this adds a layer of inconvenience to “get at” the wealth of open-source code already out there.
    Many other aspects of C++ have been absorbed by other languages. I like the availability of generics. Being able to force strong typing when it is important is key to me – like const correctness. You know exactly what you want. Being able to also use generics and avoid strong typing is equally important. Your intent is clear. C# is a close runner-up for me in many respects – the two main things that makes it difficult to fully embrace include clumsy bindings with C++ and its lack of ubiquity on other platforms. The thought of writing a large body of code in C# and then having to rewrite it for say the Wii or a handheld makes me cringe. Mono may help, but as far as I’m aware the jury is STILL out on how widely accepted it will be.

    For the certain types of coding C++ is still essential to me. It’s a love hate relationship. I look forward to the day when everything I enjoy about C++ is in a language that removes the things I truly dislike.

    In this article’s follow-up I’ll talk about the D programming language. You might be pleasantly surprised by what it brings to the table. There I will mention the things that D does right and C++ clearly does not.