How null breaks polymorphism: or the problem with null: part 2
If you haven’t read part one you really need to do that first. It’s here.
I got a lot of interesting feedback on part 1 of this topic and found I needed to further explain myself in certain areas.
Two initial responses to issues brought to me from part 1:
1. Typed languages
In languages that aren’t typed at all, null is no more of a problem than any other reassignment, since these languages never care about types. Any reassignment can result in similar issues, which in my mind is in fact a problem. I am definitely a huge proponent of compile-time type checking. This is perhaps why I seem so hard on null in part 1. Compile-time checks should be able to help out.
2. When functions might not return a value
There are plenty of times when a function may or may not return a value. If this is the case, the return type should reflect that. Using null as a “magic number” is not the ideal solution in my mind. I’d much rather see a return type that forces the issue to be very clear. It may seem non-standard, but a type masquerading as a collection that potentially has one item seems ideal. Almost any programmer will realize they need to check the size of a collection before trying to access its first element. This is no more cumbersome than checking for null, but it is logically intuitive. In a language like C++ this can easily be optimized using templates so that there is no performance hit. A container is intuitive because it can potentially be empty - this is very straightforward.
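Here is a minimal sketch of what such a “collection of at most one item” might look like in C++. The names ZeroOrOne and findNickname are made up for illustration, and the sketch assumes the value type can be default-constructed. C++17’s std::optional fills a similar role if it is available to you.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical container holding either zero or one value of type T.
// It speaks the familiar collection vocabulary: empty(), size(), front().
template <typename T>
class ZeroOrOne {
public:
    ZeroOrOne() : has_value_(false), value_() {}
    explicit ZeroOrOne(const T& value) : has_value_(true), value_(value) {}

    bool empty() const { return !has_value_; }
    std::size_t size() const { return has_value_ ? 1 : 0; }

    // As with front() on a standard container, calling this when empty
    // is a caller error; the assert makes that loud in debug builds.
    const T& front() const {
        assert(has_value_);
        return value_;
    }

private:
    bool has_value_;
    T value_;  // Simplification: T must be default-constructible.
};

// Hypothetical lookup: the return type itself says "there may be no answer".
inline ZeroOrOne<std::string> findNickname(
        const std::map<std::string, std::string>& nicknames,
        const std::string& name) {
    std::map<std::string, std::string>::const_iterator it = nicknames.find(name);
    if (it == nicknames.end()) {
        return ZeroOrOne<std::string>();  // empty, not null
    }
    return ZeroOrOne<std::string>(it->second);
}
```

The value is stored inline, so there is no heap allocation and no pointer that could be null. The caller asks empty() before front(), exactly as they would with any other container.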
Making the problem even more clear:
I often find it easier to express programming concepts in real-world terms. This helps to reduce them to the absurd when appropriate. It doesn’t always work, but a lot of the time it helps to look at the problem from a different perspective.
Let’s take the concept of typical null check behavior and attempt to map it to a real-world procedure.
We are going to explore John teaching Norm to drive a car. The following may sound a little familiar.
John: “First you are going to need to make sure you have a car. Do you have a car? If you don’t have a car just return to what you were doing. I won’t be able to teach you to drive a car.”
Norm: “I have a car.”
John: “Let’s call that car normsCar. Check to see if your car has a door. We’ll call it normsCarDoor. Does normsCar have a normsCarDoor?”
Norm: “Yes”
John: “Great, but if you don’t just skip the rest of this - I won’t be able to tell you how to drive a car.”
Norm: “I have a car door.”
John: “Once you open the door, check and see if it has a seat, which we’ll call normsComfySeat. If you don’t have a seat skip the rest of this - I won’t be able to tell you how to drive a car.”
Norm: “I have a seat.”
John: “Things have changed scope a bit - can you check whether you have a door again for me - we called it normsCarDoor?” ...
I think you can see where this is going. Classes have contracts. It should be reasonable to talk about a car that always has a door and a seat without having to neurotically check at all times.
Unlike the real world, when we code we make new things up all the time. So even though seats and doors can reasonably be inferred on cars, the things that a UserAgentControl may or may not have probably won’t be obvious. Does the UserAgentControl have a ThrottleManager all the time? If I have a non-nullable class type, I can check by looking at the class. If I get it wrong for some reason, maybe the compiler can issue a warning. Maybe I can be forced to “conjugate” it with a special syntax every time I use it to help me remember (“.” vs. “->” in C++). Or I can use a naming convention.
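To make the “.” vs. “->” idea concrete, here is a rough C++ sketch of a car whose contract is visible in its class definition. Door and Seat come from the story above. Radio, getIn and tuneRadio are my own hypothetical additions, there to show what an optional part looks like next to mandatory ones.

```cpp
#include <cassert>
#include <string>

struct Door { bool open; Door() : open(false) {} };
struct Seat { bool occupied; Seat() : occupied(false) {} };
struct Radio { std::string station; };  // hypothetical optional part

class Car {
public:
    Car() : door_(), seat_(), radio_(nullptr) {}

    // Returned by reference: every Car always has a door and a seat,
    // so callers use "." and never need a check.
    Door& door() { return door_; }
    Seat& seat() { return seat_; }

    // Returned by pointer: a radio is optional, and the "->" at the
    // call site is the reminder that a check is required.
    Radio* radio() { return radio_; }
    void installRadio(Radio& radio) { radio_ = &radio; }

private:
    Door door_;
    Seat seat_;
    Radio* radio_;  // may be null; the Car does not own the radio
};

// Takes a reference, so there is no "do you have a car?" step at all.
inline void getIn(Car& car) {
    car.door().open = true;
    car.seat().occupied = true;
}

// The only check lives at the only place the type admits absence.
inline bool tuneRadio(Car& car, const std::string& station) {
    Radio* radio = car.radio();
    if (radio == nullptr) {
        return false;
    }
    radio->station = station;
    return true;
}
```

Reading the class answers the question “does a Car always have a door?” without hunting through call sites, and the pointer return is the only spot where John needs to ask Norm anything.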
Why is this such an overarching problem?
It may seem like griping over a simple check for a thing’s existence. Except this check can conceivably apply to each and every object I have (or don’t have). This makes the issue enormous. A lot of the time programmers put in checks for null when it seems appropriate or when unit testing throws an exception at runtime. This sounds a whole lot like something statically typed languages are supposed to help us protect ourselves against. In my experience the unchecked null is by far the most common runtime error and the “overchecked” null one of the most prevalent unintentional code obfuscation techniques.
Solutions that don’t involve changing existing languages:
1. If your language supports the concept out of the box (C++, Spec#, F#, OCaml, Nice, etc.) or it can be built with templates or other mechanisms, use not-nullable types. Use these types whenever possible. If your language doesn’t support them, then use number 2 alone.
2. Create a simple naming convention (don’t go Hungarian on me) to discern what is nullable and what isn’t. This is a fundamental concept and it should be obvious every time a value, instance or object is accessed. Use this to compensate for your language’s deficiency, in a similar manner to prefixing private variables in languages that don’t have mechanisms for hiding access to variables and member functions.
3. Check for null as early as possible in the food chain and prefer to call methods that are granular and don’t have to check themselves. This means making sure every parameter passed to a routine gets used. If a parameter is optional, write a new routine. Make the types passed to these routines not-nullable if the language supports it.
4. If it makes sense, prefer empty collections to collections containing nothing but null, for obvious reasons.
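As a rough illustration of points 3 and 4, here is one way this could look in C++. All of the names (User, greeting, tagCount, describe) are hypothetical, and I am assuming the pointer arrives from some older API that I can’t change.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct User {
    std::string name;
    std::vector<std::string> tags;  // empty when there are no tags
};

// Granular routines take references, so they never check for null
// and every parameter they receive gets used.
inline std::string greeting(const User& user) {
    return "Hello, " + user.name;
}

// Instead of an optional (nullable) title parameter, a second routine.
inline std::string greeting(const User& user, const std::string& title) {
    return "Hello, " + title + " " + user.name;
}

inline std::size_t tagCount(const User& user) {
    return user.tags.size();  // an empty vector needs no special case
}

// The boundary: the one place a possibly-null pointer is allowed in.
inline std::vector<std::string> describe(const User* maybeUser) {
    std::vector<std::string> lines;
    if (maybeUser == nullptr) {
        return lines;  // an empty collection, not a null
    }
    const User& user = *maybeUser;  // from here on, no more checks
    lines.push_back(greeting(user));
    lines.push_back(std::to_string(tagCount(user)) + " tags");
    return lines;
}
```

The null check happens once, in describe. Everything it calls works with references and plain collections, so none of those routines has to wonder whether it was handed nothing.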
Wrap-up:
There is no perfect solution. But so many times in code I see null being checked or not checked and am left wondering. Is the check gratuitous? Is this a runtime error waiting in the wings? And unless it’s commented or I check its usage, I get that typical uneasy feeling. “We’re bound to catch it at runtime with unit testing” is not the answer I want to hear.
I think a few simple practices and conventions could get this off our plate so we can get back to the problem at hand. Solving actual problems.
Afterthoughts:
Some other articles that beat me to the punch on some of these concepts:
- Let’s Reconsider That Offensive Coding by Michael Feathers
- Null: The Runtime Error Generator by Blaine Buxton
Empty containers solve this problem nicely for return types. Also as many readers have mentioned, monads are extremely useful in solving this problem as well. But once again, stop propagating unneeded nulls and similar “safer null-like” structures as soon as possible. As Michael Feathers states - it’s just offensive code.










