Why does null exist in .NET?
Why can values be null in .NET? Is this superior to having a guarantee where everything would have a value and nothing can be null?
Does anyone know what each of these methodologies is called?
Either way, I am not very knowledgeable about this, but wouldn't having a value for everything make things easier, in terms of simplicity, i.e. eliminating null checks and being able to write more streamlined algorithms that don't have to branch out for checks?
What are the pros and cons of each style in terms of performance, simplicity, parallelism, future-proofing, etc.?
We've got Tony Hoare, an early pioneer who worked on ALGOL, to thank for that. He rather regrets it:
I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
A billion is a low-ball number, I think.
UPDATE: C# 8 and .NET Core have a decent solution for this problem: check out non-nullable reference types.
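For a quick taste of what that looks like under C# 8's nullable context (the variable names here are just illustrative):

#nullable enable

string text = null;        // compiler warning: converting null literal to non-nullable reference type
string? maybeText = null;  // fine: the reference is explicitly declared as nullable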
As appealing as a world without null is, it does present a lot of difficulty for many existing patterns and constructs. For example, consider the following constructs, which would need major changes if null did not exist:
- Creating an array of reference types, e.g. new object[42]. In the existing CLR world the array would be filled with null, which would be illegal. Array semantics would need to change quite a bit here (today's behavior is sketched just after this list).
- It makes default(T) useful only when T is a value type. Using it on reference types or unconstrained generics wouldn't be allowed.
- Fields in a struct whose type is a reference type would need to be disallowed. A value type can be 0-initialized today in the CLR, which conveniently fills reference-type fields with null. That wouldn't be possible in a non-null world, hence struct fields whose types are reference types would need to be disallowed.
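For reference, this is the existing CLR behavior those bullets lean on (a small sketch, nothing more):

// Reference-type arrays start out zero-filled, i.e. full of null,
// and default(T) is null whenever T is a reference type.
object[] items = new object[42];
Console.WriteLine(items[0] == null);          // True
Console.WriteLine(default(string) == null);   // True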
None of the above problems are unsolvable, but they do result in changes that really challenge how developers tend to think about coding. Personally, I wish C# and .NET had been designed with the elimination of null, but unfortunately they weren't, and I imagine problems like the above had a bit to do with it.
This reminds me of an episode of James Burke's "Connections" series in which monks were transcribing Arabic to Latin and first encountered the zero digit. Roman arithmetic did not have a representation for zero, but Arabic/Aramaic arithmetic did. "Why do we have to write a letter to indicate nothing?" argued the Catholic monks. "If it is nothing, we should write nothing!"
Fortunately for modern society, they lost the argument and learned to write zero digits in their maths. ;>
Null simply represents an absence of an object. There are programming languages which do not have "null" per se, but most of them do still have something to represent the absence of a legitimate object. If you throw away "null" and replace it with something called "EmptyObject" or "NullNode", it's still a null just with a different name.
If you remove the ability for a programming language to represent a variable or field that does not reference a legitimate object, that is, you require that every variable and field always contain a true and valid object instance, then you make some very useful and efficient data structures awkward and inefficient, such as building a linked list. Instead of using a null to indicate the end of the linked list, the programmer is forced to invent "fake" object instances to serve as list terminals that do nothing but indicate "there's nothing here".
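For example, a conventional null-terminated list node in C# might look like this (a minimal sketch; the Node name is just for illustration):

// Next is null at the end of the list; without null, every list would need a
// dedicated "terminator" instance whose only job is to say "there's nothing here".
class Node<T>
{
    public T Value;
    public Node<T> Next;

    public Node(T value, Node<T> next = null)
    {
        Value = value;
        Next = next;
    }
}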
Delving into existentialism here, but: If you can represent the presence of something, then isn't there a fundamental need to be able to represent the absence of it as well?
I speculate that null exists in .NET because it (C#) followed in the C++/Java footsteps (and has only started to branch out in more recent versions), and VB/J++ (which became VB.NET/J#) already had the notion of "Nothing" values -- that is, .NET has null because of what was, not because of what it could have been.
In some languages there is no notion of null -- null can be completely replaced with a type like Maybe: there is Something (the object) or Nothing (but this is not null! There is no way to get the "Nothing" out of a Maybe!).
In Scala with Option:
val opt: Option[String] = Some("foo") // or perhaps, None
opt match {
  case Some(x) => x.toString() // x is not null here, but only by code contract; e.g. Some(null) would allow it.
  case _ => "nothing :(" // opt contained "Nothing"
}
This is done by language design in Haskell (null is not possible ... at all!) and by library support and careful usage, such as in Scala, as shown above. (Scala supports null -- arguably for Java/C# interop -- but it is possible to write Scala code without using this fact unless null is allowed to "leak" in.)
Edit: See Scala: Option Pattern, Scala: Option Cheat Sheet and SO: Use Maybe Type in Haskell. Most of the time, talking about Maybe in Haskell brings up the topic of monads. I won't claim to understand them, but here is a link along with usage of Maybe.
Happy coding.
OK, now let's warp to the magical world of C#-without-null:
class View
{
    Model model;
    public View(Model model)
    {
        Console.WriteLine("my model : {0}, thing : {1}", this.model, this.model.thing);
        this.model = model;
    }
}
What is printed on the console?
- Nothing; an exception about accessing an uninitialized object is thrown: OK, call this a NullReferenceException, and that's the current world.
- It doesn't build; the user needed to specify a value when declaring model. See the last bullet point, as it creates the same result.
- Some default for the model and some default for the thing: OK, but before, with null, we at least had a way to know whether an instance was valid; now the compiler is generating strange look-alikes that don't contain anything but are still invalid as model objects...
- Something defined by the type of the objects: better, as we could define a specific invalid state per object, but now each object that could be invalid needs to implement this independently, along with a way for the caller to identify this state...
So basically, to me, removing the null state doesn't seem to solve anything, as the possibly-invalid state still needs to be managed anyway...
Oh, by the way, what would be the default value of an interface? Oh, and for an abstract class: what would happen when a method that is defined in the abstract class is called on a default value, but that method calls another method that is abstract? ... Why, oh why, complicate the model for nothing? It's the multiple-inheritance questions all over again!
One solution would be to change the syntax completely and go for a fully functional one where the null world doesn't exist, only Maybes when you want them to exist... But it's not a C-like language, and the multi-paradigm nature of .NET would be lost.
What might be missing is a null-propagating operator able to return null for model.Thing when model is null, like model.?.Thing.
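For what it's worth, later C# versions (6.0 and up) did add exactly this kind of null-conditional operator, spelled ?. rather than .?. -- a rough sketch reusing the model/Thing names from above:

var thing = model?.Thing;              // evaluates to null instead of throwing when model is null
var text = model?.Thing?.ToString();   // and it chains safely through several dereferences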
Oh, and for good measure, an answer to your question:
- The current class library evolved after the Microsoft-Java debacle, and C# was built as a "better Java", so changing the type system to remove null references would have been a big change. They already managed to introduce value types and remove manual boxing!
- As the introduction of value types shows, Microsoft thought a lot about speed... The fact that the default for all types maps to a zero fill is really important for fast array initialization, for example. Otherwise, initialization of arrays of reference types would have needed special treatment.
- Without null, interop with C would not have been possible, so at least at the MSIL level and in unsafe blocks, it needed to be allowed to survive.
- Microsoft wanted to use the framework for VB6++; removing Nothing, as it is called in VB, would have radically changed the language. It already took years for users to switch from VB6 to VB.NET; such a paradigm change might have been fatal for the language.
Well, values (value-type variables) have only been able to be null since nullable types were introduced in Fx2.
But I suppose you mean: why can references be null?
That is part of the usefulness of references. Consider a Tree or a LinkedList; they would not be possible (they would be unable to end) without null.
You could come up with many more examples, but mainly null exists to model the concept of 'optional' properties/relationships.
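As a tiny illustration of that last point (the class and property names here are made up):

class Person
{
    public string Name;      // always expected to have a value
    public Person Manager;   // optional relationship: null models "this person has no manager"
}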
Hysterical raisins.
It's a hangover from C-level languages where you live on explicit pointer manipulation. Modern declarative languages (Mercury, Haskell, OCaml, etc.) get by quite happily without nulls. There, every value has to be explicitly constructed. The 'null' idea is handled through 'option' types, which have two values, 'no' (corresponding to null) and 'yes(x)' (corresponding to non-null with value x). You have to unpack each option value to decide what to do, hence: no null pointer reference errors.
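To make the shape of that concrete in C# terms (a rough sketch; Option/Some/None here are my own names, not a BCL type):

// An Option<T> is either Some (carrying a value) or None; the caller has to unpack
// it before touching the value, so there is no way to dereference "nothing".
abstract class Option<T>
{
    public sealed class Some : Option<T>
    {
        public T Value { get; }
        public Some(T value) { Value = value; }
    }

    public sealed class None : Option<T> { }
}

static string Describe(Option<string> opt) =>
    opt switch
    {
        Option<string>.Some s => s.Value,   // corresponds to 'yes(x)'
        _ => "no value",                    // corresponds to 'no'
    };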
Not having nulls in a language saves you so much grief, it really is a shame the idea still persists in high-level languages.
I'm not familiar with alternatives either, but I don't see a difference between Object.Empty and null, other than that null lets you know that something is wrong when your code tries to access the object, whereas Object.Empty allows processing to continue. Sometimes you want one behavior, and sometimes you want the other. Distinguishing null from Empty is a useful tool for that.
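A tiny example of that trade-off with strings (both lines below are deliberate):

string blank = string.Empty;
string missing = null;

Console.WriteLine(blank.Length);     // 0 -- processing quietly continues
Console.WriteLine(missing.Length);   // throws NullReferenceException -- fails loudly, telling you something is wrong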
To denote the nothingness concept, since 0 is not the right fit.
Now you can give any value type a Null value by defining a nullable type.
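For example, with Nullable<T> (the T? shorthand):

int? maybeCount = null;              // a value type that can now represent "no value"
if (maybeCount.HasValue)
    Console.WriteLine(maybeCount.Value);
else
    Console.WriteLine("no count yet");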
I think we can't always have a value for a variable, because we'd first have to default it to some value, and then the question becomes why that specific value should be preferred over the others.
Many people probably can't wrap their head around coding without nulls, and if C# didn't have nulls, I doubt it would have caught on to the extent that it has.
That being said, a nice alternative would be that if you want to allow a nullable reference, the reference would have to be explicitly nullable, as with value types.
For example,
Person? person = SearchForPersonByFirstName("Steve");
if (person.HasValue)
{
Console.WriteLine("Hi, " + person.Value.FullName);
}
Unfortunately, when C# 1.0 came out, there was no concept of Nullable; that was added in C# 2.0. Forcing references to have values would have broken the older programs.
null is just the name of the default value for a reference type. If null were not allowed, then the concept of "doesn't have a value" wouldn't go away; you would just represent it in a different way. In addition to having a special name, this default value also has special semantics in the event that it is misused - i.e. if you treat it like there is a value when in fact there is not.
If there were no null:
- You would need to define sentinel default values for your types to use when "there is no value". As a trivial example, consider the last node of a forward singly-linked list (a sketch of this follows the list).
- You would need to define semantics for operations performed on all of these sentinel values.
- The runtime would have additional overhead for handling the sentinel values. In modern virtual machines such as the ones used by .NET and Java, there is no overhead for null-checking prior to most function calls and dereferences because the CPU provides special ways to handle this case, and the CPU is optimized for cases where the references are non-null (e.g. branch prediction).
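A sketch of that first point, using a sentinel "terminator" node instead of null (the names are illustrative only):

// Without null, the end of a list has to be marked by a shared sentinel instance,
// and every operation must know to treat that instance specially.
class Node
{
    public int Value;
    public Node Next;

    public static readonly Node EndOfList = CreateSentinel();

    private static Node CreateSentinel()
    {
        var sentinel = new Node();
        sentinel.Next = sentinel;   // the sentinel points at itself instead of at null
        return sentinel;
    }
}

// Traversals now compare against the sentinel rather than against null:
//   for (var n = head; n != Node.EndOfList; n = n.Next) { ... }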
In summary:
- The name would change, but the concept would not.
- The burden of defining and communicating the default value and semantics for cases where you have "no value" is placed on developers and developer documentation.
- Code executing on modern virtual machines would face code bloat and substantial performance degradation for many algorithms.
The problems with null described by Tony Hoare are typically due to the fact that, prior to modern virtual machines, runtime systems did not have nearly as clean handling of misused null values as you have today. Misusing pointers/references does remain a problem, but tracking down the problem when working with .NET or Java tends to be much easier than it used to be in, e.g., C.
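To make that concrete: a misused null in .NET surfaces as a well-defined, catchable exception with a stack trace, rather than the undefined behavior a stray pointer can cause in C:

try
{
    string s = null;
    Console.WriteLine(s.Length);   // dereferencing a null reference
}
catch (NullReferenceException ex)
{
    Console.WriteLine(ex.Message); // "Object reference not set to an instance of an object."
}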