The Forum for Discussion about The Third Manifesto and Related Matters

You need to log in to create posts and topics.

Null, NaN, NaV, and 2VL

I am going to assume for now that SQL NULL represents one of three things:  (a) a value which is knowable but presently not known, like John Cowan's death date; (b) something that is not applicable to the tuple it is in (the fuel efficiency of an all-electric vehicle; (c)  a soft error that does not throw an exception.  I am going to dismiss case (a), as it is in some sense transient, even if the lag time before the value is known may be very large.

The value NaN in the floating-point domain can be used as either (b) or (c); in principle, (a) is possible, but usually floats are in scientific applications where everything is either known or never going to be known.  In the interval interpretation of floats, a float f₁ represents the half-open interval [f₁ - (f₁ - f₀)/2, f1 + (f₂ - f₁)/2), where f₀ and f₂ are the floats just above and just below f, and +Inf and -Inf are the half-open intervals from the largest/smallest float to ±∞.  NaN represents the interval from -∞ to +∞, as in 1.0/0.0, or else the null interval, as in √ᵣ-2 where √ᵣ is the real-valued square root, so it truly is a floating-point value rather than a "mark".

However, it is off the real number line, because every affirmative comparison operator (= < > ≤ ≥) involving NaN returns false.  Thus 1 < NaN is false, -5 >= NaN is false, and in particular NaN = NaN is false.  This means that < is no longer the negation of >= and likewise in all similar cases, which is a bit messy, but could be made easier by adding operators ≠ </ >/ ≤/ ≥/.  This is done because we do not want to assert that 1.0/0.0 =√ᵣ-2 is true.

The extension of infinity and NaN to the domain of exact numbers (integral or fractional) is straightforward, so we can have a value NaEN ("not an exact number").  Complex numbers are more complicated, because either the real part or the imaginary part can be NaN separately (this turns out to be what people doing computation on complex numbers want).  If we have a relation with information about gods and mortals, we may well say that a god's death date is NaD, and perhaps this is also true of some gods' birth dates; by the same token, the age of a god is NaEN.  If "abc" * 3 is "abcabcabc", then "abcabcabd"/3 is NaS.

All this can be generalized to NaV, the (typeless) union of all such values, as long as Boolean values are excluded.  There is no NaB, because booleans might be presently unknown, but they can never be inapplicable, nor can they be the result of a soft error, since operations on booleans are truly closed.  So we have provided for some (not all) of the functions of NULL without gaining th complexities of 3VL.  In particular, 3VL with UNKNOWN makes sense only using an open-world assumption in which a tuple in a relation represents a fact but the absence of a tuple does not represent the falsity of that fact, which is not the normal case.  So even for full-fledged SQL NULL, we can reject the use of 2VL (at the expense of the standard, of course).  In short, the devil may still be a bit sulfurous, but he is none so bad as he has been painted.