The Forum for Discussion about The Third Manifesto and Related Matters

SUMMARIZE PER, OUTER JOIN and image relations

Erwin was surprised that I "of all people" should find myself in favour of NaN.  But I didn't really say that.  I just said that I had noticed its use in a particular context where it had done me no harm and I found myself grateful for it (in that context).

Dave Voorhis responded to my comment on ROUND(NaN) = zero in TD, saying that he thinks it should return NaN and was considering changing it.  But NaN is of type RATIONAL and ROUND(x) returns INTEGER.

Somebody (Dave V. again?) twitted Erwin for equating NULL and NaN.  I agree with the twitting.  I haven't given a lot of thought to the wisdom or otherwise of NaN as found in Rel and the other language(s) that inspire its use in Rel.  Indeed, I wasn't even aware of its use in existing languages, though its name was familiar to me.  I do note, somewhat wryly, this, that I found in Rel today:

NaN = NaN

false

NOT(NaN = NaN)

true

So that's not like NULL, but TTM requires those truth values to be the other way around (RM Pre 8).

Hugh

Coauthor of The Third Manifesto and related books.
Quote from Dave Voorhis on February 19, 2020, 12:20 pm

In earlier versions of Rel, I elided NaN but it inevitably kept appearing in one way or another. You can either keep throwing unproductively annoying exceptions or give in and allow it. I allowed it.

Be that as it may, the slight irony is that "types" with "values" such as NaN are relatively OK as long as you can be sure you're pretty much [or entirely] at the end of the pipeline.  But someone recently commented that the concept of "end of the pipeline" is gradually just disappearing altogether from the landscape.

Quote from Hugh on February 19, 2020, 2:38 pm

Erwin was surprised that I "of all people" should find myself in favour of NaN.  But I didn't really say that.  I just said that I had noticed its use in a particular context where it had done me no harm and I found myself grateful for it (in that context).

Dave Voorhis responded to my comment on ROUND(NaN) = zero in TD, saying that he thinks it should return NaN and was considering changing it.  But NaN is of type RATIONAL and ROUND(x) returns INTEGER.

Somebody (Dave V. again?) twitted Erwin for equating NULL and NaN.  I agree with the twitting.  I haven't given a lot of thought to the wisdom or otherwise of NaN as found in Rel and the other language(s) that inspire its use in Rel.  Indeed, I wasn't even aware of its use in existing languages, though its name was familiar to me.  I do note, somewhat wryly, this, that I found in Rel today:

NaN = NaN

false

NOT(NaN = NaN)

true

So that's not like NULL, but TTM requires those truth values to be the other way around (RM Pre 8).

Hugh

I expect the "wryly" to increase if it is also observed that NaN <> NaN yields false too (at least I suppose so, taking that suggestion "all comparisons yield false" as the letter of the law) and is thus not the logical negation of NaN = NaN ...
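
For reference, IEEE 754 itself does not make every comparison false: the not-equal predicate is the one exception and is true whenever either operand is NaN. Below is a minimal Python sketch of this. Python floats are IEEE 754 doubles on mainstream platforms; whether Rel's <> behaves the same way is an assumption that the thread does not confirm.

```python
import math

nan = float("nan")

comparisons = {
    # Equality and the ordered comparisons are all false when NaN is involved ...
    "nan == nan": nan == nan,
    "nan < nan": nan < nan,
    "nan > nan": nan > nan,
    "nan <= nan": nan <= nan,
    "nan >= nan": nan >= nan,
    # ... but "not equal" is true, so it remains the negation of "==".
    "nan != nan": nan != nan,
}

for expr, result in comparisons.items():
    print(f"{expr:12} -> {result}")

# NaN has to be detected with a dedicated test rather than with "==".
print("math.isnan(nan) ->", math.isnan(nan))
```

So under plain IEEE 754 rules the surprise is limited to equality not being reflexive; the negation relationship between = and <> still holds.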

Quote from Hugh on February 19, 2020, 2:38 pm

Erwin was surprised that I "of all people" should find myself in favour of NaN.  But I didn't really say that.  I just said that I had noticed its use in a particular context where it had done me no harm and I found myself grateful for it (in that context).

Dave Voorhis responded to my comment on ROUND(NaN) = zero in TD, saying that he thinks it should return NaN and was considering changing it.  But NaN is of type RATIONAL and ROUND(x) returns INTEGER.

Somebody (Dave V. again?) twitted Erwin for equating NULL and NaN.  I agree with the twitting.  I haven't given a lot of thought to the wisdom or otherwise of NaN as found in Rel and the other language(s) that inspire its use in Rel.  Indeed, I wasn't even aware of its use in existing languages, though its name was familiar to me.  I do note, somewhat wryly, this, that I found in Rel today:

NaN = NaN

false

NOT(NaN = NaN)

true

So that's not like NULL, but TTM requires those truth values to be the other way around (RM Pre 8).

Hugh

It's the effect of Rel's RATIONAL being a thin skin around IEEE 754 floating point. Per an answer on a StackOverflow Q&A cited earlier in this thread, "NaN is used as a sort of placeholder for [the] undefined state. Mathematically speaking, undefined is not equal to undefined. Neither can you say an undefined value is greater or less than another undefined value. Therefore all comparisons return false."

That makes it incompatible with TTM but mathematically correct.

At some point I'll have to replace the numeric types. Replacing the type itself is easy; replacing the built-in library operators is a fairly non-trivial undertaking.

I'd forgotten ROUND returns an INTEGER, so conversion from NaN to integer zero makes sense. If I replace RATIONAL with something more, er, rational, I'll replace ROUND along with it.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on February 19, 2020, 8:39 pm

It's the effect of Rel's RATIONAL being a thin skin around IEEE 754 floating point. Per an answer on a StackOverflow Q&A cited earlier in this thread, "NaN is used as a sort of placeholder for [the] undefined state. Mathematically speaking, undefined is not equal to undefined. Neither can you say an undefined value is greater or less than another undefined value. Therefore all comparisons return false."

That makes it incompatible with TTM but mathematically correct.

At some point I'll have to replace the numeric types. Replacing the type itself is easy; replacing the built-in library operators is a fairly non-trivial undertaking.

I'd forgotten ROUND returns an INTEGER, so conversion from NaN to integer zero makes sense. If I replace RATIONAL with something more, er, rational, I'll replace ROUND along with it.

Did you really mean "mathematically correct"?  Has there been a respected mathematical treatise of NaN, not in a computer science context?  Mathematical treatments of infinities don't have universal agreement among mathematicians, so I would be surprised if the same weren't true of NaN.  Anyway, here's some more.

I have an operator is_int(x RATIONAL) RETURNS BOOLEAN that yields TRUE if rounding x to six places of decimals yields an integer.  The result of is_int(NaN)  is TRUE.  Here's the code:

OPERATOR is_int(r RATIONAL) RETURNS BOOLEAN; return rat ( ROUND ( Round6 ( r ) ) ) = Round6 ( r ) ; end operator ;
OPERATOR rat(n INTEGER) RETURNS RATIONAL; return CAST_AS_RATIONAL ( n ) ; end operator ;
OPERATOR Round6(x RATIONAL) RETURNS RATIONAL; return ( CAST_AS_RATIONAL ( ROUND ( ( x * 1000000.0 ) ) ) / 1000000.0 ) ; end operator ;

Round6(NaN) gives 0.0, so ROUND ( Round6 ( NaN) ) gives the integer 0, which is why rat ( ROUND ( Round6 ( r ) ) ) = Round6 ( r ) gives TRUE.

The justification for my is_int operator is that Rel sometimes gives results that are incorrect in the 18th decimal place (or a few places earlier) and I want to see all results that really stand for integers.
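
The behaviour described above can be reproduced outside Rel. The following Python sketch mirrors the three operators; rel_round is a hypothetical stand-in that assumes ROUND maps NaN to integer 0 (as reported in this thread) and otherwise rounds half up, which may not match Rel exactly. The is_int_guarded variant is an addition for illustration only, showing one way to keep NaN and infinities from being reported as integers.

```python
import math

def rel_round(x: float) -> int:
    """Stand-in for ROUND (assumption): NaN becomes 0, otherwise round half up."""
    if math.isnan(x):
        return 0
    return math.floor(x + 0.5)

def round6(x: float) -> float:
    """Round x to six decimal places, as in the Round6 operator."""
    return float(rel_round(x * 1_000_000.0)) / 1_000_000.0

def is_int(r: float) -> bool:
    """True if r, rounded to six decimal places, is a whole number."""
    return float(rel_round(round6(r))) == round6(r)

def is_int_guarded(r: float) -> bool:
    """Same test, but NaN and infinities are rejected before any rounding."""
    return math.isfinite(r) and is_int(r)

print("is_int(NaN)         ->", is_int(float("nan")))
print("is_int_guarded(NaN) ->", is_int_guarded(float("nan")))
print("is_int(0.1*3*10)    ->", is_int(0.1 * 3 * 10))
print("is_int(2.5)         ->", is_int(2.5))
```

The unguarded version reports NaN as an integer purely because the NaN has already been turned into zero before the comparison happens.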

Do you also really mean "makes sense"?  The best I can find about your IEEE treatment is that I have found it convenient in at least one particular context.  I would hesitate to call that making sense, exactly.

Hugh

Quote from Hugh on February 20, 2020, 12:49 pm
Quote from Dave Voorhis on February 19, 2020, 8:39 pm

It's the effect of Rel's RATIONAL being a thin skin around IEEE 754 floating point. Per an answer on a StackOverflow Q&A cited earlier in this thread, "NaN is used as a sort of placeholder for [the] undefined state. Mathematically speaking, undefined is not equal to undefined. Neither can you say an undefined value is greater or less than another undefined value. Therefore all comparisons return false."

That makes it incompatible with TTM but mathematically correct. ...

 

Did you really mean "mathematically correct"?  Has there been a respected mathematical treatise of NaN, not in a computer science context? ...

While we're at it ... Is there an IEEE standard for representing irrationals accurately? (Presumably symbolically.) I'm thinking wrt converting between alternative PossReps for Polar/Cartesian. (Polar in Radians.) I expect you're very likely to get the scenario Hugh describes of a round trip conversion ending up with inaccuracies.

Quote from AntC on February 20, 2020, 1:55 pm
Quote from Hugh on February 20, 2020, 12:49 pm
Quote from Dave Voorhis on February 19, 2020, 8:39 pm

It's the effect of Rel's RATIONAL being a thin skin around IEEE 754 floating point. Per an answer on a StackOverflow Q&A cited earlier in this thread, "NaN is used as a sort of placeholder for [the] undefined state. Mathematically speaking, undefined is not equal to undefined. Neither can you say an undefined value is greater or less than another undefined value. Therefore all comparisons return false."

That makes it incompatible with TTM but mathematically correct. ...

Did you really mean "mathematically correct"?  Has there been a respected mathematical treatise of NaN, not in a computer science context? ...

While we're at it ... Is there an IEEE standard for representing irrationals accurately? (Presumably symbolically.) I'm thinking wrt converting between alternative PossReps for Polar/Cartesian. (Polar in Radians.) I expect you're very likely to get the scenario Hugh describes of a round trip conversion ending up with inaccuracies.

I haven't found such a standard, but I didn't look very hard.

IEEE 754 floating point is pervasive because on most computer architectures it's wired into the hardware. Wolfram Language (the language in Mathematica) is symbolic (Mathematica generally isn't used where performance is an issue) -- and there are no doubt others -- but mainstream languages (and Rel) typically take the easy way to implement non-fixed-point numbers: they use the IEEE 754 floating-point implementation in the CPU itself.

IEEE 754 provides an acceptable compromise between performance, precision, accuracy and size. It was designed for fast calculations on real-world measures that vary widely in size, where absolute accuracy is not a priority -- which it almost never is with real-world measures.

For that, it does well. Just don't use it for money or round-trip conversions.
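
As a rough illustration of the round-trip concern with Polar/Cartesian representations, the Python sketch below converts a Cartesian point to polar form (angle in radians) and back using the standard cmath module. The helper name round_trip is hypothetical. The result is not guaranteed to match the input bit for bit, so the sketch also shows a tolerance-based comparison.

```python
import cmath

def round_trip(z: complex) -> complex:
    """Cartesian -> polar (radius, angle in radians) -> Cartesian."""
    r, theta = cmath.polar(z)
    return cmath.rect(r, theta)

z = complex(3.0, 4.0)
back = round_trip(z)

print("original        :", z)
print("after round trip:", back)
print("exactly equal   :", back == z)
print("equal within tolerance:", cmath.isclose(back, z, rel_tol=1e-12))
```

Any equality test across such a conversion has to decide what tolerance counts as "the same value", which is exactly the kind of decision a six-decimal-place rounding helper is making.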

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Hugh on February 20, 2020, 12:49 pm
Quote from Dave Voorhis on February 19, 2020, 8:39 pm

It's the effect of Rel's RATIONAL being a thin skin around IEEE 754 floating point. Per an answer on a StackOverflow Q&A cited earlier in this thread, "NaN is used as a sort of placeholder for [the] undefined state. Mathematically speaking, undefined is not equal to undefined. Neither can you say an undefined value is greater or less than another undefined value. Therefore all comparisons return false."

That makes it incompatible with TTM but mathematically correct.

At some point I'll have to replace the numeric types. Replacing the type itself is easy; replacing the built-in library operators is a fairly non-trivial undertaking.

I'd forgotten ROUND returns an INTEGER, so conversion from NaN to integer zero makes sense. If I replace RATIONAL with something more, er, rational, I'll replace ROUND along with it.

Did you really mean "mathematically correct"?  Has there been a respected mathematical treatise of NaN, not in a computer science context?  Mathematical treatments of infinities don't have universal agreement among mathematicians, so I would be surprised if the same weren't true of NaN.  Anyway, here's some more.

I probably should have written "numerical representation-ally correct", because we're talking about the branch of computer science that deals with numeric representation. The purely mathematical equivalent is "undefined". There's long been a certain back-and-forth between the pure mathematics folks and the numerical representation folks, but generally the former recognise that the latter are subject to inevitable physical and performance constraints.

The treatment of NaN has been standard since 1985. I don't think it's going to change, though I recall there are some suggestions of extensions that can be made on top of the usual hardware implementations that may be helpful. When I get a moment, I'll remind myself of those -- it's been years since I've looked at any of this stuff in detail.

I have an operator is_int(x RATIONAL) RETURNS BOOLEAN that yields TRUE if rounding x to six places of decimals yields an integer.  The result of is_int(NaN)  is TRUE.  Here's the code:

OPERATOR is_int(r RATIONAL) RETURNS BOOLEAN; return rat ( ROUND ( Round6 ( r ) ) ) = Round6 ( r ) ; end operator ;
OPERATOR rat(n INTEGER) RETURNS RATIONAL; return CAST_AS_RATIONAL ( n ) ; end operator ;
OPERATOR Round6(x RATIONAL) RETURNS RATIONAL; return ( CAST_AS_RATIONAL ( ROUND ( ( x * 1000000.0 ) ) ) / 1000000.0 ) ; end operator ;

Round6(NaN) gives 0.0, so ROUND ( Round6 ( NaN) ) gives the integer 0, which is why rat ( ROUND ( Round6 ( r ) ) ) = Round6 ( r ) gives TRUE.

The justification for my is_int operator is that Rel sometimes gives results that are incorrect in the 18th decimal place (or a few places earlier) and I want to see all results that really stand for integers.

Do you also really mean "makes sense"?  The best I can find about your IEEE treatment is that I have found it convenient in at least one particular context.  I would hesitate to call that making sense, exactly.

Hugh

IEEE 754 sacrifices accuracy for size and performance. There are certain numbers that can't be exactly represented in IEEE 754, such as 6.1, 0.1, etc.
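
To make the representability point concrete, here is a small Python sketch using the standard fractions and decimal modules. Fraction(x) gives the exact value actually stored for a float x, which can then be compared with the decimal value that was written in the source.

```python
from decimal import Decimal
from fractions import Fraction

# Exact values of the doubles nearest to 0.1 and 6.1.
stored_tenth = Fraction(0.1)
stored_six_one = Fraction(6.1)

print("0.1 stored exactly as 1/10 :", stored_tenth == Fraction(1, 10))
print("6.1 stored exactly as 61/10:", stored_six_one == Fraction(61, 10))

# A binary float can only hold fractions whose denominator is a power of two.
print("denominator of stored 0.1 is 2**55:", stored_tenth.denominator == 2**55)

# The familiar consequence, and the same sum done in decimal arithmetic.
print("0.1 + 0.2 == 0.3 (float)  :", 0.1 + 0.2 == 0.3)
print("0.1 + 0.2 == 0.3 (Decimal):", Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))
```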

I do mean "makes sense", as there is no NaN in canonical integers. Canonical integers are intended for counters and loop iterators, so successful conversion to integer happens in as many cases as possible. Thus, it's conventional -- in many languages -- to cast text, floating point NaN, or anything not unambiguously numeric to integer zero.

My thinking is this:

  1. Replace Rel's current RATIONAL with an implementation underpinned by Java's BigDecimal, which will give arbitrary precision and the basic arithmetic operators (+ - × ÷). Other operators (e.g., standard trig?) may be provided if I can provide reasonable assurances of correct behaviour. Division by zero will throw an exception.
  2. Add FLOAT, underpinned by the canonical IEEE 754, which will have the canonical set of floating-point operators, i.e., basic arithmetic, trigonometric, etc. It will have the same functionality as that currently provided by Rel's existing RATIONAL.
  3. Provide appropriate CAST_AS_xxx operators to convert between them. Casting from FLOAT's NaN to RATIONAL will delete everything on your hard drive.

Ok, maybe not the last one. Casting from FLOAT's NaN to RATIONAL will throw an exception.
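
A minimal sketch of how the plan above might behave, written in Python with decimal.Decimal standing in for Java's BigDecimal. This is not Rel code: the names cast_float_as_rational and NotRepresentableError are hypothetical, and the choice to reject infinities as well as NaN is an assumption.

```python
import math
from decimal import Decimal

class NotRepresentableError(ValueError):
    """Hypothetical error for FLOAT values that have no RATIONAL counterpart."""

def cast_float_as_rational(x: float) -> Decimal:
    """Convert an IEEE 754 double to an exact decimal, refusing NaN and infinities."""
    if math.isnan(x) or math.isinf(x):
        raise NotRepresentableError(f"cannot cast {x!r} to RATIONAL")
    # Constructing a Decimal from a finite float is exact, with no rounding.
    return Decimal(x)

print(cast_float_as_rational(0.5))

try:
    cast_float_as_rational(float("nan"))
except NotRepresentableError as err:
    print("rejected:", err)

# Under the default decimal context, dividing a non-zero value by zero raises
# an exception (a subclass of ZeroDivisionError) instead of yielding infinity.
try:
    Decimal(1) / Decimal(0)
except ZeroDivisionError as err:
    print("division by zero raised:", type(err).__name__)
```

With this split, NaN can only ever exist on the FLOAT side, so equality on RATIONAL values stays reflexive.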

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

I recently added NaN to DuroDBMS. NaN = NaN yields true, giving RM Pre 8 priority over IEEE 754.

I have just discovered that CAST_AS_INTEGER(NaN) = -2147483648. Maybe raising an error would be better.

FLOAT and RATIONAL are synonyms. Maybe one day I will separate them, using an arbitrary-precision rational type for RATIONAL. Since DuroDBMS is based on C, the GNU Multiple Precision Arithmetic Library looks like a reasonable choice.
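
The two choices described in this post can be sketched in Python as follows. This is not DuroDBMS code: ttm_equal and cast_as_integer are hypothetical names, the 32-bit INTEGER range is an assumption suggested by the -2147483648 result, and treating every NaN as equal to every other NaN is one possible reading of giving RM Pre 8 priority.

```python
import math

INT_MIN, INT_MAX = -2**31, 2**31 - 1  # assumption: a 32-bit INTEGER type

def ttm_equal(a: float, b: float) -> bool:
    """Equality where NaN is an ordinary value that is equal to itself."""
    if math.isnan(a) and math.isnan(b):
        return True
    return a == b

def cast_as_integer(x: float) -> int:
    """Truncating cast that raises instead of returning an arbitrary integer."""
    if math.isnan(x) or math.isinf(x):
        raise ValueError(f"cannot cast {x!r} to INTEGER")
    n = int(x)  # truncates toward zero
    if not INT_MIN <= n <= INT_MAX:
        raise OverflowError(f"{x!r} is outside the INTEGER range")
    return n

nan = float("nan")
print("ttm_equal(NaN, NaN):", ttm_equal(nan, nan))
print("ttm_equal(1.0, NaN):", ttm_equal(1.0, nan))
print("cast_as_integer(2.9):", cast_as_integer(2.9))

try:
    cast_as_integer(nan)
except ValueError as err:
    print("rejected:", err)
```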

IEEE 754 sacrifices accuracy for size and performance. There are certain numbers that can't be exactly represented in IEEE 754, such as 6.1, 0.1, etc.

Yes.

I do mean "makes sense", as there is no NaN in canonical integers. Canonical integers are intended for counters and loop iterators, so successful conversion to integer happens in as many cases as possible. Thus, it's conventional -- in many languages -- to cast text, floating point NaN, or anything not unambiguously numeric to integer zero.

Yes. Zero is almost always a better fit than raising an exception. C# has int? (a nullable int), which is sometimes useful.

My thinking is this:

  1. Replace Rel's current RATIONAL with an implementation underpinned by Java's BigDecimal, which will give arbitrary precision and the basic arithmetic operators (+ - × ÷). Other operators (e.g., standard trig?) may be provided if I can provide reasonable assurances of correct behaviour. Division by zero will throw an exception.
  2. Add FLOAT, underpinned by the canonical IEEE 754, which will have the canonical set of floating-point operators, i.e., basic arithmetic, trigonometric, etc. It will have the same functionality as that currently provided by Rel's existing RATIONAL.
  3. Provide appropriate CAST_AS_xxx operators to convert between them. Casting from FLOAT's NaN to RATIONAL will delete everything on your hard drive.

Ok, maybe not the last one. Casting from FLOAT's NaN to RATIONAL will throw an exception.

RATIONAL as per BigDecimal would be a good move. You don't need trig or transcendentals or most other highfalutin maths wonk stuff. I would suggest a big subset of the methods provided by BigDecimal would cover most of the bases.

FLOAT as per IEEE is good, except where it conflicts with TTM. IMO the default should be the TTM way (so an exception instead of NaN and friends), but as a configurable option for the desperate.

Numeric type conversion has 3 sensible outcomes: (a) perfect conversion, (b) imperfect conversion and (c) not useful. My preference is to use CAST only for (a), individual named conversion functions for (b) and exceptions for everything else, including range failures of (a) and (b). So RATIONAL=>FLOAT should be a conversion function, but a CAST will succeed for integers and binary fractions (no rounding or loss of precision).

Note that a conversion function will ordinarily have additional arguments to control the conversion.
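
One way to read the (a)/(b)/(c) split is sketched below in Python, with fractions.Fraction standing in for an exact RATIONAL and float for FLOAT. The names cast_as_float, to_float, ConversionError and the max_rel_error parameter are all hypothetical, chosen only to show a strict cast next to a named conversion that takes an extra controlling argument.

```python
from fractions import Fraction

class ConversionError(ValueError):
    """Hypothetical error raised when a conversion is not acceptable."""

def cast_as_float(q: Fraction) -> float:
    """(a) Perfect conversion only: succeeds for integers and binary fractions."""
    x = float(q)  # raises OverflowError if q is too large for a double
    if Fraction(x) != q:
        raise ConversionError(f"{q} has no exact FLOAT representation")
    return x

def to_float(q: Fraction, *, max_rel_error: Fraction = Fraction(1, 10**15)) -> float:
    """(b) Imperfect conversion: nearest double, within a caller-chosen error bound."""
    x = float(q)
    if q != 0 and abs(Fraction(x) - q) / abs(q) > max_rel_error:
        raise ConversionError(f"{q} cannot be converted within {max_rel_error}")
    return x

print(cast_as_float(Fraction(3, 8)))   # a binary fraction converts exactly
print(to_float(Fraction(1, 10)))       # nearest double to one tenth

try:
    cast_as_float(Fraction(1, 10))     # not a binary fraction, so the cast refuses
except ConversionError as err:
    print("cast refused:", err)
```

Setting max_rel_error to zero makes to_float as strict as the cast, which shows how the extra argument moves a conversion between outcomes (a) and (b).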

Andl - A New Database Language - andl.org