First and second class citizens in TTM

#21 · November 9, 2018, 1:18 pm

Rel doesn't allow that, either. The literal 2 must be 2.0. All literals must be fully specified and their types unambiguously recognisable on sight, so to speak.

(straying off the point) There are good reasons for it to be that way, but my intention with Andl was always to have only a single kind of number: simply a string of digits with a decimal point, just like you might write it. Integer, decimal and floating point values are all represented directly. The literals 2 and 2.0 and 0002.0000 represent the very same value (and type).

IMO an integer data type is completely unnecessary: all you need are operators div, round and trunc. Some operators will round or truncate their result at known (or specified) boundaries (eg division, transcendentals).

I leave efficient implementation as an exercise for the reader.

Andl - A New Database Language - andl.org

#22 · November 9, 2018, 3:19 pm

Quote from dandl on November 9, 2018, 1:18 pm
Rel doesn't allow that, either. The literal 2 must be 2.0. All literals must be fully specified and their types unambiguously recognisable on sight, so to speak.
(straying off the point) There are good reasons for it to be that way, but my intention with Andl was always to have only a single kind of number: simply a string of digits with a decimal point, just like you might write it. Integer, decimal and floating point values are all represented directly. The literals 2 and 2.0 and 0002.0000 represent the very same value (and type).

IMO an integer data type is completely unnecessary: all you need are operators div, round and trunc. Some operators will round or truncate their result at known (or specified) boundaries (eg division, transcendentals).

I leave efficient implementation as an exercise for the reader.

I believe Oracle did that (single numeric type) too, and possibly still does. We had to do some weasel wording in the international standard to make their implementation appear conforming.

And we did it in BS12, though internally we used 32-bit binary for integer-only numeric columns in catalog tables.

Hugh

Coauthor of The Third Manifesto and related books.

#23 · November 9, 2018, 3:51 pm

Quote from Hugh on November 9, 2018, 12:41 pm

But the relation would be of cardinality approaching 49,500 (the number of cases GJ and I had to study to complete the survey) ...

8-O

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#24 · June 7, 2019, 10:28 pm

If an attribute or variable is of type OPERATOR (i INT, j INT, k INT, l INT) RETURNS INT, it seems a shame to have to write out the signature again in every assignment to that attribute or variable. I realise of course that the signature is required in a literal denoting an operator but I just wondered if the burden could be alleviated in some way (without violating RM Pre 26).

Hugh

I appreciate the desire for brevity, but that would imply a coercion notionally equivalent to being able to perform an assignment like:
VAR x RATIONAL;
x := 2;
Rel doesn't allow that, either. The literal 2 must be 2.0. All literals must be fully specified and their types unambiguously recognisable on sight, so to speak.
As I expected. And even if coercion were supported (which I would definitely oppose) we would need another type for operator body.

Hugh

This is not actually coercion, but a modest kind of type inference, such as Java and C now have: VAR x := 2; declares x and gives it an initial value, but the type of x is inferred by the compiler from the manifest type of the value that initializes it. It is exactly equivalent to the above example, but less verbose, especially when type names are very long.

#25 · June 7, 2019, 11:26 pm

Quote from johnwcowan on June 7, 2019, 10:28 pm
If an attribute or variable is of type OPERATOR (i INT, j INT, k INT, l INT) RETURNS INT, ...

I appreciate the desire for brevity, but that would imply a coercion notionally equivalent to being able to perform an assignment like:
VAR x RATIONAL;
x := 2;
Rel doesn't allow that, either. The literal 2 must be 2.0. All literals must be fully specified and their types unambiguously recognisable on sight, so to speak.
This is not actually coercion, but a modest kind of type inference, such as Java and C now have: VAR x := 2; declares x and gives it an initial value, but the type of x is inferred by the compiler from the manifest type of the value that initializes it. It is exactly equivalent to the above example, but less verbose, especially when type names are very long.

And what is the 'manifest type' of 2? Is it a byte, a short Int, a full-word Int, an arbitrary-precision Integer, a Float, a Double, a Rational (with denominator 1), a Complex ...? Compare

Float pi := 3.14

Float twopi := pi * 2 -- 2 must be Float

Int firstprime := 2 -- 2 must be Int

Note that Tutorial D supports only two number types: INTEGER (abbreviated INT), RATIONAL (abbreviated RAT). As Hugh says, for numeric literals the 'manifest type' is shown by whether there's a fractional part (possibly .0).

For an Industrial strength D, you need to support many number types. You could go decorating numerical literals with all sorts of hieroglyphics to differentiate the types. Or you could say token 2 is polymorphic: it takes its type from the context, just as +, * can apply for different numeric types. That's also not coercion but a "modest kind of type inference"; without needing 'manifest types', only declared types.

#26 · June 8, 2019, 1:39 am

Quote from johnwcowan on June 8, 2019, 1:39 am

For an Industrial strength D, you need to support many number types. You could go decorating numerical literals with all sorts of hieroglyphics to differentiate the types. Or you could say token 2 is polymorphic: it takes its type from the context, just as +, * can apply for different numeric types. That's also not coercion but a "modest kind of type inference"; without needing 'manifest types', only declared types.

A couple of other approaches likewise seem plausible. As one example: let a numeric literal without a decimal point or exponent be an integer, one with a decimal point but no exponent be an exact rational number (that is, arithmetic operations on it produce mathematically exact results or an exception, except division which must be rounded to some number of decimal places), and one that has an exponent with or without a decimal point be a float: that is, either a rational number on which arithmetic operations do not always produce exact results or one of the special cases -0.0 +Inf -Inf NaN. The letter(s) in the exponent might represent common combinations of <base, range, precision> for floats, like F E Q for binary float32, float64, and float128 respectively, and DF DE DQ for the analogous decimal floats.

Golang takes the view that all integer literals (and compile-time operators applied to them) have arbitrary precision, and only variables have subtypes such as signed int16 or unsigned int32. If the compile-time result is assigned to an integer type too small for it, it is a compile-time error. It has integers and floats but not exact rationals.

Floats, by the way, can be understood mathematically as a finite set of rational intervals, where each interval includes a nominal value and all other rational numbers until halfway to the nominal values of the next smallest and next largest rational interval. However, the float 0.0 extends from just below the smallest positive nominal value down to exact 0, and the float -0.0 extends from just above the largest negative nominal value up to exact 0. Similarly, +Inf and -Inf are intervals extending from (affine) positive infinity down to just above the largest positive nominal value and from (affine) negative infinity up to just below the smallest negative nominal value. Finally, NaN is the union of two intervals, the empty interval (which is why 0.0 / 0.0 is NaN) and the doubly infinite universal interval (which is why +Inf + -Inf is also NaN).
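
The interval reading above can be sketched in Python. This is only an illustration, assuming IEEE 754 binary64 floats and Python 3.9 or later (for `math.nextafter`); the names `nominal`, `inside` and `outside` are made up for the example.

```python
import math
from fractions import Fraction

x = 0.1
up = math.nextafter(x, math.inf)       # the next larger float after 0.1

nominal = Fraction(x)                  # exact rational value actually stored for 0.1
half_gap = (Fraction(up) - nominal) / 2

inside = nominal + half_gap / 2        # less than halfway to the next float
outside = nominal + 3 * half_gap / 2   # more than halfway to the next float

# Converting a rational to float rounds to the nearest nominal value,
# so every rational in the same interval collapses to the same float.
same_interval = float(inside) == x
next_interval = float(outside) == up

# The exact rational 1/10 is not the nominal value, but it lies in its interval.
tenth_is_not_nominal = nominal != Fraction(1, 10)
tenth_in_interval = float(Fraction(1, 10)) == x

# +Inf + -Inf has no single interval as its result, hence NaN.
inf_sum_is_nan = math.isnan(math.inf + -math.inf)
```

Python raises `ZeroDivisionError` for `0.0 / 0.0` rather than returning NaN, so that case is not shown here.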

#27 · June 8, 2019, 4:47 am

Quote from johnwcowan on June 8, 2019, 1:39 am

A couple of other approaches likewise seem plausible.

At least. My preference is this.

A number is a zero, or it's a possible minus sign followed by a string of digits containing a single decimal point and no leading or trailing zeros. The digits may be in any base, but base 10 is useful for conceptual purposes.
If the decimal point is in the very last position, it's an integer.
Some of those strings may be represented compactly as binary integers, binary floating point, or otherwise.

This representation is universal and canonical. Implementation is left as a detail to be supplied.
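
As a rough sketch of the canonical form described above (base 10 only): the function names `canonical` and `is_integer` are hypothetical, and this is not taken from Andl. It only shows that 2, 2.0 and 0002.0000 reduce to the same string.

```python
def canonical(literal: str) -> str:
    """Reduce a base-10 literal to the canonical digit string.

    The result is "0", or an optional minus sign followed by digits that
    contain exactly one decimal point and no leading or trailing zeros.
    """
    text = literal.strip()
    negative = text.startswith("-")
    if text[:1] in "+-":
        text = text[1:]

    whole, _, frac = text.partition(".")
    digits = whole + frac
    # Rejects empty input, a second decimal point, and non-digit characters.
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a numeric literal: {literal!r}")

    whole = whole.lstrip("0")
    frac = frac.rstrip("0")
    if not whole and not frac:
        return "0"
    return ("-" if negative else "") + whole + "." + frac


def is_integer(canon: str) -> bool:
    """An integer is zero, or has the decimal point in the very last position."""
    return canon == "0" or canon.endswith(".")
```

With this sketch, `canonical("2")`, `canonical("2.0")` and `canonical("0002.0000")` all give `"2."`, while `canonical("-0.50")` gives `"-.5"`.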

Andl - A New Database Language - andl.org

#28 · June 8, 2019, 7:29 am

Quote from johnwcowan on June 7, 2019, 10:28 pm
If an attribute or variable is of type OPERATOR (i INT, j INT, k INT, l INT) RETURNS INT, it seems a shame to have to write out the signature again in every assignment to that attribute or variable. I realise of course that the signature is required in a literal denoting an operator but I just wondered if the burden could be alleviated in some way (without violating RM Pre 26).

Hugh

I appreciate the desire for brevity, but that would imply a coercion notionally equivalent to being able to perform an assignment like:
VAR x RATIONAL;
x := 2;
Rel doesn't allow that, either. The literal 2 must be 2.0. All literals must be fully specified and their types unambiguously recognisable on sight, so to speak.
As I expected. And even if coercion were supported (which I would definitely oppose) we would need another type for operator body.

Hugh
This is not actually coercion, but a modest kind of type inference, such as Java and C now have: VAR x := 2; declares x and gives it an initial value, but the type of x is inferred by the compiler from the manifest type of the value that initializes it. It is exactly equivalent to the above example, but less verbose, especially when type names are very long.

I'm not sure how that applies to the anonymous operator. It is possible in Tutorial D to use INIT in a manner similar to VAR x := 2, but I intended to highlight the distinction between 2 and 2.0, and coercion would be required if 2 and 2.0 were not otherwise distinct.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#29 · June 8, 2019, 2:19 pm

Quote from dandl on June 8, 2019, 4:47 am

Quote from johnwcowan on June 8, 2019, 1:39 am

A couple of other approaches likewise seem plausible.

Some of those strings may be represented compactly as binary integers, binary floating point, or otherwise.

This representation is universal and canonical. Implementation is left as a detail to be supplied.

Unfortunately that's not the case. Indeed, the overloading of the arithmetic operators on INT and RAT arguably contradicts IM 17, with its requirement that overloaded procedures have the same semantics. 1/10 + 1/10 + ... 1/10 is reliably 1, but 0.1 + 0.1 + ... 0.1 is in fact 0.9999999999999999 (though some libraries may incorrectly print 1.0) and not equal to 1.0. As I wrote above, floats are in fact not numbers but highly specific intervals with a separate arithmetic defined on them.
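
The difference is easy to reproduce. A minimal Python sketch, assuming IEEE 754 binary64 floats and using `fractions.Fraction` as a stand-in for exact rationals:

```python
from fractions import Fraction

float_total = 0.0
exact_total = Fraction(0)

# Plain repeated addition, ten times each.
for _ in range(10):
    float_total += 0.1
    exact_total += Fraction(1, 10)

float_hits_one = float_total == 1.0   # False: the float total is 0.9999999999999999
exact_hits_one = exact_total == 1     # True: exact rational arithmetic gives exactly 1
```

The loop is written out by hand rather than using the built-in `sum`, because newer Python versions compensate for rounding error when summing floats, which would hide the effect.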

#30 · June 9, 2019, 7:02 am

Quote from johnwcowan on June 8, 2019, 2:19 pm

Some of those strings may be represented compactly as binary integers, binary floating point, or otherwise.

This representation is universal and canonical. Implementation is left as a detail to be supplied.

Unfortunately that's not the case. Indeed, the overloading of the arithmetic operators on INT and RAT arguably contradicts IM 17, with its requirement that overloaded procedures have the same semantics. 1/10 + 1/10 + ... 1/10 is reliably 1, but 0.1 + 0.1 + ... 0.1 is in fact 0.9999999999999999 (though some libraries may incorrectly print 1.0) and not equal to 1.0. As I wrote above, floats are in fact not numbers but highly specific intervals with a separate arithmetic defined on them.

Nothing you say here shows otherwise. Given a single NUMBER type there is no overloading. If you want to emulate integer or floating point arithmetic operations then you have to constrain the result of some operations explicitly, but they are still all the same type. So:

"111." + "222.2" => "333.2". It's up to you whether you want an operator that explicitly truncates or rounds the fractional part, but the type never changes.

It is not generally possible to represent fractions exactly as single values. Either you represent them explicitly as fractions (possibly simplified) or you live with rounding issues.

As I wrote above, floats are simply a subset of values represented as type NUMBER. If you want operations on them to treat them as intervals then you can do so, but that property is inherent in the operator, not in the value itself.
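
A small sketch of that idea, using Python's `decimal.Decimal` purely as a stand-in for a single NUMBER type (an assumption for illustration, not how Andl does it): truncation and rounding are explicit operators, and the result type never changes.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

a = Decimal("111.")
b = Decimal("222.2")

total = a + b                                               # 333.2
truncated = total.to_integral_value(rounding=ROUND_DOWN)    # 333
rounded = (total + Decimal("0.5")).to_integral_value(
    rounding=ROUND_HALF_EVEN
)                                                           # 333.7 rounds to 334

# Every result is still the same single numeric type.
all_same_type = type(total) is type(truncated) is type(rounded) is Decimal
```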

Andl - A New Database Language - andl.org

The Forum for Discussion about The Third Manifesto and Related Matters