The Forum for Discussion about The Third Manifesto and Related Matters

Codd 1970 'domain' does not mean Date 2016 'type' [was: burble about Date's IM]

Quote from Erwin on March 18, 2020, 6:34 pm
Quote from AntC on March 18, 2020, 4:56 am

Addit: What I'm doing is I think licensed by RM Pre 4 b.:

If T is system defined, then zero or more possible representations for values of type T shall be declared and thus made visible in D.

Note the "zero". There is a PhysRep for system-defined Integer; but there are zero PossReps visible in the D. The form #3 is the internal 'storage structure definition language'.

I've always found that "zero possrep" stuff misses a point that should in fact be blatantly obvious : that every type must not only always have a physrep, but also a possrep that has a single CHAR component.

And it is this very possrep that the compiler uses to parse the text string "3" where it appears as a literal (and where from context it is clear that it is supposed to represent an integer) and return the corresponding integer value.

Or put otherwise : the function performed by the compiler is exactly the same function as the one implemented in the corresponding value-selector function implied by the possrep.  And the function performed by the runtime system to render the value on an output device such as a display, is exactly the same function as the "getter" function implied by the possrep.

For primitive types with built-in string literals like 3, 3.4E12, 3.4, TRUE, "blah", etc., sure.

Not necessarily so for user-defined types like TYPE MyType POSSREP {v INT}, unless you mean for MyType to effectively (or conceptually) parse MyType(3) itself?

And likewise for TYPE MyOtherType POSSREP {v MyType} where an example literal is MyOtherType(MyType(3)) ?

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Hugh on March 18, 2020, 12:38 pm
Quote from AntC on March 18, 2020, 4:49 am
Quote from dandl on March 18, 2020, 12:55 am

Thanks Hugh, maybe not pre/proscribe, but I see in D&D texts some Pretty Strong Suggestions. Take this from DTATRM for a tuple conforming to the Heading of relvar SP

TUP{ S# S#('S3'), P# P#('P3'), QTY 3}

So I want to write program code like

var S# mySupplier;
mySupplier := 3;
mySuppTup = TUP{ mySupplier, P# 3, QTY 3 };

That is:

  • In a syntactic context expecting a supplier number, I should be able to put a numeric literal, not hang it around with duplicate type annotation.
  • Then prefixing a literal with P# or QTY should be sufficient disambiguation.

Not in general. In this case P# is ambiguous.

No I'm rejecting TTM's duplication of both a Selector P#( ) and an attribute name P#. Pick one only. And don't decorate Supplier numbers with letters, unless you want an alphabetic code in which 'P3' /= 'P0003' and Supplier 'P3' is sorted between 'P2999' and 'P30'.

Take this case, what do you write for this?

TUP{ S# S#('S3'), parentP# P#('P3'), childP# P#('P3'), childQTY 3}

No I don't want to write that. Triplication.

Did your language definition:

I didn't give any declaration forms, but let me posit these, for the sake of illustration:

DOMAIN S#  Integer;          // same PhysRep as Integer, different Nominative type
DOMAIN QTY Integer;
DOMAIN P#  Integer;

ROLE parentP# P#;           // same PhysRep as P#, 'comparable' Nominative type
ROLE childP#  P#;

ROLE childQTY QTY;

TUP{ S# 3, parentP# 3, childP# 3, childQTY 3 }

 

  a. Read 'P#' as an attribute name, '3' as an integer, and supply an implicit conversion to type P# (being the type of attribute P#)?
  b. Read 'P#' as an attribute name, '3' as a value in a specialised context provided by type P# (being the type of attribute P#)?
  c. Read 'P#' as a type name, '3' as an integer, convert the integer to a value of type P# and assign it to attribute P# positionally?

I prefer (a), but others may differ. Implicit non-lossy ('widening') conversions are safe in general, but are not for the purists.

 

No I don't see any coercions here. I can say that because to the extent I can emulate this in Haskell, there are no coercions there either. I'm using the sort of modern type theory Dave wanted me to use; but I fear nobody round here is actually up with the play. So round here you want to both reject the old stuff (Codd 1970) and ignore the new stuff (not actually new: polymorphism/overloading is in Strachey 1967, brought up to date in the ML family of languages from the 1980s, specifically Wadler 1989). I say that because I explained all this before and it fell on deaf ears.

Firstly let me reiterate that it's unusual to express values to be put in the database as literals in a program. Usually values come in from the database or from screen entry as already of the appropriate type. To make stark that this is not coercion, this is type-invalid:

var Integer threeInt;
threeInt := 3;              // accepted, not a coercion

var S# mySupplier;
mySupplier := threeInt;     // rejected type mis-match
mySupplier := 3;            // accepted, not a coercion

How does that work? Program token appearing as 3 is not Integer 3 in the FORTRAN/ALGOL/PL/1 sense. (It actually should be understandable as denoting an object with access methods in the OOP sense.) It's shorthand for a polymorphic expression fromInteger(#3), where #3 denotes the PhysRep for Integer 3, and fromInteger( ) is a built-in access method (compare toString( ), fromString( )). Perhaps you might even write it #3.fromInteger.

fromInteger( ) is polymorphic/overloaded. It's a method that takes an argument of type Integer, and returns a result of the type demanded by its program context -- i.e. the overloading. (For example, the context might demand a ShortInt or a Float or a Rational.) In cases where the context is demanding a Nominative type (DOMAIN declaration) based on Integer -- i.e. with same PhysRep as Integer, that's a no-op whose only purpose is to make the types match up.

mySupplier := threeInt; is a type mis-match because the earlier assignment to threeInt has already 'executed' the no-op fromInteger(#3), so threeInt is not polymorphic. Of course the compiler, after inferring the type of the context, should 'execute' the fromInteger( ) at compile time; so there's no run-time cost here.
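The rule described above can be modelled roughly as follows. This is only a sketch in Python of a toy checker, not Tutorial D, Rel or Haskell; the names `Domain`, `Literal`, `Value`, `from_integer` and `assign` are all invented for illustration, and the check happens at run time here whereas the post has the compiler doing it.

```python
from dataclasses import dataclass


# Hypothetical model: a domain is a nominal type sharing Integer's PhysRep.
@dataclass(frozen=True)
class Domain:
    name: str


INTEGER = Domain("Integer")
SUPPLIER_NO = Domain("S#")  # stands in for: DOMAIN S# Integer;


@dataclass(frozen=True)
class Literal:
    """A program token such as 3, read as shorthand for fromInteger(#3)."""
    physrep: int


@dataclass(frozen=True)
class Value:
    """A value whose domain has already been fixed."""
    domain: Domain
    physrep: int


def from_integer(lit: Literal, demanded: Domain) -> Value:
    # No change to the representation; only the demanded domain is attached.
    return Value(demanded, lit.physrep)


def assign(declared: Domain, rhs) -> Value:
    """Check `var := rhs` against the variable's declared domain."""
    if isinstance(rhs, Literal):
        return from_integer(rhs, declared)  # the context decides the type
    if rhs.domain != declared:
        raise TypeError(f"cannot assign {rhs.domain.name} to {declared.name}")
    return rhs


three_int = assign(INTEGER, Literal(3))        # threeInt := 3;         accepted
my_supplier = assign(SUPPLIER_NO, Literal(3))  # mySupplier := 3;       accepted
try:
    assign(SUPPLIER_NO, three_int)             # mySupplier := threeInt;
    rejected = False
except TypeError:
    rejected = True                            # rejected: type mismatch
```

The literal stays polymorphic until a context demands a domain; once `three_int` has been fixed at Integer, nothing converts it to S#.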

If I understand correctly, a domain is a UDT and an attribute whose name is a domain name, or the name of a role, has that domain, or the domain on which that role is defined, as its type.

No doubt such schemes can be made to work but my objection to them has always been the care that is needed to specify attribute names to avoid unsuspected clashes.   In TD tup{x 1} denotes a tuple whose attribute x is of type integer (equivalent to tup{x int}{x 1}).

Hugh, I put it to you that you never write bare TUP{ x 1 } in even the most ad hoc of ad hoc queries.

Attribute names appearing in queries are either from the database, or RENAME/EXTEND/SUMMARIZE/GROUP/etc of attributes from the database. As such, the compiler can always infer (what I'm calling) the DOMAIN/ROLE of an introduced attribute.

It won't work as desired if a domain or role x is already defined.

If you're writing TUP{ x 1 } (which I doubt), it's to JOIN/etc with a database attribute or query result from the database. So if x is already defined, you are deliberately using the same name and you deliberately want it to be at the same type/DOMAIN/ROLE.

Conversely, DOMAIN x char won't work if the database already has x used for an attribute somewhere, and it doesn't even have to be for an attribute in a base or virtual relvar -- it could be in some expression used in an operator or virtual relvar definition, for example.  Does that objection apply here?  Or am I missing something?

 

I think you're fabulating about how you write queries. If an ad hoc query has a RENAME/etc, I think the compiler could on the fly generate a ROLE or DOMAIN declaration with a SAME_ROLE_AS etc. Possibly you're introducing an ad hoc attribute x like that, unaware that there's already a database attribute x, and you're unaware because it's in a relvar you're not accessing in this query. I'm sure we can figure some scoping rules such that the introduced x shadows the database x.

Quote from Dave Voorhis on March 18, 2020, 11:39 pm
Quote from Erwin on March 18, 2020, 6:34 pm
Quote from AntC on March 18, 2020, 4:56 am

Addit: What I'm doing is I think licensed by RM Pre 4 b.:

If T is system defined, then zero or more possible representations for values of type T shall be declared and thus made visible in D.

Note the "zero". There is a PhysRep for system-defined Integer; but there are zero PossReps visible in the D. The form #3 is the internal 'storage structure definition language'.

I've always found that "zero possrep" stuff misses a point that should in fact be blatantly obvious : that every type must not only always have a physrep, but also a possrep that has a single CHAR component.

That doesn't sound right, and certainly not "blatantly obvious" -- perhaps you're expressing it badly.

Type Point has two components, neither of type CHAR. Just because you can represent component(s) as text in your programming language doesn't make them CHAR. My interpretation is that every representable type has a toString( ) method and a fromString( ) that is the inverse s.t. v == fromString(toString(v)), and it's the String that appears in program source.
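A minimal sketch of that round-trip property, in Python rather than any D. The `Point` class and its `to_string`/`from_string` methods are hypothetical, and the textual format `Point(x, y)` is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    # Two components, neither of them text.
    x: float
    y: float

    def to_string(self) -> str:
        # repr() of a finite float round-trips exactly through float().
        return f"Point({self.x!r}, {self.y!r})"

    @staticmethod
    def from_string(text: str) -> "Point":
        if not (text.startswith("Point(") and text.endswith(")")):
            raise ValueError(f"not a Point literal: {text!r}")
        x_text, y_text = text[len("Point("):-1].split(",")
        return Point(float(x_text), float(y_text))


v = Point(1.5, -2.0)
round_trip = Point.from_string(v.to_string())  # v == fromString(toString(v))
```

The string is only the form in which the value appears in source text; the components themselves remain numeric.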

And it is this very possrep that the compiler uses to parse the text string "3" where it appears as a literal (and where from context it is clear that it is supposed to represent an integer) and return the corresponding integer value.

Or put otherwise : the function performed by the compiler is exactly the same function as the one implemented in the corresponding value-selector function implied by the possrep.

No the TTM Selector doesn't take a CHAR argument; it takes an INT/RAT/etc. The Selector is usually applied to values of that type coming from database expressions or user input. As I keep saying, it's unusual to put value literals to represent database content in program text. TTM requires the ability, to make sure every value is representable; because the way to get a SQL Null-marked column is to fail to represent any value when inserting data into a table.

  And the function performed by the runtime system to render the value on an output device such as a display, is exactly the same function as the "getter" function implied by the possrep.

For primitive types with built-in string literals like 3, 3.4E12, 3.4, TRUE, "blah", etc., sure.

Not necessarily so for user-defined types like TYPE MyType POSSREP {v INT}, unless you mean for MyType to effectively (or conceptually) parse MyType(3) itself?

And likewise for TYPE MyOtherType POSSREP {v MyType} where an example literal is MyOtherType(MyType(3)) ?

I'm not sure whether Dave is addressing only Erwin's point; or also mine. I don't intend that DOMAINs/ROLEs are the same as types; they're "based on" types, or some such wording.

Quote from Dave Voorhis on March 18, 2020, 11:39 pm

For primitive types with built-in string literals like 3, 3.4E12, 3.4, TRUE, "blah", etc., sure.

Not necessarily so for user-defined types like TYPE MyType POSSREP {v INT}, unless you mean for MyType to effectively (or conceptually) parse MyType(3) itself?

And likewise for TYPE MyOtherType POSSREP {v MyType} where an example literal is MyOtherType(MyType(3)) ?

"Zero possreps" is only allowed for system-defined types.

I might have put it badly in the sense that imo "zero possreps" in fact means that there is an implicit possrep with a single CHAR component, which then means that the "zero" is indeed a delusion.  That leaves only CHAR to deal with (if you define the single-component CHAR possrep for that one then that gets you conceptually in an infinite loop both for writing a value selector and for parsing a written one, i.e. if you want to write a value selector for CHAR then you need to write the value selector for its CHAR component etc. etc. and ditto for the parsing process).

 

Quote from AntC on March 19, 2020, 7:54 am

 

And it is this very possrep that the compiler uses to parse the text string "3" where it appears as a literal (and where from context it is clear that it is supposed to represent an integer) and return the corresponding integer value.

Or put otherwise : the function performed by the compiler is exactly the same function as the one implemented in the corresponding value-selector function implied by the possrep.

No the TTM Selector doesn't take a CHAR argument; it takes an INT/RAT/etc. The Selector is usually applied to values of that type coming from database expressions or user input. As I keep saying, it's unusual to put value literals to represent database content in program text. TTM requires the ability, to make sure every value is representable; because the way to get a SQL Null-marked column is to fail to represent any value when inserting data into a table.

See addit : the remark was concerned only with system-defined scalar types (such as INT).  And for such types, yes the value selector does take a CHAR argument.  If there were no argument then it would be impossible to select more than one distinct value of the type.  Or the value selector function is a function of black magic.  I didn't think we were into that art form.

(To dispense with issues of physical encoding of the CHAR itself, define a CHAR as a variable-length array of UCS code point numbers.  The INT value selector function is the function that maps e.g. the array [49, 48] to the integer value 10.  And that (partial) function has CHAR as its domain, so in computer terms the argument is of type CHAR.)
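The mapping in that parenthesis can be written out as a short Python sketch. The function names `int_selector` and `int_getter` are made up for illustration, and the sketch assumes unsigned decimal digits only (no sign, no separators).

```python
from typing import List


def int_selector(code_points: List[int]) -> int:
    """Map an array of UCS code points for decimal digits to an integer.

    The function is partial: arrays that denote no integer raise ValueError.
    """
    if not code_points:
        raise ValueError("an empty CHAR denotes no integer")
    value = 0
    for cp in code_points:
        digit = cp - 48  # 48 is the code point of '0'
        if not 0 <= digit <= 9:
            raise ValueError(f"code point {cp} is not a decimal digit")
        value = value * 10 + digit
    return value


def int_getter(value: int) -> List[int]:
    """The reverse direction for non-negative integers: render as code points."""
    return [ord(ch) for ch in str(value)]


ten = int_selector([49, 48])   # the array for the characters '1', '0'
rendered = int_getter(10)
```

An array such as [120] (the letter x) falls outside the function's domain of definition, which is what makes it partial.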

Quote from AntC on March 19, 2020, 12:10 am

Hugh, I put it to you that you never write bare TUP{ x 1 } in even the most ad hoc of ad hoc queries.

How dare you make such an accusation!  Here's the very first saved Rel script that I looked at, knowing that in fact I make extensive use of tuple literals.

/* PCNoRec% of cases with at least one operand = Digit have no recalcitrants */
rel{
tup{Digit 1, PCNoRecs Percent(Count((Studied join CasesWith1) not matching Recalcitrant), count(Studied join CasesWith1))},
tup{Digit 2, PCNoRecs Percent(Count((Studied join CasesWith2) not matching Recalcitrant), count(Studied join CasesWith2))},
tup{Digit 3, PCNoRecs Percent(Count((Studied join CasesWith3) not matching Recalcitrant), count(Studied join CasesWith3))},
tup{Digit 4, PCNoRecs Percent(Count((Studied join CasesWith4) not matching Recalcitrant), count(Studied join CasesWith4))},
tup{Digit 5, PCNoRecs Percent(Count((Studied join CasesWith5) not matching Recalcitrant), count(Studied join CasesWith5))},
tup{Digit 6, PCNoRecs Percent(Count((Studied join CasesWith6) not matching Recalcitrant), count(Studied join CasesWith6))},
tup{Digit 7, PCNoRecs Percent(Count((Studied join CasesWith7) not matching Recalcitrant), count(Studied join CasesWith7))},
tup{Digit 8, PCNoRecs Percent(Count((Studied join CasesWith8) not matching Recalcitrant), count(Studied join CasesWith8))},
tup{Digit 9, PCNoRecs Percent(Count((Studied join CasesWith9) not matching Recalcitrant), count(Studied join CasesWith9))}
}
order(asc PCNoRecs)

Please let me know if you want to see any more.  No, on second thoughts, don't.

Hugh

Coauthor of The Third Manifesto and related books.
Quote from Erwin on March 19, 2020, 11:13 am
"Zero possreps" is only allowed for system-defined types.

I might have put it badly in the sense that imo "zero possreps" in fact means that there is an implicit possrep with a single CHAR component, which then means that the "zero" is indeed a delusion.  That leaves only CHAR to deal with (if you define the single-component CHAR possrep for that one then that gets you conceptually in an infinite loop both for writing a value selector and for parsing a written one, i.e. if you want to write a value selector for CHAR then you need to write the value selector for its CHAR component etc. etc. and ditto for the parsing process).

 

I see what you mean about "zero possreps".

I didn't understand it before.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Hugh on March 19, 2020, 12:39 pm
Quote from AntC on March 19, 2020, 12:10 am

Hugh, I put it to you that you never write bare TUP{ x 1 } in even the most ad hoc of ad hoc queries.

How dare you make such an accusation!  Here's the very first saved Rel script that I looked at, knowing that in fact I make extensive use of tuple literals.

/* PCNoRec% of cases with at least one operand = Digit have no recalcitrants */
rel{
tup{Digit 1, PCNoRecs Percent(Count((Studied join CasesWith1) not matching Recalcitrant), count(Studied join CasesWith1))},
...
}
order(asc PCNoRecs)

Please let me know if you want to see any more.  No, on second thoughts, don't.

I agree that code is in no way easy on the eye. Presumably it's within a finger-slip to get

tup{Digit 1, PCNoRecs Percent(Count((Studied join CasesWith2) not matching Recalcitrant), count(Studied join CasesWith3))}

I'd count this amongst my earlier comments "I do want to make the coding sufficiently verbose for the programmer to stop and ask themselves whether they might be doing something daft." You are doing something daft.

  • I think my "accusation" stands: that expression is not a query but data entry.
  • I said "If you're writing TUP{ x 1 } , it's to JOIN/etc with a database attribute or query result from the database"; and that is indeed what's going on: Digit 1 is joining to the 1 in CasesWith1, which presumably deeply buried within its content has a Digit 1 (or perhaps a CHAR with a 1 amongst others).
  • I'm interested what limits to expressivity in Rel you're bumping into such that you're encoding data content into the name of a relvar. Does that tell us there's something wrong with the RM, or merely wrong with your schema design?
  • Can you for example in Rel get the relvar name from the catalogue (or all of the form CasesWithn), cast it to CHAR, extract the last character, cast that to INT? Can you at the same time put that relvar name into a query expression to obtain your Percent and Count statistics? You'll have seen complaints on the forum that TTM's relvar names (and attribute names) are not first-class in that you can't pass them as arguments to functions. You seem to have found the perfect illustration of the need -- that is, if there's not something wrong with your schema.

 

Lastly (and I nearly included this in my 'specification'), if the x 1 is not Joining/etc to any other attribute x, or to something RENAMEd to x, coming from the database, then it doesn't matter what DOMAIN the 1 belongs to, so it might as well default to Integer. (This also is behaviour already supported in Haskell.)

(To dispense with issues of physical encoding of the CHAR itself, define a CHAR as a variable-length array of UCS code point numbers.  The INT value selector function is the function that maps e.g. the array [49, 48] to the integer value 10.  And that (partial) function has CHAR as its domain, so in computer terms the argument is of type CHAR.)

This is a misunderstanding/misrepresentation of what a compiler does. The token 10 is recognised in the text of the program by reference to some kind of specification and assigned to a token type very early in the process. In a language like C the tokens 10, 0x0a and 012 are all tokens of type integer with the value of 10, but the token "10" is of type string. The function as defined by the specification for the compiler in the case of SomeSelector(10) has a domain of integer, while SomeSelector("10") has a domain of character. They should not be confused.

FWIW a compiler will only ever define a handful of token types (eg 10, 10.0, 10e0, 10d, "10", '10', @"10", $'10', X10, etc) even if it has countless types in its type system.
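To make the token-type point concrete, here is a small Python sketch of a classifier for C-style literals. The rule table and the `classify` function are hypothetical and cover only two token types; no real compiler's lexer is being quoted.

```python
import re

# Hypothetical token rules for a C-like language: two token types only.
TOKEN_RULES = [
    ("string", re.compile(r'"([^"\\]*)"')),
    ("integer", re.compile(r"0[xX][0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*")),
]


def classify(token: str):
    """Return (token_type, value) for a single literal token."""
    for token_type, pattern in TOKEN_RULES:
        match = pattern.fullmatch(token)
        if match is None:
            continue
        if token_type == "string":
            return token_type, match.group(1)
        if token[:2].lower() == "0x":
            return token_type, int(token, 16)  # hexadecimal
        if token.startswith("0") and len(token) > 1:
            return token_type, int(token, 8)   # C-style octal
        return token_type, int(token, 10)      # decimal
    raise ValueError(f"unrecognised token: {token!r}")


results = [classify(t) for t in ["10", "0x0a", "012", '"10"']]
```

Three different spellings arrive at the same integer token, while the quoted form is typed as a string before any selector is applied to it.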

Andl - A New Database Language - andl.org
Quote from AntC on March 19, 2020, 9:10 pm
I agree that code is in no way easy on the eye. Presumably it's within a finger-slip to get

tup{Digit 1, PCNoRecs Percent(Count((Studied join CasesWith2) not matching Recalcitrant), count(Studied join CasesWith3))}

I'd count this amongst my earlier comments "I do want to make the coding sufficiently verbose for the programmer to stop and ask themselves whether they might be doing something daft." You are doing something daft.

 

Am I being too cruel in critiquing Hugh's schema and opining there's something daft? TTM-ers do this all the time with others' schemas: typically somebody has gone deep down a rabbit hole with a schema relying on Nullable columns, and then asks how to express a query that seems impossible. We had on the forum recently an example 'Company Cars' of a schema where the designer's head was full of Nulls and SQL's inability to express Exclusion Dependencies. It sometimes takes a lot of careful explaining to back the designer out of the rabbit hole.

So I can only guess how Hugh got down this rabbit hole. Why on earth are there nine separate relvars with (presumably) the same schema, differing only in the relvar name? Why on earth is there any expression at all with nine near-identical lines of code differing only in which relvar name they're accessing? Clearly Digit is reference data in this application (dare I say a DOMAIN) and the nine Digits should be in a reference relvar (yes even though we all know what the nine digits are). Clearly whatever these CasesWithn relvars are, their content should be in one relvar with an extra attribute Digit. Clearly that 9-line expression should be written as one line with the Digit drawn from the Digit reference relvar, and Joined to the CasesWithn relvar.

I'll speculate (remembering from when Hugh earlier described his 'four fours' experiment) there's a relvar holding formulas as CHAR. A formula might include many appearances of some digit(s). Then CasesWithn should be a Virtual:

CasesWithn := (Digits TIMES Cases) WHERE Cast_to_CHAR( Digit ) isSubStringOf( Formula );

And Hugh's ugly nine lines should be a straightforward SUMMARIZE CasesWithn ... GROUP BY Digit ....
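The shape of that redesign can be sketched in Python with sets standing in for relvars. Everything here is an assumption for illustration: the sample formulas, the case ids, and the contents of `studied` and `recalcitrant` are invented, and the real headings of Studied, Cases and Recalcitrant are not known from the thread.

```python
from collections import defaultdict

# Hypothetical data: (case id, formula as text). Not Hugh's actual content.
cases = {(1, "4+4"), (2, "4*3"), (3, "3-3"), (4, "2+3")}
studied = {1, 2, 3, 4}
recalcitrant = {2}
digits = range(1, 10)  # the Digits reference relvar

# One virtual relvar with a Digit attribute, replacing CasesWith1..CasesWith9.
cases_with = {(d, case_id)
              for d in digits
              for (case_id, formula) in cases
              if str(d) in formula}

# Group by Digit: count studied cases, and those not matching Recalcitrant.
totals = defaultdict(int)
no_recs = defaultdict(int)
for d, case_id in cases_with:
    if case_id in studied:
        totals[d] += 1
        if case_id not in recalcitrant:
            no_recs[d] += 1

# Equivalent of the nine tuple literals followed by order(asc PCNoRecs).
pc_no_recs = sorted(((d, 100.0 * no_recs[d] / totals[d]) for d in totals),
                    key=lambda row: row[1])
```

One difference from the nine literal tuples is that digits with no studied cases simply do not appear in the result, rather than needing a tuple each.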

I'll speculate further that Hugh originally designed a schema purely for the 'four fours' cases, where there was no need for a Digits reference relvar, because it would only have contained a four. So when he expanded the exercise to other digits, he merely cloned the Cases schema to CasesWith3, CasesWith2, etc.