Operators on tuples/'Expressively complete'/RM Pre 6

#21 · October 1, 2022, 12:25 pm

Quote from Erwin on October 1, 2022, 12:25 pm

Quote from AntC on October 1, 2022, 10:18 am

Note ** "THE_VALUE( ) from a singleton tuple anonymously". I believe Tutorial D allows this -- or at least allows it to apply to relation values. But then also acknowledges that might fail at run-time because the relation value is not a singleton. Since my proposed operators are polymorphic (don't need to mention specific attribute names or fully stipulate Headings); I can't guarantee the result from a nesting of operators is a tuple with a single attribute. (And therefore also don't know the anonymous attribute(s) type(s)) So I'm not supporting THE_VALUE( ). Is that an insurmountable limitation?

I would certainly not regard it as such. After decades of using system defaults, and similar language concepts, too eagerly only to have them come back at me with a vengeance, I am very much on the side that programmers must be explicit about everything. And I welcome languages that force this attitude onto them more than I welcome languages that allow them to continue to be lax (repeating my own past mistakes).

(BTW "TUPLE FROM <relation>" does not destroy type inference, which is what makes it acceptable as far as Tutorial D goes (and other languages that accept runtime exceptions as an inevitable fact of life).)

EDIT : "ATTRIBUTE FROM <tuple>" (as a Tutorial-D-like syntax for the operator under consideration) does not destroy type inference either, for the very same reason, as a matter of fact, given that if the tuple type is known then either :

- the degree is known to be not equal to one, and a compile-time error can be raised (as opposed to the TUPLE FROM case, where it's not degree but cardinality that's supposed to be equal to one, and that's necessarily a runtime datum)

- the degree is known to be equal to one, and then the return type of the "ATTRIBUTE FROM" expression is known to be necessarily the same as that of the sole attribute in question.

That said, I still don't see a practical case for "ATTRIBUTE FROM" because everything that is known to the compiler can also be known to the programmer, and per my expressed preference, I like programmers having to be explicit about everything. In practice : a maintenance programmer seeing "ATTRIBUTE FROM" must now turn to the tuple declaration to know what the return type is. That declaration is not necessarily on the same page of code as the "ATTRIBUTE FROM" reference (or must be "mentally manually calculated" by said maintenance programmer if it's a complex tuple-typed expression). Either option will slow down the maintenance programmer's operation (work speed) significantly.
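
A minimal sketch of the degree-versus-cardinality distinction above, written in Python rather than Tutorial D. Every name here is hypothetical (`attribute_from_type`, `tuple_from`, headings modelled as dicts of attribute name to type); it only illustrates that the first check needs nothing but the heading, while the second needs the relation value itself.

```python
from typing import Any, Dict, List

# Hypothetical model: a heading maps attribute names to types,
# a tuple maps attribute names to values, a relation is a list of tuples.
Heading = Dict[str, type]


def attribute_from_type(heading: Heading) -> type:
    """'Compile-time' step: needs only the heading, never a value."""
    if len(heading) != 1:
        raise TypeError(f"ATTRIBUTE FROM needs degree 1, got degree {len(heading)}")
    (attr_type,) = heading.values()
    return attr_type


def tuple_from(relation: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Run-time step: cardinality is only known once the relation value exists."""
    if len(relation) != 1:
        raise ValueError(f"TUPLE FROM needs cardinality 1, got {len(relation)}")
    return relation[0]
```

Under these assumptions, `attribute_from_type({"QTY": int})` yields `int` without any tuple value in sight, whereas `tuple_from` can only fail or succeed once it is handed an actual relation.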

Author of SIRA_PRISE

#22 · October 1, 2022, 1:22 pm

Quote from AntC on October 1, 2022, 10:18 am

I think the position within TTM is that attribute names are not values, and vice versa.

I think that's a mischaracterization, and thinking a bit further, I think it must even necessarily be wrong. If a system does not support attribute names as values of a type, then what type is going to be used in the catalog to document everything related to relvars and their names, attributes and their names, types and their names, operators and their names, etc. etc ? So on the contrary, I think it is inevitable that attribute names must be dealt with (at least at times and in certain contexts) as values of some NAME type. It's just that there's no room for "arbitrary expressions of type NAME" in certain contexts where "literals of type NAME" are [and indeed must be] allowed. And if you must do it "at least at times and in certain contexts", then you can really just as well do it (treat names as values of a type) at the other times and in the other contexts too. It's not that doing so changes anything to the significance of "being a name".

Author of SIRA_PRISE

#23 · October 1, 2022, 11:43 pm

Quote from Erwin on October 1, 2022, 1:22 pm

Quote from AntC on October 1, 2022, 10:18 am

I think the position within TTM is that attribute names are not values, and vice versa.

I think that's a mischaracterization, and thinking a bit further, I think it must even necessarily be wrong. If a system does not support attribute names as values of a type, then what type is going to be used in the catalog to document everything related to relvars and their names, attributes and their names, types and their names, operators and their names, etc. etc ?

And I think that's a misunderstanding of the nature and role of TTM. While TTM/D is Turing Complete and sufficient for any application programming purpose it is not required to fulfil the role of a systems programming language. There are multiple aspects of the spec that require or imply the existence of other language features operating at the physical or implementation level doing things that TTM does not permit.

Specifically, a practical TTM/D might be programmed in Java, and access to attribute names as values might only be exposed as a Java API.

Andl - A New Database Language - andl.org

#24 · October 2, 2022, 9:44 am

Quote from dandl on October 1, 2022, 11:43 pm

Quote from Erwin on October 1, 2022, 1:22 pm

Quote from AntC on October 1, 2022, 10:18 am

I think the position within TTM is that attribute names are not values, and vice versa.

I think that's a mischaracterization, and thinking a bit further, I think it must even necessarily be wrong. If a system does not support attribute names as values of a type, then what type is going to be used in the catalog to document everything related to relvars and their names, attributes and their names, types and their names, operators and their names, etc. etc ?

And I think that's a misunderstanding of the nature and role of TTM. While TTM/D is Turing Complete and sufficient for any application programming purpose it is not required to fulfil the role of a systems programming language. There are multiple aspects of the spec that require or imply the existence of other language features operating at the physical or implementation level doing things that TTM does not permit.

Specifically, a practical TTM/D might be programmed in Java, and access to attribute names as values might only be exposed as a Java API.

I said nothing about the particulars of "being a systems programming language". I observed there is a requirement for having a catalog, that that catalog is supposed to document which attributes participate in which relvars, and that therefore there is an inevitable need for a type, the values of which represent attribute names and relvar names. No doubt you are going to say "String (or CHAR) will do" but imo that is flawed design. Not all valid CHAR values will be valid attribute/relvar names.

Author of SIRA_PRISE

#25 · October 2, 2022, 12:09 pm

Specifically, a practical TTM/D might be programmed in Java, and access to attribute names as values might only be exposed as a Java API.

I said nothing about the particulars of "being a systems programming language". I observed there is a requirement for having a catalog, that that catalog is supposed to document which attributes participate in which relvars, and that therefore there is an inevitable need for a type, the values of which represent attribute names and relvar names. No doubt you are going to say "String (or CHAR) will do" but imo that is flawed design. Not all valid CHAR values will be valid attribute/relvar names.

Again I say no. There is indeed a requirement for a catalog, and the catalog does indeed define the heading for each relvar, but if the catalog exposes that information in the form of relations, that exposed name will by necessity be the same string data type as any other.

But the means by which a new entry is made in the catalog is a <database relation var def> (see TD p12) which in turn is based on a <relation type spec>. Any valid name that satisfies the TTM/D compiler is a valid attribute name.

The 'systems programming language' is required to close the loop from the TTM/D compiled relvar type spec to the entry in the catalog. Constructing relvar types in which attribute names are variables is not part of TTM, perhaps with good reason.

Andl - A New Database Language - andl.org

#26 · October 2, 2022, 12:38 pm

Quote from dandl on October 2, 2022, 12:09 pm

if the catalog exposes that information in the form of relations, that exposed name will by necessity be the same string data type as any other.

...

Constructing relvar types in which attribute names are variables is not part of TTM, perhaps with good reason.

"By necessity" ... Please show the necessity. Or perhaps no, don't bother. You're just hopeless anyway. The set of valid names is a proper subset of the set of valid strings. Therefore, using type CHAR for working with things of which it is known they can only be valid names, is a dreadful design mistake. Not that I'm surprised you'd be making it the first occasion you get, of course.

"Relvar types in which attribute names are variables" ... Where the hell have you seen anyone talking about any such concept ?

Author of SIRA_PRISE

#27 · October 2, 2022, 12:56 pm

Quote from dandl on October 2, 2022, 12:09 pm

But the means by which a new entry is made in the catalog is a <database relation var def> (see TD p12) which in turn is based on a <relation type spec>. Any valid name that satisfies the TTM/D compiler is a valid attribute name.

You are talking about Tutorial D. I am talking about TTM.

And "satisfies the TTM/D compiler" is a predicate that establishes the NAME type of the language at hand. This is so regardless of what the rules are for "satisfying the TTM/D compiler". The compiler has a NAME type whether you like it or not. Not using it in the catalog and not exposing it as such in the catalog is just plain stupid.

The effects you'd be getting if users start querying the catalog to find, e.g. all the relvars that have an attribute named 'RELVARNAME' (the query might be something like 'RELVARATTRIBUTE WHERE ATTRIBUTENAME = "RELVARNAME" {RELVARNAME}') are as follows :

- if type CHAR is used for the name-attributes and the user queries with an invalid name then the only result he'd be getting back, is the answer "no such relvars found"

- but if a NAME type is used for the name-attributes and the user queries with an invalid name then his WHERE clause would have to be something like WHERE ATTRIBUTENAME = NAME("RELVARNAME") and if "RELVARNAME" were an invalid name then this would give rise to the value selector invocation raising an error perhaps saying "invalid name RELVARNAME specified".

Which of the two do you think is the best option ?
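
The two outcomes can be sketched in Python. This is an illustration under stated assumptions, not any existing catalog: the `Name` class, the lexical rule in `_NAME_RULE`, the toy catalog contents and both query functions are all invented for the example.

```python
import re

# Assumption: a valid name is a letter followed by letters, digits or underscores.
_NAME_RULE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


class Name:
    """Hypothetical NAME type; its selector rejects anything that is not a valid name."""

    def __init__(self, text: str) -> None:
        if not _NAME_RULE.fullmatch(text):
            raise ValueError(f"invalid name {text!r} specified")
        self.text = text

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Name) and self.text == other.text

    def __hash__(self) -> int:
        return hash(self.text)


# Toy catalog of (relvar name, attribute name) pairs, once per typing choice.
CATALOG_CHAR = [
    ("RELVAR", "RELVARNAME"),
    ("RELVARATTRIBUTE", "RELVARNAME"),
    ("RELVARATTRIBUTE", "ATTRIBUTENAME"),
]
CATALOG_NAME = [(Name(r), Name(a)) for r, a in CATALOG_CHAR]


def relvars_with_char(attribute_name: str) -> set:
    # Any string is accepted; an invalid name simply matches nothing.
    return {r for r, a in CATALOG_CHAR if a == attribute_name}


def relvars_with_name(attribute_name: Name) -> set:
    # The caller had to build a Name first, so invalid input never gets this far.
    return {r.text for r, a in CATALOG_NAME if a == attribute_name}
```

With the CHAR-typed catalog, `relvars_with_char("RELVAR NAME!")` quietly returns an empty set; with the NAME-typed one, the caller cannot even construct `Name("RELVAR NAME!")` without getting an error.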

Author of SIRA_PRISE

#28 · October 2, 2022, 1:32 pm

Quote from dandl on October 2, 2022, 1:32 pm

Quote from Erwin on October 2, 2022, 12:56 pm

Quote from dandl on October 2, 2022, 12:09 pm

But the means by which a new entry is made in the catalog is a <database relation var def> (see TD p12) which in turn is based on a <relation type spec>. Any valid name that satisfies the TTM/D compiler is a valid attribute name.

You are talking about Tutorial D. I am talking about TTM.

I am talking about a TTM/D compiler, of which an extant example is TD. RM Pre 7 makes it clear that a relation type is something generated at compile time from a heading (RM Pre 9), which defines the attribute name A. Attributes can be whatever you like, as long as they're distinct and the compiler accepts them. Numbers? Emoji? Pictographs?

And "satisfies the TTM/D compiler" is a predicate that establishes the NAME type of the language at hand. This is so regardless of what the rules are for "satisfying the TTM/D compiler". The compiler has a NAME type whether you like it or not. Not using it in the catalog and not exposing it as such in the catalog is just plain stupid.

No, not at all. "satisfies the TTM/D compiler" is a lexical requirement, not a type system requirement. The only mention of the catalog is RM Pre 25, and that sets no requirements as to attribute names. It would be perfectly possible (for example) for the compiled source code to use English attribute names and the catalog to translate them all according to the local preferences, Greek for example. You may think it's stupid, but it's certainly permitted.

The effects you'd be getting if users start querying the catalog to find, e.g. all the relvars that have an attribute named 'RELVARNAME' (the query might be something like 'RELVARATTRIBUTE WHERE ATTRIBUTENAME = "RELVARNAME" {RELVARNAME}') are as follows :

if type CHAR is used for the name-attributes and the user queries with an invalid name then the only result he'd be getting back, is the answer "no such relvars found"

but if a NAME type is used for the name-attributes and the user queries with an invalid name then his WHERE clause would have to be something like WHERE ATTRIBUTENAME = NAME("RELVARNAME") and if "RELVARNAME" were an invalid name then this would give rise to the value selector invocation raising an error perhaps saying "invalid name RELVARNAME specified".

Which of the two do you think is the best option ?

I said before that I would expect a catalog that exposes its contents as relations would do so in terms of strings, but that places no obligations on the compiler to accept the same strings or indeed to provide any formal type for attributes.

Andl - A New Database Language - andl.org

#29 · October 2, 2022, 2:39 pm

Which of the two do you think is the best option ?

Author of SIRA_PRISE

#30 · October 3, 2022, 5:05 am

Quote from AntC on October 3, 2022, 5:05 am

Quote from Erwin on October 1, 2022, 1:22 pm

Quote from AntC on October 1, 2022, 10:18 am

I think the position within TTM is that attribute names are not values, and vice versa.

I think that's a mischaracterization, and thinking a bit further, I think it must even necessarily be wrong.

We've circled round this issue plenty of times without coming to any definite conclusion. When it comes to (Tutorial) D I'll leave the question to Hugh. (IMO the ontology of a D is sufficiently unlike the programming languages I'm familiar with, I'd be guessing.)

If a system does not support attribute names as values of a type, then what type is going to be used in the catalog to document everything related to relvars and their names, attributes and their names, types and their names, operators and their names, etc. etc ?

I mentioned an Industrial D could give you the print-name of attributes/relvars as CHAR -- then that; or a type that is an alias for CHAR.

Note that we can't from within a D ask if two attributes have the same name; or specify to get the attribute-same-name-as ATTRIBUTE FROM <tuple> from some other tuple. No variables ranging over attribute names. Any query must give the attribute name as a (kinda) literal. If you're going to say each attribute name is a distinct (something); how could one column in the catalog hold all those possible (something)s?

So on the contrary, I think it is inevitable that attribute names must be dealt with (at least at times and in certain contexts) as values of some NAME type. It's just that there's no room for "arbitrary expressions of type NAME" in certain contexts where "literals of type NAME" are [and indeed must be] allowed. And if you must do it "at least at times and in certain contexts", then you can really just as well do it (treat names as values of a type) at the other times and in the other contexts too. It's not that doing so changes anything to the significance of "being a name".

("some NAME type" sounds rather hand-wavey.) Are you to allow variables whose value is some NAME type? Are you to allow that variable to appear in an expression in the syntactic position of an attribute name, expecting 'the compilation system' will substitute in its current value? Are you to allow an iteration that destructively assigns different NAME values to that variable then re-evaluate that expression?

I don't have answers/that's why I'm not proposing 'Expressively complete' includes that sort of ability. (I think I'm already allowing greater expressivity by allowing 'tuples/relations to stand for headings' in a project/remove. If you want a 'variable' to range over different attribute names: use a tuplevar; assign differently-headinged tuples to it; now quick what's the type of that tuplevar?)

@Dandl Specifically, a practical TTM/D might be programmed in Java, and access to attribute names as values might only be exposed as a Java API.

Yeah, my take (trying to generalise as to what might be the implementation language) is that attribute names are more like types than values; but they don't participate in the type system for scalar values. We have:

(scalar) values and variables;

(scalar) types -- both type names and variables to express polymorphism;

attribute names, that crucially _must_ appear same-named in both value-expressions and type-expressions -- in TUPLE{ PNO "P123", SNO "S456", QTY f(x + y) } :: TUPLE{ PNO CHAR, SNO CHAR, QTY INT }, the scalar types' names are determined from the expressions, the attribute names are not just determined, they must be exactly the same as used in the expression; so

I say attribute names are in a different namespace vs scalar values and types. (There'd be no systems requirement that attribute names be distinct from variable names or type names -- merely it would confuse the heck out of the human reader.)
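
A rough Python sketch of that asymmetry, assuming a tuple value is modelled as a dict and using a made-up `infer_heading` helper with an invented mapping from Python types to the type names CHAR and INT: the attribute names pass through untouched, and only the scalar type names are computed from the values.

```python
from typing import Any, Dict


def infer_heading(tuple_value: Dict[str, Any]) -> Dict[str, str]:
    """Derive a tuple type from a tuple value.

    Attribute names are copied verbatim; only the scalar type names are computed.
    """
    type_names = {str: "CHAR", int: "INT"}  # assumed mapping, for illustration only
    return {name: type_names[type(value)] for name, value in tuple_value.items()}


# The QTY value comes from an arbitrary expression; its name does not.
heading = infer_heading({"PNO": "P123", "SNO": "S456", "QTY": 2 + 3})
```

Nothing in the sketch ever evaluates an expression to obtain an attribute name, which is the sense in which the names sit outside the namespace of scalar values.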

Not all valid CHAR values will be valid attribute/relvar names.

Neither are all possible CHAR values valid as Supplier numbers nor Part numbers -- nor even as Part descriptions. So in the catalog for attribute names, I'd use an alias for CHAR.

Constructing relvar types in which attribute names are variables is not part of TTM, perhaps with good reason.

Agreed. I think the reason would be: we'd have to use some exotic form of typing (Dynamic types? Dependent typing? pump the concocted expression out to file and incrementally compile? ... ?) OO Pre 1. D shall permit compile time type checking.

TTM Forum

The Forum for Discussion about The Third Manifesto and Related Matters