The type of <TCLOSE>

#21 · July 15, 2021, 12:56 pm

Quote from dandl on July 15, 2021, 12:56 pm

Quote from AntC on July 13, 2021, 1:37 am

Quote from Erwin on July 12, 2021, 9:00 pm

Quote from AntC on July 12, 2021, 9:42 am

Neither of us claimed something was "the same"; let alone "precisely" anything. It's Erwin introduced those words.

Is there any substantial issue here? Or are we merely disagreeing how far to stretch "the same" vs "precisely the same" vs "[not] precisely the same", with a side-issue of the capitalisation and asteriskisation for 'not'?

Every TCLOSE executes precisely the same algorithm, so it's generic rather than overloaded. Dandl on July 11, 2021, 10:12 am.

Neither @dandl nor @Erwin have defined 'generic' here. So all I'm left with is hand-waving. ("generics"/generic programming as a PLT term has too wide a range of uses to be helpful. I notice that wiki's intro para talks about both overloading and parametric types -- which is what I've appealed to. Do you(s) mean something more?)

The meaning of 'generic' in the context of OO languages is really not in doubt. Prior to C++ templates, the only way these languages could provide collection classes and similar was using inheritance. To do that, every object had to inherit from a root object, and type checking was limited. For C++ this answer was not viable, mainly because of value types.

So C++ template algorithms are written with placemarker types, stored in the compiler symbol table, and specialised to a particular type on use. C# adopted the same strategy for the same reasons, and eventually Java followed suit.

So the distinction is clear:

overloaded functions are authored explicitly for each type, and the compiler picks which one to use based on the type signature

generic functions are authored using type placemarkers, and the compiler substitutes the required type when used.

Every equality test or comparison for Char Strings executes the same algorithm of scanning from left to right. Does that make it generic?

Actually that isn't so, but no matter. In C++ there are both wide and narrow character types, so string is generic. In C# the equality test is overloaded with options such as national language and case sensitivity. Your point?

Equality tests in pointer-based languages tend to execute the same algorithm of testing pointers rather than the value, irrespective of the type of the value. Does that make that generic? Or not overloaded? Or Parametric? Or what?

No. Value types are tested one way, reference types another. Some reference types (like string) are tested by value. Overloaded, not generic.

Each TCLOSE implements a different equality test as part of its algorithm, depending on the type of the attribute. Doesn't that make it overloaded?

I've explained it as TCLOSE uses parametric typing in its recursion part and overloading for its attribute value-compare part. Calling some of that 'generic' doesn't add further explanation IMO.

If TCLOSE were expressed as a C# generic the signature might be TCLOSE<T>(T a, T b). The same algorithm is specialised as to type, at which point the proper equality test is picked. Perhaps this is Equals<T>(T a, T b) so there is a further round of specialisation. But why do you care: this is just implementation detail?

There is no hand-waving, nothing odd, all perfectly straightforward.

Quote from AntC on July 13, 2021, 1:37 am

Quote from Erwin on July 12, 2021, 9:00 pm

Quote from AntC on July 12, 2021, 9:42 am

Neither of us claimed something was "the same"; let alone "precisely" anything. It's Erwin introduced those words.

Is there any substantial issue here? Or are we merely disagreeing how far to stretch "the same" vs "precisely the same" vs "[not] precisely the same", with a side-issue of the capitalisation and asteriskisation for 'not'?

Every TCLOSE executes precisely the same algorithm, so it's generic rather than overloaded. Dandl on July 11, 2021, 10:12 am.

Neither @dandl nor @Erwin have defined 'generic' here. So all I'm left with is hand-waving. ("generics"/generic programming as a PLT term has too wide a range of uses to be helpful. I notice that wiki's intro para talks about both overloading and parametric types -- which is what I've appealed to. Do you(s) mean something more?)

The meaning of 'generic' in the context of OO languages is really not in doubt. Prior to C++ templates, the only way these languages could provide collection classes and similar was using inheritance. To do that, every object had to inherit from a root object, and type checking was limited. For C++ this answer was not viable, mainly because of value types.

So C++ template algorithms are written with placemarker types, stored in the compiler symbol table, and specialised to a particular type on use. C# adopted the same strategy for the same reasons, and eventually Java followed suit.

So the distinction is clear:

overloaded functions are authored explicitly for each type, and the compiler picks which one to use based on the type signature
generic functions are authored using type placemarkers, and the compiler substitutes the required type when used.

Every equality test or comparison for Char Strings executes the same algorithm of scanning from left to right. Does that make it generic?

Actually that isn't so, but no matter. In C++ there are both wide and narrow character types, so string is generic. In C# the equality test is overloaded with options such as national language and case sensitivity. Your point?

Equality tests in pointer-based languages tend to execute the same algorithm of testing pointers rather than the value, irrespective of the type of the value. Does that make that generic? Or not overloaded? Or Parametric? Or what?

No. Value types are tested one way, reference types another. Some reference types (like string) are tested by value. Overloaded, not generic.

Each TCLOSE implements a different equality test as part of its algorithm, depending on the type of the attribute. Doesn't that make it overloaded?

I've explained it as TCLOSE uses parametric typing in its recursion part and overloading for its attribute value-compare part. Calling some of that 'generic' doesn't add further explanation IMO.

If TCLOSE were expressed as a C# generic the signature might be TCLOSE<T>(T a, T b). The same algorithm is specialised as to type, at which point the proper equality test is picked. Perhaps this is Equals<T>(T a, T b) so there is a further round of specialisation. But why do you care: this is just implementation detail?

There is no hand-waving, nothing odd, all perfectly straightforward.

Andl - A New Database Language - andl.org

#22 · July 15, 2021, 7:02 pm

Quote from Erwin on July 15, 2021, 7:02 pm

Quote from AntC on July 14, 2021, 10:48 pm

Quote from Erwin on July 14, 2021, 1:07 pm

Quote from AntC on July 12, 2021, 9:42 am

The part where I think TTM doesn't stretch is the requirement that an eligible operand must have precisely two attributes; and that they have the same attribute type. Does the mechanism in Appendix A for specifying same attribute name-same attribute type-different relation operands (as needed for 'join compatible') stretch to different name-samte type-different relation operands (as needed for 'join compatible') stretch to different name-same attribute type within one heading? The example operator definition for TRANCLO in DTATRM Ch 6/discussion of RM Pre 19 has hard-coded literal attribute names and type RELATION { X P#, Y P# }.e attribute type within one heading? The example operator definition for TRANCLO in DTATRM Ch 6/discussion of RM Pre 19 has hard-coded literal attribute names and type RELATION { X P#, Y P# }.

Remember we're using 'named perspective'. If we're trying to specify something generic (by which I mean flexible as to attribute-name and attribute-type), we can't say 'compare first attribute's value to second attribute's'; there's no notion of position. Can we say 'the value in this tuple for the attribute we've parameterised as a compared to the value in that tuple for the attribute we've parameterised as b'?

If you're facing a transitive closure scenario where the identifiers are composite (say, a CHAR and an INT) and you get relations of degree > 2 (say, {P1 CHAR, P2 INT, C1 CHAR, C2 INT} that you do want to TCLOSE over, then you can do two things :

require the user to first mould this relation into a form that satisfies the "binary" and "same attribute type" requirements, by using WRAPs and RENAMEs to obtain, say, a relation {P TUP{X CHAR, Y INT}, C TUP{X CHAR, Y INT}}. But that's RENAME (P1->X P2->Y), WRAP (X,Y into P), RENAME again (C1->X C2->Y), WRAP again (X,Y into C). Tedious. And if the result of the TCLOSE must be exactly the same type as the input relation for some reason (e.g. because we only want to retain the tuples that are not in the input relation and so we need to do an additional MINUS), then there's yet another two UNWRAPs and RENAMEs to be added in the mix.

ditch the "binary" requirement and facilitate the user saying to the system which attributes are to be matched with which

SIRA_PRISE does the latter and the syntax for invoking TCLOSE thus also includes an <attr name matching list> component : TCLOSE ( R , (P1,C1,P2,C2) ). Telling the system that in the matching process, P1 is to be compared with C1 and P2 with C2, or iow, the equality test to be used is (P1,P2) pairs to (C1,C2) pairs.

Is my understanding correct that this is what your "does not stretch" is about

No. I'm talking about TCLOSE over a two-attribute relation, as spec'd in DTATRM.

and does this look like a solution ?

No, because you've given yourself essentially the same problem in a different form: in TCLOSE ( R , (P1,C1,P2,C2) ) you need:

the second argument to denote the whole heading of R;

P1 to have the same attribute type as C1, P2 as C2.

I don't know what you're referring to by "the mechanism in Appendix A for specifying ...".

See FORMAL DEFINITIONS -- for example, for <AND>.

I don't remember any "mechanism". I'll revisit.

I'd say "Let s = <TCLOSE> r. It is required that there exists some A1,A2,T such that Hr = {(A1,T), (A2,T)}".

Quote from AntC on July 14, 2021, 10:48 pm

Quote from Erwin on July 14, 2021, 1:07 pm

Quote from AntC on July 12, 2021, 9:42 am

The part where I think TTM doesn't stretch is the requirement that an eligible operand must have precisely two attributes; and that they have the same attribute type. Does the mechanism in Appendix A for specifying same attribute name-same attribute type-different relation operands (as needed for 'join compatible') stretch to different name-samte type-different relation operands (as needed for 'join compatible') stretch to different name-same attribute type within one heading? The example operator definition for TRANCLO in DTATRM Ch 6/discussion of RM Pre 19 has hard-coded literal attribute names and type RELATION { X P#, Y P# }.e attribute type within one heading? The example operator definition for TRANCLO in DTATRM Ch 6/discussion of RM Pre 19 has hard-coded literal attribute names and type RELATION { X P#, Y P# }.

Remember we're using 'named perspective'. If we're trying to specify something generic (by which I mean flexible as to attribute-name and attribute-type), we can't say 'compare first attribute's value to second attribute's'; there's no notion of position. Can we say 'the value in this tuple for the attribute we've parameterised as a compared to the value in that tuple for the attribute we've parameterised as b'?

If you're facing a transitive closure scenario where the identifiers are composite (say, a CHAR and an INT) and you get relations of degree > 2 (say, {P1 CHAR, P2 INT, C1 CHAR, C2 INT} that you do want to TCLOSE over, then you can do two things :

require the user to first mould this relation into a form that satisfies the "binary" and "same attribute type" requirements, by using WRAPs and RENAMEs to obtain, say, a relation {P TUP{X CHAR, Y INT}, C TUP{X CHAR, Y INT}}. But that's RENAME (P1->X P2->Y), WRAP (X,Y into P), RENAME again (C1->X C2->Y), WRAP again (X,Y into C). Tedious. And if the result of the TCLOSE must be exactly the same type as the input relation for some reason (e.g. because we only want to retain the tuples that are not in the input relation and so we need to do an additional MINUS), then there's yet another two UNWRAPs and RENAMEs to be added in the mix.

ditch the "binary" requirement and facilitate the user saying to the system which attributes are to be matched with which

SIRA_PRISE does the latter and the syntax for invoking TCLOSE thus also includes an <attr name matching list> component : TCLOSE ( R , (P1,C1,P2,C2) ). Telling the system that in the matching process, P1 is to be compared with C1 and P2 with C2, or iow, the equality test to be used is (P1,P2) pairs to (C1,C2) pairs.

Is my understanding correct that this is what your "does not stretch" is about

No. I'm talking about TCLOSE over a two-attribute relation, as spec'd in DTATRM.

and does this look like a solution ?

No, because you've given yourself essentially the same problem in a different form: in TCLOSE ( R , (P1,C1,P2,C2) ) you need:

the second argument to denote the whole heading of R;

P1 to have the same attribute type as C1, P2 as C2.

I don't know what you're referring to by "the mechanism in Appendix A for specifying ...".

See FORMAL DEFINITIONS -- for example, for <AND>.

I don't remember any "mechanism". I'll revisit.

I'd say "Let s = <TCLOSE> r. It is required that there exists some A1,A2,T such that Hr = {(A1,T), (A2,T)}".

Author of SIRA_PRISE

#23 · July 16, 2021, 12:46 am

No. I'm talking about TCLOSE over a two-attribute relation, as spec'd in DTATRM.

and does this look like a solution ?

No, because you've given yourself essentially the same problem in a different form: in TCLOSE ( R , (P1,C1,P2,C2) ) you need:

the second argument to denote the whole heading of R;

P1 to have the same attribute type as C1, P2 as C2.

I don't know what you're referring to by "the mechanism in Appendix A for specifying ...".

See FORMAL DEFINITIONS -- for example, for <AND>.

I don't remember any "mechanism". I'll revisit.

I'd say "Let s = <TCLOSE> r. It is required that there exists some A1,A2,T such that Hr = {(A1,T), (A2,T)}".

I think this generalises perfectly well to cases where the linking field is not a single attribute, provided they come in pairs of distinct types.

Or you could implement while, in which case you get to say exactly what you want.

Sooner or later you have to admit that TCLOSE is a degenerate form of fixed point recursion , and that there exist many queries that while or SQL CTE can express and TCLOSE cannot. The DTATRM discussion of RM VSS 6 (which TD does not implement) and the example given but not solved is just one of them.

Of if you don't like while, I have provided a formal definition of ETCLOSE which satisfies RM VSS 6 but falls short of SQL CTE.

Andl - A New Database Language - andl.org

TTM Forum

The Forum for Discussion about The Third Manifesto and Related Matters

The type of <TCLOSE>