The Forum for Discussion about The Third Manifesto and Related Matters

Please or Register to create posts and topics.

A TTM Tuple is not a Record

PreviousPage 5 of 6Next
Quote from dandl on May 7, 2020, 1:13 am
Quote from Dave Voorhis on May 6, 2020, 1:10 pm
Quote from dandl on May 6, 2020, 11:04 am
Quote from Dave Voorhis on May 6, 2020, 9:11 am
Quote from dandl on May 6, 2020, 8:44 am

I thik TD confuses things by not making it clear that tuple types as defined by {X,Y}, {Y,X}, {X,Y,Z ALLBUT Z} and {X,Y,W} JOIN {X,Y,Z} are not 'structurally equivalent', they are all exactly the same unique type.

"Structurally equivalent" and "the same unique type" are synonymous under structural typing.

Pardon? That's either meaningless or wrong. Or something.

No, it simply means that "same type" and "same structure" are the same thing.

They're not. Structural equivalence means two separate types with different names but the same structure can be used interchangeably: a value of one type can be assigned to the other, either can be compared with the other, either can be passed as a parameter, etc. Module-3 has named types and structural equivalence between them. Name equivalence means that is possible for two type with different names to be defined to be the same type (an alias). C/C++ have that. Most languages including Java, C# and TTM/D have no type equivalence, every type is unique by name.

Structural typing simply means type-compatibility is determined by structure rather than by name.

Agreed. As per Module-3, two different types are interchangeable if they have the same structure. Note: different types, same structure, always substitutable.

Tutorial D tuple/relation type-compatibility is determined structurally.

No, in TD two declared types are the same type if they have the same heading. Note: different declarations, same type (so nothing to substitute).

Java and C# are nominative typing; different type names mean different types even if the structure is the same.

Agreed. Two different types are interchangeable under the limited rules of assignment, covariance and contra-variance. Note: different types, limited substitutability based on inheritance, not on structure.

I'm surprised you find this so hard to understand. Perhaps it's the odd wording in TTM.

Likewise, I guess...

Yes, every value belongs to exactly one type, but two syntactically distinct headings -- which are provably syntactically distinct -- represent the same tuple/relation type or equivalent tuple/relation types. Again, these are synonyms. There is no semantic distinction.

They don't 'represent'. Those two syntactic forms define the same type by the same name. There is no type equivalence, because there are no separate types to be equivalent. Each tuple type is unique by heading.

No, the same heading applied to two different tuples means they have the same type.

Correct! The very same type.

I was being loose. More rigorously, whether they are the very same type due to the same structure (for some definition of "same structure"), or distinct in some other way but substitutable due to the same structure (for some definition of "same structure"), is dependent on the specific type system.

In general, structural typing is simply that equivalence or substitutability are determined by structure rather than name. Nominative typing is simply that equivalence or substitutability are determined by name rather than structure.

The original issue was how you are dealing with the C# equivalents of what in Tutorial D would be TUPLE {x INT, y CHAR} and TUPLE {y CHAR, x INT}. You've indicated that this doesn't come up, but it's not clear how it doesn't come up. This is commonly raised when using one of the popular programming languages as a D or the direct basis for a D. It's still not clear to me how you've dealt with that, and your answers so far seemed more evasive -- probably not intentionally -- than clarifying.

There is an interesting problem in TTM, which I completely missed until now. The very prescriptive language in RM Pre 6 talks about 'defining...operators of that type'. Then it talks about 'the applicable operators shall include...'. (The references to 'RENAME,project,etc' are obviously mistakes left over from an earlier iteration, but no matter.)

The problem is "(b) an operator for extracting a specified attribute value from a specified tuple (the tuple in question might be required to be of degree one—see RM Prescription 9".  It's hard to see exactly what is intended here, or what will satisfy it, but I don't think as written it's what was intended. I think having getters is enough.

In Tutorial D, that's the FROM operator.

I know, and in Andl it's a dot operator, but TD and Andl do not set the requirement. That's TTM's job (give or take the odd mistake). And this requirement is  problematic (as written).

Ah, you mean the "the tuple in question might be required to be of degree one..."?

I presume "an operator for extracting a specified attribute value from a specified tuple" isn't controversial.

It's still somewhat problematic. The form of words implies that (a) there is to be only one such operator for each tuple type and (b) it should take an argument to 'specify' which attribute value is required. The wording for retrieving a component of a scalar component is better in this respect. I fail to see why TD has such disparate forms for these two relatively similar kinds of operators, but AFAICT something akin to idiomatic 'getters' in the target language will do just fine.

Sure you're not reading more into it than what appears to have been intended?

The requirement seems simple and straightforward: you need a mechanism to retrieve an attribute value from a tuple. Somehow.

That isn't what it says. The form of words in RM Pre 5a is clear. The form in RM Pre 6 is very different, and in my view wrong. We both know what it ought to say, but it doesn't say that.

Sorry, I don't know what it ought to say but doesn't say, and I don't understand why you think it's wrong. It simply implies the reasonable possibility that the value of a particular tuple attribute may be obtained by prior projection (leaving a tuple of degree 1, and thus the desired attribute is unambiguous) rather than by name or position.

Thus, RM Pre 6 seems clear and no less clear than RM Pre 5a.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

The original issue was how you are dealing with the C# equivalents of what in Tutorial D would be TUPLE {x INT, y CHAR} and TUPLE {y CHAR, x INT}. You've indicated that this doesn't come up, but it's not clear how it doesn't come up. This is commonly raised when using one of the popular programming languages as a D or the direct basis for a D. It's still not clear to me how you've dealt with that, and your answers so far seemed more evasive -- probably not intentionally -- than clarifying.

What you've quoted is a syntactic form unique to that language. It does not exist in any other, and put like that it confuses the picture.

Each of these declarations creates a type with the name 'TUPLE H', where H is a set comprising the names x and y (and their corresponding types). In my C# declarations the heading is likewise a set: HashSet<string> { "x", "y" }. As per TTM there can be only one tuple type declared with this heading. The following is a valid declaration:

public class TupPoint : TupleBase {
  public readonly static string[] Heading = { "X", "Y" };
}

[The heading will be treated as a set internally.]

Andl - A New Database Language - andl.org
Quote from dandl on May 7, 2020, 11:29 am

The original issue was how you are dealing with the C# equivalents of what in Tutorial D would be TUPLE {x INT, y CHAR} and TUPLE {y CHAR, x INT}. You've indicated that this doesn't come up, but it's not clear how it doesn't come up. This is commonly raised when using one of the popular programming languages as a D or the direct basis for a D. It's still not clear to me how you've dealt with that, and your answers so far seemed more evasive -- probably not intentionally -- than clarifying.

What you've quoted is a syntactic form unique to that language. It does not exist in any other, and put like that it confuses the picture.

Yes, but it's syntax in a TTM-compliant language with TTM-conforming semantics, and as the lingua franca of TTM it makes sense to compare Tutorial D examples to other languages.

Are you essentially claiming that this sort of structural typing is unique to Tutorial D and not prescribed?

To me, it seems quite strongly implied by RM Pre 9.

Each of these declarations creates a type with the name 'TUPLE H', where H is a set comprising the names x and y (and their corresponding types). In my C# declarations the heading is likewise a set: HashSet<string> { "x", "y" }. As per TTM there can be only one tuple type declared with this heading. The following is a valid declaration:

public class TupPoint : TupleBase {
public readonly static string[] Heading = { "X", "Y" };
}
public class TupPoint : TupleBase { public readonly static string[] Heading = { "X", "Y" }; }
public class TupPoint : TupleBase {
  public readonly static string[] Heading = { "X", "Y" };
}

[The heading will be treated as a set internally.]

This precludes static type-checking at a C# level, doesn't it?

It certainly doesn't preclude type-checking before evaluating a relational expression, but then it's not statically checked per OO Pre 1.

Though I suppose an easy out is to simply state that heading compliance -- and thus relational expressions in general(?) -- isn't statically checked but other things are.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on May 7, 2020, 11:41 am
Quote from dandl on May 7, 2020, 11:29 am

The original issue was how you are dealing with the C# equivalents of what in Tutorial D would be TUPLE {x INT, y CHAR} and TUPLE {y CHAR, x INT}. You've indicated that this doesn't come up, but it's not clear how it doesn't come up. This is commonly raised when using one of the popular programming languages as a D or the direct basis for a D. It's still not clear to me how you've dealt with that, and your answers so far seemed more evasive -- probably not intentionally -- than clarifying.

What you've quoted is a syntactic form unique to that language. It does not exist in any other, and put like that it confuses the picture.

Yes, but it's syntax in a TTM-compliant language with TTM-conforming semantics, and as the lingua franca of TTM it makes sense to compare Tutorial D examples to other languages.

Are you essentially claiming that this sort of structural typing is unique to Tutorial D and not prescribed?

I'm saying that (a) this is not type equivalence of any kind and (b) the syntactic form for writing a heading as a set of attributes in any specific language is not prescribed.

This precludes static type-checking at a C# level, doesn't it?

It certainly doesn't preclude type-checking before evaluating a relational expression, but then it's not statically checked per OO Pre 1.

Though I suppose an easy out is to simply state that heading compliance -- and thus relational expressions in general(?) -- isn't statically checked but other things are.

What you get is strong type checking, to a point. Each relation type and tuple type is checked for correct usage and they will not produce type errors at run-time. There are no type errors for programs that follow the rules, and the rules are statically determined. But you cannot get the regular C# compiler to check things like:

  • the heading is unique (set-wise)
  • the names of any selectors or getters (if present) match the heading
  • the types of selectors and getters match
  • no two attributes have the same name but a different type
  • headings are compatible (eg for set operations).

So you either treat these as 'undefined behaviour', or if you want to enforce the rules you can do that at runtime, or you need a pre-processor or modified compiler. First I want to see if there is a plausible set of rules to follow and code that makes sense, before how I think about how to enforce it.

Andl - A New Database Language - andl.org
Quote from dandl on May 7, 2020, 2:01 pm
Quote from Dave Voorhis on May 7, 2020, 11:41 am
Quote from dandl on May 7, 2020, 11:29 am

The original issue was how you are dealing with the C# equivalents of what in Tutorial D would be TUPLE {x INT, y CHAR} and TUPLE {y CHAR, x INT}. You've indicated that this doesn't come up, but it's not clear how it doesn't come up. This is commonly raised when using one of the popular programming languages as a D or the direct basis for a D. It's still not clear to me how you've dealt with that, and your answers so far seemed more evasive -- probably not intentionally -- than clarifying.

What you've quoted is a syntactic form unique to that language. It does not exist in any other, and put like that it confuses the picture.

Yes, but it's syntax in a TTM-compliant language with TTM-conforming semantics, and as the lingua franca of TTM it makes sense to compare Tutorial D examples to other languages.

Are you essentially claiming that this sort of structural typing is unique to Tutorial D and not prescribed?

I'm saying that (a) this is not type equivalence of any kind and (b) the syntactic form for writing a heading as a set of attributes in any specific language is not prescribed.

The specific syntactic form is irrelevant -- I picked it for illustration -- but what I'm getting at is that TTM apparently considers the ability to do the following to be a good idea, per Tutorial D:

VAR t1 TUPLE {x INT, y CHAR};
VAR t2 TUPLE {y CHAR, x INT};
t1 := TUPLE {x 1, y 'blah'};
t2 := TUPLE {y 'blah', x 1};
WRITELN t1 = t2;

It compiles and runs and emits 'true', so it looks like the type declared for t1 is equivalent to or type-compatible with the type declared for t2. Whether they're the "same type" or not is irrelevant here; they might or might not be. Either way, they are unquestionably equivalent.

I've been trying to understand how you deal with this or whatever might be akin to it, or related to it -- if at all -- in C#.

But this answers it:

This precludes static type-checking at a C# level, doesn't it?

It certainly doesn't preclude type-checking before evaluating a relational expression, but then it's not statically checked per OO Pre 1.

Though I suppose an easy out is to simply state that heading compliance -- and thus relational expressions in general(?) -- isn't statically checked but other things are.

What you get is strong type checking, to a point. Each relation type and tuple type is checked for correct usage and they will not produce type errors at run-time. There are no type errors for programs that follow the rules, and the rules are statically determined. But you cannot get the regular C# compiler to check things like:

  • the heading is unique (set-wise)
  • the names of any selectors or getters (if present) match the heading
  • the types of selectors and getters match
  • no two attributes have the same name but a different type
  • headings are compatible (eg for set operations).

So you either treat these as 'undefined behaviour', or if you want to enforce the rules you can do that at runtime, or you need a pre-processor or modified compiler. First I want to see if there is a plausible set of rules to follow and code that makes sense, before how I think about how to enforce it.

Yes, this is the usual set of problems using C# or Java (or whatever popular language) to implement TTM. It wasn't clear whether you found a way around that or not.

That's why the typical solution is to create a "pre-processor" that accepts, say, Tutorial D or SIRA_PRISE or Andl or <insert D here> as input, tests for correctness, and passes the correct code to the host language environment for (possible compilation and) execution.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from dandl on May 7, 2020, 11:29 am

Each of these declarations creates a type with the name 'TUPLE H', where H is a set comprising the names x and y (and their corresponding types). In my C# declarations the heading is likewise a set: HashSet<string> { "x", "y" }. ...

To quote Master Yoda : that is why you fail.

Sets of (A) monothings are not the same thing as sets of (A,T) pairs.  TTM headings are the latter, what your sacred languages give you is only the former.  And that is precisely the PROBLEM, that is precisely the REASON why TTM was ever written in the first place.

FWIW, I'll add what the NonScalarTypeDeclaration class currently looks like in SIRA_PRISE (as a PDF because the allowed file types don't include any of .txt .java .c .sh and in fact almost .whatever).  Think of it what you want, but doing better, in the sense of the programmer who is using this thing achieving the same with much less code, will require EXACTLY what TTM says is needed : a new language.  A language that EMBRACES nonscalar types IN THE LANGUAGE SPEC, as opposed to what the existing set of sacred languages such as C# and Java currently do, leaving it to some programmer (possibly a library programmer) to get done at run-time what the TTM language compiler would do at compile time.  As an example, the code has some methods to determine, given two type declarations, the common subtype and the common supertype declaration.  Runs at run-time.  Despite everything, I'm confident you'll appreciate why someone wouldn't want that in certain specific circumstances.

Uploaded files:
Author of SIRA_PRISE
VAR t1 INT;
VAR t2 INT;
t1 := 7;
t2 := 4+3;
WRITELN t1 = t2;

It compiles and runs and emits 'true', because it's the very same value. The integer value 7 is not 'equivalent' to the integer value 7, it's the same value.

If you want to demonstrate equivalence, you have to show that two separate entities retain their separate identities but are in some sense interchangeable.

Yes, this is the usual set of problems using C# or Java (or whatever popular language) to implement TTM. It wasn't clear whether you found a way around that or not.

That's why the typical solution is to create a "pre-processor" that accepts, say, Tutorial D or SIRA_PRISE or Andl or <insert D here> as input, tests for correctness, and passes the correct code to the host language environment for (possible compilation and) execution.

No, it isn't. Whether this turns out to be a good idea or not, every line of code is only and ever bog-standard C#, with some rules to be followed.

  • You can write and debug it manually.
  • You can defer checking the rules until run-time.
  • You can create a pre-processor or modified compiler that would check the rules (basically for correct usage of classes that inherit from TupleBase or RelationBase).

To make life easier you might also:

  • Create a generator to output pre-written classes (using data as input, or by parsing the C# code, as is already being done)
  • Add a tweak to the language or the compiler to smooth over some bumps, as Java and C# have done for various new features along the way.

But no, I have no intention of creating a version of Andl that compiles to C#, or anything remotely like it.

Andl - A New Database Language - andl.org
Quote from Erwin on May 7, 2020, 7:06 pm
Quote from dandl on May 7, 2020, 11:29 am

Each of these declarations creates a type with the name 'TUPLE H', where H is a set comprising the names x and y (and their corresponding types). In my C# declarations the heading is likewise a set: HashSet<string> { "x", "y" }. ...

To quote Master Yoda : that is why you fail.

Sets of (A) monothings are not the same thing as sets of (A,T) pairs.  TTM headings are the latter, what your sacred languages give you is only the former.  And that is precisely the PROBLEM, that is precisely the REASON why TTM was ever written in the first place.

You're a bit slow, I wondered when someone would raise this. In point of fact, the wording of RM Pre 9 defines a heading that is a set of names, each of which is associated with a type. The set membership criterion is not the pair, it's the name only. The name A must be unique, the type T must be the type of that name. My proposal does exactly that: set of names, each name as associate type, as per RM Pre 9.

FWIW, I'll add what the NonScalarTypeDeclaration class currently looks like in SIRA_PRISE (as a PDF because the allowed file types don't include any of .txt .java .c .sh and in fact almost .whatever).  Think of it what you want, but doing better, in the sense of the programmer who is using this thing achieving the same with much less code, will require EXACTLY what TTM says is needed : a new language.  A language that EMBRACES nonscalar types IN THE LANGUAGE SPEC, as opposed to what the existing set of sacred languages such as C# and Java currently do, leaving it to some programmer (possibly a library programmer) to get done at run-time what the TTM language compiler would do at compile time.  As an example, the code has some methods to determine, given two type declarations, the common subtype and the common supertype declaration.  Runs at run-time.  Despite everything, I'm confident you'll appreciate why someone wouldn't want that in certain specific circumstances

I'm not entirely sure what point you're making but I probably agree. I am not investigating whether a GP language such as C# makes a really good D, only whether it's reasonably possible. Yes, the big practical problem in using C# is precisely that: the work involved in creating all those types: (a) properly-functioning value types (b) source and destination tuple and relation types (c) all the intermediate types that arise in the course of evaluating a relational expression. The work is reduced in C# by farming out some of it to struct, generics and inheritance, but it still leaves a lot for the programmer to do. The type-focussed approach in D is not an ideal match for C#.

But if it is possible, and I think it is, then it's worth thinking about how to reduce that load so that the amount of code you have to write in C# (or Java or whatever) is similar to those custom languages.

Andl - A New Database Language - andl.org
Quote from dandl on May 8, 2020, 2:08 am
Quote from Erwin on May 7, 2020, 7:06 pm
Quote from dandl on May 7, 2020, 11:29 am

Each of these declarations creates a type with the name 'TUPLE H', where H is a set comprising the names x and y (and their corresponding types). In my C# declarations the heading is likewise a set: HashSet<string> { "x", "y" }. ...

To quote Master Yoda : that is why you fail.

Sets of (A) monothings are not the same thing as sets of (A,T) pairs.  TTM headings are the latter, what your sacred languages give you is only the former.  And that is precisely the PROBLEM, that is precisely the REASON why TTM was ever written in the first place.

You're a bit slow, I wondered when someone would raise this. In point of fact, the wording of RM Pre 9 defines a heading that is a set of names, each of which is associated with a type. The set membership criterion is not the pair, it's the name only. The name A must be unique, the type T must be the type of that name. My proposal does exactly that: set of names, each name as associate type, as per RM Pre 9.

FWIW, I'll add what the NonScalarTypeDeclaration class currently looks like in SIRA_PRISE (as a PDF because the allowed file types don't include any of .txt .java .c .sh and in fact almost .whatever).  Think of it what you want, but doing better, in the sense of the programmer who is using this thing achieving the same with much less code, will require EXACTLY what TTM says is needed : a new language.  A language that EMBRACES nonscalar types IN THE LANGUAGE SPEC, as opposed to what the existing set of sacred languages such as C# and Java currently do, leaving it to some programmer (possibly a library programmer) to get done at run-time what the TTM language compiler would do at compile time.  As an example, the code has some methods to determine, given two type declarations, the common subtype and the common supertype declaration.  Runs at run-time.  Despite everything, I'm confident you'll appreciate why someone wouldn't want that in certain specific circumstances

I'm not entirely sure what point you're making but I probably agree. I am not investigating whether a GP language such as C# makes a really good D, only whether it's reasonably possible.

I would have thought Andl itself would be sufficient proof that it's possible. Presumably, the Andl parser is manipulating constructs in a C# environment in exactly the same way a programmer could natively. Strip off the parser, unplug the TTM non-IM type system and replace it with native C# types and the job's done. No?

I mean, that's essentially what Rel is, except Java instead of C#. Low-level programmatic mechanisms for implementing the relational model plus Java is...  It. How else would it be written?

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from dandl on May 8, 2020, 1:41 am
VAR t1 INT;
VAR t2 INT;
t1 := 7;
t2 := 4+3;
WRITELN t1 = t2;

It compiles and runs and emits 'true', because it's the very same value. The integer value 7 is not 'equivalent' to the integer value 7, it's the same value.

But is 7 and 7.0 the same value?

I didn't want to get into a debate about whether tuple definitions -- whether in Tutorial D or, per the topic, in C# -- were equivalent or the same, as I could easily imagine a C# system that, say, converts one C#-based tuple type into another equivalent-but-not-the-same C#-based tuple type. That would presumably be compliant.

Equivalence is a superset of identity -- it could be equivalent because tuple definitions are the same tuple definition or because they're different. It wouldn't matter either way, as equivalence is sufficient.

If you want to demonstrate equivalence, you have to show that two separate entities retain their separate identities but are in some sense interchangeable.

To show equivalence, I only have to show that the entities are interchangeable. Whether or not they are separate identities is orthogonal to equivalence.

Yes, this is the usual set of problems using C# or Java (or whatever popular language) to implement TTM. It wasn't clear whether you found a way around that or not.

That's why the typical solution is to create a "pre-processor" that accepts, say, Tutorial D or SIRA_PRISE or Andl or <insert D here> as input, tests for correctness, and passes the correct code to the host language environment for (possible compilation and) execution.

No, it isn't.

It is. A C#/Java-native implementation of the relational model doesn't have to be a transpiler that accepts Tutorial D or SIRA_PRISE or Andl or <insert D here> and emits Java or C# source code. It's entirely reasonable for it to be a standalone relational library for Java or C# that doesn't need any new static definitions (though even those can be dynamically compiled, at least in Java) if it supports something like this:

...
var heading = Heading.create()
   .attribute("S#", String.class)
   .attribute("NAME", String.class)
   .attribute("STATUS", Integer.class)
   .attribute("CITY", String.class);
var s = Relation.create(heading)
   .tuple().attribute("S#", "S1").attribute("NAME", "Smith").attribute("STATUS", 20).attribute("CITY", "London")
   .tuple().attribute("S#", "S2").attribute("NAME", "Jones").attribute("STATUS", 10).attribute("CITY", "Paris")
   .tuple().attribute("S#", "S3").attribute("NAME", "Blake").attribute("STATUS", 30).attribute("CITY", "Paris")
   .tuple().attribute("S#", "S4").attribute("NAME", "Clark").attribute("STATUS", 20).attribute("CITY", "London")
   .tuple().attribute("S#", "S5").attribute("NAME", "Adams").attribute("STATUS", 30).attribute("CITY", "Athens");
System.out.println(s.join(p));

Everything there can be done purely at runtime without needing to compile any static definitions.

Of course, all relational type (and other) validity checking would have to happen at runtime, as early as possible but obviously only on a per statement/expression basis.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
PreviousPage 5 of 6Next