The Forum for Discussion about The Third Manifesto and Related Matters

Please or Register to create posts and topics.

A TTM Tuple is not a Record

PreviousPage 3 of 6Next
Quote from AntC on May 5, 2020, 10:26 am
Quote from Dave Voorhis on May 5, 2020, 8:22 am
Quote from dandl on May 5, 2020, 4:30 am
Quote from Dave Voorhis on May 4, 2020, 9:54 am
Quote from dandl on May 4, 2020, 1:19 am
Quote from Dave Voorhis on May 3, 2020, 3:30 pm
Quote from dandl on May 3, 2020, 2:49 pm
Quote from Dave Voorhis on May 3, 2020, 7:14 am

I didn't mean RM Pre 18. I meant OO Pre 1.

 

They have separate definitions. TupS is one type, TupP is a different type.

 

The point I wanted to make is while the target languages have various kinds of type compatibility (such as assignment compatibility, covariance and contra-variance), these are not TTM requirements. TTM has the mysterious observation that 'Tuple types TUPLE H1 and TUPLE H2 shall be equal if and only if H1 = H2', but type equality is not defined.

Headings are sets; sets of <A,T> ordered pairs. Two sets are equal just in case they have the same member elements. Comparing Attributes and Types to determine the equality of <A,T> is in the metalanguage, not within the semantics of a D . In implementation terms, I'd say that's in the type system, so the comparison might be static at compile time.

I can't disagree, but the relevance is not obvious. TTM does not allow two different tuple or relation types that have the same heading: for any given heading, there is exactly one corresponding tuple type and one relation type. TD allows them to be 'generated' in different ways, but they always generate the same type, named precisely 'TUPLE H' and 'RELATION H' respectively. That's not a complicated requirement, it seems to me.

Lumping all those together is resulting in mush. Perhaps those features of those languages are related historically. Gawdelpus if you're trying to include REDEFINES: what a crock of ordure. There's no core concept sufficiently in common between those examples to give me any idea what it is you're claiming a TTM tuple is not. I might say a Pascal record is not a Record.

As you like, but you didn't have too much difficulty figuring out which pile of mush I was talking about. Oh and BTW see JEP 359: https://openjdk.java.net/jeps/359 or https://dzone.com/articles/a-first-look-at-records-in-java-14. Looks like records to me.

All I'm saying is: TTM UDTs with components as per RM Pre 4 look a whole lot like class or record in Java, class or struct in C#, record in Delphi/Pascal and so on. TTM tuples don't look like that, so let's not confuse them. They're different.

Explaining yourself with C# code is again not conveying anything. I wouldn't know whether it's trivial or strange. If you have some important not-a-record concept, you should be able to explain it at the PossRep level comparable to TTM's Pres and Pros. A possible Pro might be: there shall be no way to access fields by position. In which case Haskell's datatypes and Java's new wotsit are already excluded.

As I tried to explain before, I'm using the facilities of a GP language to implement the features of TTM/D. The aim at this stage is to get stuff to actually work, and I think I can do that. However, there are several areas in TTM which concern themselves with saying what must not be, and it's not generally possible to enforce such prohibitions in the compiler without actually modifying the compiler.

So if I can get everything to work, I can then think about how and where to prohibit stuff (like at runtime, or a preprocessor, or a modified compiler).

But no, in this particular case, you can't access fields by position. You can get the value of attributes by name, as per RM Pre 5a. Ditto tuples in a relation, just the 'sole tuple' operator of RM Pre 7.

Andl - A New Database Language - andl.org
Quote from dandl on May 5, 2020, 2:39 pm
Quote from Dave Voorhis on May 5, 2020, 8:22 am
Quote from dandl on May 5, 2020, 4:30 am
Quote from Dave Voorhis on May 4, 2020, 9:54 am
Quote from dandl on May 4, 2020, 1:19 am
Quote from Dave Voorhis on May 3, 2020, 3:30 pm
Quote from dandl on May 3, 2020, 2:49 pm
Quote from Dave Voorhis on May 3, 2020, 7:14 am

I didn't mean RM Pre 18. I meant OO Pre 1.

But the solution I propose completely satisfies OO Pre 1, to exactly the same extent as C#. Every expression, variable, attribute, component, argument and operator in the set of types I described earlier is checked by the compiler for safety using the native features of the C# compiler. What do you think is  missing?

Given C# representations of tuple types T1 and tuple T2, where do you statically determine whether they are type-compatible or not?

They have separate definitions. TupS is one type, TupP is a different type.

What do you mean by 'compatible'? That's not a term used in TTM.

It's conventional terminology. Loosely, type-compatibility is the opposite of a type mismatch. In TTM terms, T1 is type-compatible with type T2 if values of T1 can be assigned to a variable or parameter of type T2.

The point I wanted to make is while the target languages have various kinds of type compatibility (such as assignment compatibility, covariance and contra-variance), these are not TTM requirements. TTM has the mysterious observation that 'Tuple types TUPLE H1 and TUPLE H2 shall be equal if and only if H1 = H2', but type equality is not defined.

My reading of TTM is that two types with the same name are the same type, and two types with different names are not. This is made explicitly clear for scalar types; for non-scalar types, since the names are precisely TUPLE H and RELATION H respectively, I read it as also true. [It happens to be the case that TD allows the same type to be declared with attributes given in a different order, but they are still the same type, not just 'compatible'.]

So the for D/GP restriction is explicitly:

  • every tuple type with heading H has the same name, because there is only one.
  • every relation type with heading H has the same name, because there is only one.

This is TTM compliant.

If TupS is one type and TupP is a different type, assuming TupS is the C# equivalent to

TUPLE {x INT, y CHAR}

TUPLE {x INT, y CHAR} and TupP is the C# equivalent to

TUPLE {y CHAR, x INT}

TUPLE {y CHAR, x INT}, are they type-compatible (as I would expect them to be per TTM's structural typing)?

Or are they type-incompatible (as I would expect them to be per C# nominative typing)?

Those are user-defined scalar types, not tuple types. The corresponding tuple type is:

I'm fairly sure TUPLE {x INT, y CHAR} and TUPLE {y CHAR, x INT} are tuple types -- and likewise, RELATION {x INT, y CHAR} is a relation type -- not scalar types. CHAR and INT are scalar types.

Sorry, I think there was a bad edit in there somewhere. What I meant is that those specific syntactic forms do not illustrate structural typing. They are examples of the way that TD allows the same type to be declared with the attributes ordered differently, but both of those examples are the same type. They have the same name ('TUPLE H' as per TTM), they are one and the same type. There is only ever one type with that heading, regardless of whether it is declared in code or arises as the result of a relational operation.

TTM Pre 9 implies structural typing. "A heading H is a set of ordered pairs or attributes of the form <A, T> ..." etc. Nominative typing would be "A heading H is a named set of ordered pairs or attributes of the form <A, T> distinguished from each other by name" or similar.

It's not a big deal; I'm just curious how you've dealt with this oft-mentioned distinction between conventional programming languages like C# and, say, Tutorial D where TUPLE {x INT, y CHAR} is (type-compatible with) TUPLE {y CHAR, x INT}. If tuple type (and similarly relation type) distinction or equivalence is maintained by policy rather than mechanism, then that's reasonable and in the spirit of TTM if not confirming to the letter of TTM Pre 9. I'm sure it could be enforced but probably only at run-time, though it might be do-able at compile-time with Java annotations. I'd have to play with it to see if annotations can be used that way.

TD also has user defined scalar types, in which types with the same components can have different names and are different types.

Sorry, I find this rather baffling. You keep showing trivial (and somewhat strange) C# code examples without explanation, as if they're self-explanatory. They aren't.

I thought my questions were simple, but your responses to them suggests a different interpretation of TTM from mine.

Again, I guess I'll have to see your code when it's done.

I've refined them a little, but they're not going to change a whole lot. This is the definition for the S tuple type.

public class TupS : TupleBase {
public readonly static string[] Heading = { "SNo", "Sname", "Status", "City" };
public string Sno { get { return (string)_values[0]; } }
public string Sname { get { return (string)_values[1]; } }
public int Status { get { return (int)_values[2]; } }
public string City { get { return (string)_values[3]; } }
public TupS(string Sno, string Sname, int Status, string City) : base(
new object[] { Sno, Sname, Status, City }) {
}
}
public class TupS : TupleBase { public readonly static string[] Heading = { "SNo", "Sname", "Status", "City" }; public string Sno { get { return (string)_values[0]; } } public string Sname { get { return (string)_values[1]; } } public int Status { get { return (int)_values[2]; } } public string City { get { return (string)_values[3]; } } public TupS(string Sno, string Sname, int Status, string City) : base( new object[] { Sno, Sname, Status, City }) { } }
public class TupS : TupleBase {
  public readonly static string[] Heading = { "SNo", "Sname", "Status", "City" };

  public string Sno { get { return (string)_values[0]; } }
  public string Sname { get { return (string)_values[1]; } }
  public int Status { get { return (int)_values[2]; } }
  public string City { get { return (string)_values[3]; } }

  public TupS(string Sno, string Sname, int Status, string City) : base(
    new object[] { Sno, Sname, Status, City }) {
  }
}

Here is the S relation type.

public class RelS : RelationBase<TupS> {
public RelS(IList<TupS> tuples) : base(tuples) {
}
}
public class RelS : RelationBase<TupS> { public RelS(IList<TupS> tuples) : base(tuples) { } }
public class RelS : RelationBase<TupS> {
  public RelS(IList<TupS> tuples) : base(tuples) {
  }
}

And here is some sample data and a simple example.

public static RelS S = new RelS(
new List<TupS> {
new TupS( "S1", "Smith", 20, "London" ),
new TupS( "S2", "Jones", 10, "Paris" ),
new TupS( "S3", "Blake", 30, "Paris" ),
new TupS( "S4", "Clark", 20, "London" ),
new TupS( "S5", "Adams", 30, "Athens" ),
});
Console.WriteLine(S.Select(t => t.Status < 20));
public static RelS S = new RelS( new List<TupS> { new TupS( "S1", "Smith", 20, "London" ), new TupS( "S2", "Jones", 10, "Paris" ), new TupS( "S3", "Blake", 30, "Paris" ), new TupS( "S4", "Clark", 20, "London" ), new TupS( "S5", "Adams", 30, "Athens" ), }); Console.WriteLine(S.Select(t => t.Status < 20));
public static RelS S = new RelS(
  new List<TupS> {
    new TupS( "S1", "Smith", 20, "London" ),
    new TupS( "S2", "Jones", 10, "Paris" ),
    new TupS( "S3", "Blake", 30, "Paris" ),
    new TupS( "S4", "Clark", 20, "London" ),
    new TupS( "S5", "Adams", 30, "Athens" ),
  });

  Console.WriteLine(S.Select(t => t.Status < 20));

Pretty much all of TTM except for RM Pre 18 is already there. I'm still working on that part (as well as codifying some rules).

This looks notionally similar to the Rel internals, though syntactically different and (for example) headings are declared programmatically by adding attributes to a Heading instance, but could certainly be used in a similar manner. I don't know that I would, though -- it's rather clunky.

Where do you declare (or infer) your attribute types?

Or is that implicit in the property definitions?

Is the _values array necessary?  Wouldn't it be cleaner to just declare properties and use reflection to iterate over their member variables?

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

It's not a big deal; I'm just curious how you've dealt with this oft-mentioned distinction between conventional programming languages like C# and, say, Tutorial D where TUPLE {x INT, y CHAR} is (type-compatible with) TUPLE {y CHAR, x INT}.

A fairly simple preprocessor can handle this by generating a public class/struct/record named Tuple_x_CHAR_y_INT, where the attribute names are in alphabetical order, and then changing all references to TUPLE types to refer to these classes.  When that's done, you have all the static typing you need, both for the tuples themselves and for their selectors.  In Scala/Kotlin, case/data classes provide some of the machinery for you.

Relation types can be handled exactly the same way, or can simply be wrapped arrays of Tuple_* class objects.

Quote from johnwcowan on May 5, 2020, 6:46 pm

It's not a big deal; I'm just curious how you've dealt with this oft-mentioned distinction between conventional programming languages like C# and, say, Tutorial D where TUPLE {x INT, y CHAR} is (type-compatible with) TUPLE {y CHAR, x INT}.

A fairly simple preprocessor can handle this by generating a public class/struct/record named Tuple_x_CHAR_y_INT, where the attribute names are in alphabetical order, and then changing all references to TUPLE types to refer to these classes.  When that's done, you have all the static typing you need, both for the tuples themselves and for their selectors.  In Scala/Kotlin, case/data classes provide some of the machinery for you.

Relation types can be handled exactly the same way, or can simply be wrapped arrays of Tuple_* class objects.

I tried that early on in the development of my Wrapd library, but names like Tuple_city_String_country_String_customerID_String_customerName_String_house_String_street_String_postcode_String were unwieldy even if rarely exposed, and this is a short example. I considered using "reference" classes with user-defined names to wrap generated classes names, or having the code generator track type equivalence and reference some internal tuple type class (which can be named anything), but that all looked to be heading in a direction more complicated than I want.

My goal is to create the simplest, minimal machinery that can make Java into a better data processing language. With better data processing taking precedence over adherence to the relational model or TTM pre/pro-scriptions (though retaining the intent of both, if that makes sense), tuple classes can be given user-provided names, tuple type equivalence per TTM's structural non-scalar typing isn't important, and it's more useful to integrate with the Java Streams API than create a strictly relational alternative.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from johnwcowan on May 5, 2020, 6:46 pm

It's not a big deal; I'm just curious how you've dealt with this oft-mentioned distinction between conventional programming languages like C# and, say, Tutorial D where TUPLE {x INT, y CHAR} is (type-compatible with) TUPLE {y CHAR, x INT}.

A fairly simple preprocessor can handle this by generating a public class/struct/record named Tuple_x_CHAR_y_INT, where the attribute names are in alphabetical order, and then changing all references to TUPLE types to refer to these classes.  When that's done, you have all the static typing you need, both for the tuples themselves and for their selectors.  In Scala/Kotlin, case/data classes provide some of the machinery for you.

Relation types can be handled exactly the same way, or can simply be wrapped arrays of Tuple_* class objects.

I'm pretty sure that isn't going to work beyond toy examples.

Take a query that wants attributes SNO and CITY, and is to be tolerant of other attributes in the relation it runs over. (That is, others that might be alphabetically before/between/after the two of interest.) Or even that takes CITY and some other attribute <parameter> to plot a geo-spread: where are our Suppliers, where are our Products, where are our Customers, ...? The <parameter> attribute's name might be alphabetically before or after CITY: Agency.

What's the heading/type for the 'input' to that query? Or do you say you can't give a type until you know which specific relvar it's applied to? Or do you say such a query can't be expressed?

I want to say such a query can be expressed as polymorphic; because I want to type-check the body of the query to stop it (for example) trying to access attributes it hasn't asked for in its signature, or treat them as numeric when the signature says they're at CHAR.

I also want to use this polymorphism instead of TTM's Selector structures/RM Pre 4 -- because Selectors need positional access; and Selectors have type-differentiable names, which cuts across the set-of-<A, T> semantics. Or perhaps I mean I want Selector structures to be implemented as non-positional TVAs, with an atomic name. Something like:

TYPE ALIAS Point = TUP{X RAT, Y RAT};  // ALIAS means not a distinct/new type, but a shorthand

TYPE ALIAS Circle = TUP{Rad RAT, Centre Point}  // Point TVA is nested inside Circle TVA

Or rather I want to get nearer structural typing than that:

TYPE ALIAS Point(r) = TUP{X RAT, T RAT | r}       // the r is a type parameter denoting 'other attributes', the | says 'unioned', not positionally appended

TYPE ALIAS Circle(r) = Point( {Rad RAT | r} )         // a Circle is a Point extended with a Rad and possibly more 'other attributes'

Note that that unlike the first definition, this Circle is 'flat'/the Point is not nested, it's extended. 'Applying' Point is something like macro expansion(?)

When we get to 'apply' that polymorphic type (or 'type schema') to some relvar, type inference firstly validates the heading includes Rad, X, Y and they're all at RAT; then figures out the 'other attributes' and completes r. Note the query might be returning a relation value with all the r attributes EXTEND with Area RAT. Then we need some type-level polymorphism mechanism to say r must not already include Area.

Then how do I do the "CITY and some other attribute <parameter>"? That's the bit I haven't figured out yet. The query wants a relation of type REL{ CITY CHAR | r }, I supply an argument {SNO INT | rr} (note there's no REL, TUP there) I 'apply' the REL{ ... } schema to the SNO argument, yielding REL{ CITY CHAR, SNO INT | rr}; I apply that to an actual relvar/relation value (presumably S/ possibly already filtered); the compiler fills in rr := {SNAME CHAR, STATUS INT} and away we go.

It becomes D-like, maybe. I'm afraid I find your descriptions of how you're making C# into a D more baffling with each post. Now it looks to me like you're treating D compliance as a sort of word game, where you achieve D compliance by choosing the right set of words to describe it rather than using programming language constructs to implement it.

Now you've lost me. What is 'D-like'? TTM says:

D: The Manifesto makes repeated reference to a hypothetical language it calls D. However, the name D is
merely a useful generic label; any language that conforms to the principles laid down in the Manifesto is a
valid D (and any language that fails so to conform is not a valid D).

I say that the proposal I put forward shows how a GP language can be made to largely conform to the principles laid down. What more would it need to do?

D-like would be meeting some pre/pro-scriptions but not others, but in the spirit of D if not the letter.

So that's a good thing. But I plan to satisfy all the Pre required features for D and its type system natively, within a GP language (currently C#). That's a better thing.

And I expect that it will be possible to meet all the Pre and Pro prohibitions by runtime checks and/or a preprocessor and/or modifying the compiler. That's a fully TTM compliant D.

[Note that the Pre requirements include a few that are really Pro, such as RM Pre 3a and d. They may have to go in the second category.]

I guess I'll have to wait to see the result, because the explanations aren't making sense to me.

As for efforts to create a new language, I see no reason to deprecate them, or to position efforts to turn an existing general-purpose language into a D or a D-like thing either above or below them. They're all equally worthwhile efforts, though like all choices in IT, each approach gains you some things and takes away others. Everything has its trade-offs.

I'm not deprecating, I'm saying that there have been several successful implementations and no-one wants to use them. If there is to be a D that others use, this is a way that might get there.

I get a reasonable number of Rel downloads and some forks on GitHub. I'm happy with what it's achieved, which is better than I expected when I started working on it.

But if you're talking about mass adoption, I think you need to consider what dramatic improvement a D of any kind will make over and above existing alternatives. Otherwise, it will only attract the usual niche audience of experimenters, hobbyists, and database obsessives like ourselves.

To make a market dent a D needs to make a dramatic difference. That's the problem with all the D implementations so far; they're perfectly adequate but none are disruptive. They don't make a big enough difference in day-to-day programming to unseat SQL or (facilities within) popular programming languages.

No, that isn't how it works. People take on new products and services that fill a pressing need or that solve a real problem. If you create the solution first and then go chase the problem you will likely fail. If you first find a real need/problem, then create something that will fill it/solve it, you have at least a chance to get to the next step, which is to figure out how to get people to try it and use it. That's how it works. Startups 101.

I think there is unmet need in performing SQL-like operations on non-SQL data. I am in the phase of experimenting with different possible solutions for that problem, and only that problem. What pressing need or problem are you solving?

By way of analogy, people aren't willing to give up their old family Ford when all we're offering is the same car with a different engine that delivers the same fuel economy as the old one, and the grill's a pretty shape.

True. But when they get a family and have to carry kids and bikes and football gear they have a new problem, and they solve that by getting rid of the old sedan and buying an SUV. And when Dad gets a promotion and a fat bonus and needs to show off some status he solves that problem by getting rid of the Ford and buying a BMW or a Mercedes.That's how it works.

 

 

Andl - A New Database Language - andl.org

TTM Pre 9 implies structural typing. "A heading H is a set of ordered pairs or attributes of the form <A, T> ..." etc. Nominative typing would be "A heading H is a named set of ordered pairs or attributes of the form <A, T> distinguished from each other by name" or similar.

I don't think so. TTM Pre 9 makes it clear that headings are a set of attributes, so the order of attributes does not matter, and since headings are not a type, they have no name.

But RM Pre 6 is the one that defines a tuple type, and that says "the name of that type shall be, precisely, TUPLE H." So a tuple type does have a name, and for any given heading H there is only that one tuple type.

So the issue of structural/nominative does not arise, because (a) the tuple type is named but (b) there is only one with that heading.

I thik TD confuses things by not making it clear that tuple types as defined by {X,Y}, {Y,X}, {X,Y,Z ALLBUT Z} and {X,Y,W} JOIN {X,Y,Z} are not 'structurally equivalent', they are all exactly the same unique type.

It's not a big deal; I'm just curious how you've dealt with this oft-mentioned distinction between conventional programming languages like C# and, say, Tutorial D where TUPLE {x INT, y CHAR} is (type-compatible with) TUPLE {y CHAR, x INT}. If tuple type (and similarly relation type) distinction or equivalence is maintained by policy rather than mechanism, then that's reasonable and in the spirit of TTM if not confirming to the letter of TTM Pre 9. I'm sure it could be enforced but probably only at run-time, though it might be do-able at compile-time with Java annotations. I'd have to play with it to see if annotations can be used that way.

The rules for defining a tuple type are easy:

  • a tuple type must define a unique heading
    • the heading is an array of strings, which enables generic RA machinery
    • there may not be two tuple types with the same heading
  • inherits from a base type (which ensures it behaves as value (equals() and hashcode()))
  • must declare getters, setters and a constructor that match the heading and comply with RM Pre 4,5,6
  • must be immutable
public static RelS S = new RelS(
  new List<TupS> {
    new TupS( "S1", "Smith", 20, "London" ),
    new TupS( "S2", "Jones", 10, "Paris" ),
    new TupS( "S3", "Blake", 30, "Paris" ),
    new TupS( "S4", "Clark", 20, "London" ),
    new TupS( "S5", "Adams", 30, "Athens" ),
  });

  Console.WriteLine(S.Select(t => t.Status < 20));

Pretty much all of TTM except for RM Pre 18 is already there. I'm still working on that part (as well as codifying some rules).

This looks notionally similar to the Rel internals, though syntactically different and (for example) headings are declared programmatically by adding attributes to a Heading instance, but could certainly be used in a similar manner. I don't know that I would, though -- it's rather clunky.

The code required to declare a tuple type is similar in volume to the declaration of a POCO. The getters and constructor are roughly the same; the heading is an extra bit of code; on the plus side, you don't have to code for Equals() and HashCode() because they get inherited. And like POCOs, they can easily be generated.

The code required to declare a relation type is trivial: just the class name and a typed constructor.

The clunky bit is that you have to declare every type. RA operators generate new types behind the scenes in TD/Rel, but in C# you have to declare every tuple type and every relation type up front. That's definitely something that would benefit from code generation.

There is an interesting problem in TTM, which I completely missed until now. The very prescriptive language in RM Pre 6 talks about 'defining...operators of that type'. Then it talks about 'the applicable operators shall include...'. (The references to 'RENAME,project,etc' are obviously mistakes left over from an earlier iteration, but no matter.)

The problem is "(b) an operator for extracting a specified attribute value from a specified tuple (the tuple in question might be required to be of degree one—see RM Prescription 9".  It's hard to see exactly what is intended here, or what will satisfy it, but I don't think as written it's what was intended. I think having getters is enough.

Where do you declare (or infer) your attribute types?

Or is that implicit in the property definitions?

Is the _values array necessary?  Wouldn't it be cleaner to just declare properties and use reflection to iterate over their member variables?

The attribute types are defined by the constructor, and (indirectly) by the getters. The constructor guarantees type safety on construction, which is important. Storing the values in an array is a mechanism I chose specifically to avoid reflection in the generic relation operators. But yes, you are correct, this could all be done using POCOs and reflection, and I may even go back to doing that.

The key insight is just this one: forget 'structural equivalence', this is about uniqueness. There shall be only one tuple type and one relation type defined for any given heading, where a heading is a set of strings (attribute names). The rest is just coding.

Andl - A New Database Language - andl.org
Quote from AntC on May 5, 2020, 11:10 pm
Quote from johnwcowan on May 5, 2020, 6:46 pm

It's not a big deal; I'm just curious how you've dealt with this oft-mentioned distinction between conventional programming languages like C# and, say, Tutorial D where TUPLE {x INT, y CHAR} is (type-compatible with) TUPLE {y CHAR, x INT}.

A fairly simple preprocessor can handle this by generating a public class/struct/record named Tuple_x_CHAR_y_INT, where the attribute names are in alphabetical order, and then changing all references to TUPLE types to refer to these classes.  When that's done, you have all the static typing you need, both for the tuples themselves and for their selectors.  In Scala/Kotlin, case/data classes provide some of the machinery for you.

Relation types can be handled exactly the same way, or can simply be wrapped arrays of Tuple_* class objects.

Take a query that wants attributes SNO and CITY, and is to be tolerant of other attributes in the relation it runs over. ... Or even that takes CITY and some other attribute <parameter> to plot a geo-spread: where are our Suppliers, where are our Products, where are our Customers, ...? The <parameter> attribute's name might be alphabetically before or after CITY: Agency.

What's the heading/type for the 'input' to that query? Or do you say you can't give a type until you know which specific relvar it's applied to? Or do you say such a query can't be expressed?

What I've sketched goes a long way beyond TTM's Pres and Pros. There are more things in type systems than are dreamt of in RM Pre 1, Horatio. We can perhaps say there are parts of the type system in the implementation/host language that are not visible in the D, but I think it would be hard to hermetically seal them off and I'm not going to try.

I want to say such a query can be expressed as polymorphic; because I want to type-check the body of the query to stop it (for example) trying to access attributes it hasn't asked for in its signature, or treat them as numeric when the signature says they're at CHAR.

I also want to use this polymorphism instead of TTM's Selector structures/RM Pre 4 -- because Selectors need positional access; and Selectors have type-differentiable names, which cuts across the set-of-<A, T> semantics. Or perhaps I mean I want Selector structures to be implemented as non-positional TVAs, with an atomic name. Something like:

TYPE ALIAS Point = TUP{X RAT, Y RAT};  // ALIAS means not a distinct/new type, but a shorthand

TYPE ALIAS Circle = TUP{Rad RAT, Centre Point}  // Point TVA is nested inside Circle TVA

Or rather I want to get nearer structural typing than that:

TYPE ALIAS Point(r) = TUP{X RAT, Y RAT | r}       // the r is a type parameter denoting 'other attributes', the | says 'unioned', not positionally appended

TYPE ALIAS Circle(r) = Point( {Rad RAT | r} )         // a Circle is a Point extended with a Rad and possibly more 'other attributes'

Note that that unlike the first definition, this Circle is 'flat'/the Point is not nested, it's extended. 'Applying' Point is something like macro expansion(?)

TUP{X RAT, Y RAT} is a TTM (nonscalar) type; RAT is a TTM scalar type. They're both inhabited by values, per RM Pre 1. Point(r) is not a TTM type because it's not inhabited by values -- at least not until there's an argument supplied for r. What to supply is not a TTM value; and it's not even a TTM type. I'll write that definition for Point more explicitly (so what I put previously was a shorthand):

TYPE ALIAS Point(r) = TUP( {X RAT, Y RAT} ∪ r )

  • Firstly, Point( ) is a parametric type, sometimes called a 'type scheme'.
  • {X RAT, Y RAT} is a Heading, notTTM tuple type. Headings are sets, subsets of sets are sets, so subsets of Headings are Headings. Headings are neither scalar types nor nonscalar types because they're not inhabited by values. We might say (ref RM Pre 1) there are (many) more types than omega that are not inhabited by values, and for which we cannot give an example value. Or we say Headings are not types, then some other term needed.
  • Then r is also a Heading; also not inhabited by values.
  • TUP( ) is a parametric type, in principle no different to Point( ). TUP( ) takes a Heading and returns a TTM (nonscalar) type -- i.e. inhabitable by values. TTM wants to call that a 'type generator'. Point( ) takes a Heading and returns a TTM (nonscalar) type, via a 'subroutine call' to TUP( ).
  • There's another category of uninhabited types in {X RAT, Y RAT}: RAT is a familiar, value-inhabited TTM scalar type. X, Y and the ordered-pairing of them denoted by juxtaposition in Tutorial D are neither scalars nor non-scalars (because not inhabitable) neither are they Headings, they're elements of Headings.
  • So the ordered pair type scheme < , > is a type generator; it takes an Attribute-name type (which is neither scalar nor nonscalar, not inhabitable) and a familiar TTM type (scalar or nonscalar) and returns an element of a Heading.
  • What I've arrived at is type-of-type categories, these are mutually exclusive:
    • scalar types;
    • nonscalar types;
    • Headings; -- sets of
    • <A, T> types; of which
    • A Attribute-name types.
    • In general, 'type generators' (or type-level functions) can take arguments of any type-of-type and return a result of any (not necessarily different) type-of-type. Static type-checking applies for Type generators just as much as for value-level functions; so type arguments must be well- type-of-typed.

A demo from Hugs/Trex (which uses different syntax vs Tutorial D). As well as equivalents to the Point( ), Circle( ) above :

type TColour  = (colour :: String)                    -- type ALIAS is type-of-type Heading; attrib names must be lower-case

myColouredCircle :: TCircle TColour                   -- type signature for the tuplevar; type-level 'apply' is by juxtaposition
myColouredCircle = (x = 1.2, colour = "Red", y = 3.4, radius = 27)      -- type-correct; note attributes in no particular order

Main> #colour myColouredCircle                        -- #colour is a function to get attribute colour from a tuple
====> "Red"

Going this far in Hugs/Trex is definitely pushing its boundaries as an 'experimental' feature. Type inference is wobbly. Nesting calls to 'type generators' is wobbly -- as with TCircle TColour = TPoint ((radius :: Float) | (colour :: String)). The type-of-type idea is not fully worked out; parametricity just gives up at (what currently seem like) arbitrary points.

Quote from johnwcowan on May 5, 2020, 6:46 pm

It's not a big deal; I'm just curious how you've dealt with this oft-mentioned distinction between conventional programming languages like C# and, say, Tutorial D where TUPLE {x INT, y CHAR} is (type-compatible with) TUPLE {y CHAR, x INT}.

A fairly simple preprocessor can handle this by generating a public class/struct/record named Tuple_x_CHAR_y_INT, where the attribute names are in alphabetical order, and then changing all references to TUPLE types to refer to these classes.  When that's done, you have all the static typing you need, both for the tuples themselves and for their selectors.  In Scala/Kotlin, case/data classes provide some of the machinery for you.

The exact class name and layout don't matter, the key thing is that there is a unique tuple type for each heading, one and only one.

Using named attributes prevents any kind of generic operations of the RA, unless you want to do it all with reflection. Whatever mechanism you choose, you need named access for getters and some relational operators, and generic access for relational operators. I chose 'array of object' for that purpose.

Relation types can be handled exactly the same way, or can simply be wrapped arrays of Tuple_* class objects.

You need more than that, because relation types also (a) have to be unique by heading (b) have to be value types. It's easy enough to do that in C# using a generic base class based on the tuple type.

Andl - A New Database Language - andl.org

What I've sketched goes a long way beyond TTM's Pres and Pros. There are more things in type systems than are dreamt of in RM Pre 1, Horatio. We can perhaps say there are parts of the type system in the implementation/host language that are not visible in the D, but I think it would be hard to hermetically seal them off and I'm not going to try.

I like big ambitions. I'm just not sure what the goal is.

I want to say such a query can be expressed as polymorphic; because I want to type-check the body of the query to stop it (for example) trying to access attributes it hasn't asked for in its signature, or treat them as numeric when the signature says they're at CHAR.

Agreed.

I also want to use this polymorphism instead of TTM's Selector structures/RM Pre 4 -- because Selectors need positional access; and Selectors have type-differentiable names, which cuts across the set-of-<A, T> semantics. Or perhaps I mean I want Selector structures to be implemented as non-positional TVAs, with an atomic name. Something like:

TYPE ALIAS Point = TUP{X RAT, Y RAT};  // ALIAS means not a distinct/new type, but a shorthand

TYPE ALIAS Circle = TUP{Rad RAT, Centre Point}  // Point TVA is nested inside Circle TVA

Or rather I want to get nearer structural typing than that:

TYPE ALIAS Point(r) = TUP{X RAT, Y RAT | r}       // the r is a type parameter denoting 'other attributes', the | says 'unioned', not positionally appended

TYPE ALIAS Circle(r) = Point( {Rad RAT | r} )         // a Circle is a Point extended with a Rad and possibly more 'other attributes'

Note that that unlike the first definition, this Circle is 'flat'/the Point is not nested, it's extended. 'Applying' Point is something like macro expansion(?)

TUP{X RAT, Y RAT} is a TTM (nonscalar) type; RAT is a TTM scalar type. They're both inhabited by values, per RM Pre 1. Point(r) is not a TTM type because it's not inhabited by values -- at least not until there's an argument supplied for r. What to supply is not a TTM value; and it's not even a TTM type. I'll write that definition for Point more explicitly (so what I put previously was a shorthand):

TYPE ALIAS Point(r) = TUP( {X RAT, Y RAT} ∪ r )

  • Firstly, Point( ) is a parametric type, sometimes called a 'type scheme'.
  • {X RAT, Y RAT} is a Heading, notTTM tuple type. Headings are sets, subsets of sets are sets, so subsets of Headings are Headings. Headings are neither scalar types nor nonscalar types because they're not inhabited by values. We might say (ref RM Pre 1) there are (many) more types than omega that are not inhabited by values, and for which we cannot give an example value. Or we say Headings are not types, then some other term needed.
  • Then r is also a Heading; also not inhabited by values.
  • TUP( ) is a parametric type, in principle no different to Point( ). TUP( ) takes a Heading and returns a TTM (nonscalar) type -- i.e. inhabitable by values. TTM wants to call that a 'type generator'. Point( ) takes a Heading and returns a TTM (nonscalar) type, via a 'subroutine call' to TUP( ).

I'm OK with all this. But I would point out that for any given heading, TTM RM Pre 6 will always generate the same (unique) type named 'TUPLE H'. Do you plan something different? Is there such a thing as 'type equality' other than name equality in Haskell?

  • There's another category of uninhabited types in {X RAT, Y RAT}: RAT is a familiar, value-inhabited TTM scalar type. X, Y and the ordered-pairing of them denoted by juxtaposition in Tutorial D are neither scalars nor non-scalars (because not inhabitable) neither are they Headings, they're elements of Headings.
  • So the ordered pair type scheme < , > is a type generator; it takes an Attribute-name type (which is neither scalar nor nonscalar, not inhabitable) and a familiar TTM type (scalar or nonscalar) and returns an element of a Heading.
  • What I've arrived at is type-of-type categories, these are mutually exclusive:
    • scalar types;
    • nonscalar types;
    • Headings; -- sets of
    • <A, T> types; of which
    • A Attribute-name types.
    • In general, 'type generators' (or type-level functions) can take arguments of any type-of-type and return a result of any (not necessarily different) type-of-type. Static type-checking applies for Type generators just as much as for value-level functions; so type arguments must be well- type-of-typed.

Those elements can all be identified in TTM and in TD, but the implementation is good old slogging compiler code. Do you get something better/different by doing it in Haskell?

 

Andl - A New Database Language - andl.org
PreviousPage 3 of 6Next