The Forum for Discussion about The Third Manifesto and Related Matters

Clarification re local relvar and KEY

Quote from Erwin on June 3, 2020, 9:20 pm
Quote from Dave Voorhis on June 3, 2020, 3:38 pm

... changed from a mandatory KEY to an optional KEY where its absence is equivalent to KEY {ALL BUT}. I'm sure I had some good reason for not implementing it in Rel at the time but I don't recall what that might be.

Count of available hands vs. perceived immediately materialisable benefits, no doubt.

Possibly. I usually try to have better reasons for things than that, but as I don't recall the reason, I suppose it could be anything.

I've added it to my TODO list.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

Like much of Tutorial D, it's intended for illustration, so if behaviour isn't obvious by illustration or otherwise explicitly specified, it's up to the implementer.

I get that, but surely somewhere there was room for a hint? In any case, it seems a rather unuseful thing. And...

It serves the usual purpose of a KEY constraint, doesn't it?

And thus should exhibit the usual behaviour, no?

Well, maybe not quite. There is a broad expectation that updates to database relvars are bundled into transactions and a constraint violation of any kind might well not happen until the transaction is committed. This makes for a simple exception-free mechanism for detecting and coding to the possibility of errors.

But if an assignment to a local variable can trigger an immediate error, then the language implementor is forced to introduce an error-handling mechanism just to handle that case. Or alternatively, the assignment might fail silently. I'm just surprised this was included 'with the rations' but with no hint as to the intended purpose or behaviour.

Now that you mention it, I dimly recall that at some point in the evolution of Tutorial D it changed from a mandatory KEY to an optional KEY where its absence is equivalent to KEY {ALL BUT}. I'm sure I had some good reason for not implementing it in Rel at the time but I don't recall what that might be.

I like to break up complex multi-line expressions (or those that use a value twice) into assignments to throw-away local variables, to aid readability. I think mandatory 'PRIVATE' and 'KEY' detract from that.
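To make the concern about a KEY on a local relvar concrete, here is a minimal Python sketch, not taken from Rel or Andl. The names `LocalRelvar`, `KeyViolation` and `assign` are hypothetical, tuples are modelled as plain dicts, and the behaviour shown (check at assignment, raise on violation, keep the old value) is only one of the choices an implementer could make. An omitted key falls back to the whole heading, which mirrors the KEY {ALL BUT} default mentioned above and can never be violated by a genuine relation value.

```python
class KeyViolation(Exception):
    """Two distinct tuples agree on every key attribute."""


class LocalRelvar:
    """Hypothetical local relvar; tuples are dicts of attribute name -> value."""

    def __init__(self, heading, key=None):
        self.heading = frozenset(heading)
        # An omitted KEY defaults to the whole heading, i.e. KEY {ALL BUT}.
        self.key = frozenset(key) if key is not None else self.heading
        self.body = frozenset()

    def assign(self, tuples):
        # A relation body is a set, so exact duplicates collapse first.
        body = frozenset(frozenset(t.items()) for t in tuples)
        seen = set()
        for t in body:
            attrs = dict(t)
            if frozenset(attrs) != self.heading:
                raise TypeError("tuple does not match the heading")
            key_value = frozenset((a, attrs[a]) for a in self.key)
            if key_value in seen:
                raise KeyViolation(f"duplicate value for key {sorted(self.key)}")
            seen.add(key_value)
        self.body = body  # reached only when every check has passed


s = LocalRelvar({"S#", "CITY"}, key={"S#"})
s.assign([{"S#": "S1", "CITY": "London"}, {"S#": "S2", "CITY": "Paris"}])
try:
    s.assign([{"S#": "S1", "CITY": "London"}, {"S#": "S1", "CITY": "Athens"}])
    rejected = False
except KeyViolation:
    rejected = True  # the previous value of s is left untouched
```

The sketch needs an exception type purely to report the failed assignment, which is the cost being complained about here.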

I'm only concerned with headings and the implications of RM Pre 18. AFAICT, headings are mostly constructed from names and values, not types. Although RM Pre 9 (and other places) mentions declared attribute type, the types in question are mostly not declared, they are inferred from values as supplied. Two exceptions I can see: empty relations and as part of an operator declaration.

Headings turn up a lot in the RA, but as (attribute) names, not types. Think rename, project, join, union, etc. But again, where a new attribute is added (extend, summarize), the type is inferred from the value, not set by a declaration.

I'm not sure whether this leads anywhere, just thinking out loud.

Types appear to be inferred where possible; types are manifest where not.

Which sounds like a good general principle, but again I'm just focused on headings. When you drop the focus on 'declared type' you get something like:

  • A heading is a set of attribute names.
  • A tuple value has a heading; each (named) attribute has a value of some type.
  • A relation value has a tuple heading and a body, a set of tuples all matching the heading
  • Every relational operator has zero, one or two relation value arguments, zero or one heading arguments, and returns a relation value.
  • Each operator has its own rules for argument compatibility and returned heading inference.

So in this light the RENAME and PROJECT operators have a heading argument, being the attribute names mentioned in the syntactical form. EXTEND and WHERE likewise have a heading argument comprising the attributes mentioned in the computation (the actual computation is moved out into a relcon). And so on. The syntactical form disguises the headings, but if you simply take all the attribute names mentioned in some relational operator invocation, they comprise the heading argument for that operator.

The syntax could be simplified a lot if all the operators were put into the same consistent form.
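The bulleted model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: tuples are dicts, Python's own types stand in for attribute types, and `heading_of`, `inferred_types` and `relation` are hypothetical names. It also shows where inference runs out: with no tuples there is nothing to infer from, which is the empty-relation exception already noted.

```python
def heading_of(tup):
    """A heading is just the set of attribute names of a tuple (a dict)."""
    return frozenset(tup)


def inferred_types(tup):
    """Attribute types are not declared; they are read off the values."""
    return {name: type(value).__name__ for name, value in tup.items()}


def relation(tuples):
    """Build a relation value as (heading, body); all tuples must share one heading."""
    tuples = list(tuples)
    headings = {heading_of(t) for t in tuples}
    if len(headings) != 1:
        # Zero tuples: nothing to infer from, so a heading would have to be declared.
        raise ValueError("need at least one tuple, and all tuples must match one heading")
    body = frozenset(frozenset(t.items()) for t in tuples)
    return headings.pop(), body


heading, body = relation([
    {"S#": "S1", "STATUS": 20},
    {"S#": "S2", "STATUS": 10},
    {"S#": "S2", "STATUS": 10},   # duplicate, collapses in the body
])
types = inferred_types({"S#": "S1", "STATUS": 20})
```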

Andl - A New Database Language - andl.org
Quote from dandl on June 4, 2020, 1:49 am

Like much of Tutorial D, it's intended for illustration, so if behaviour isn't obvious by illustration or otherwise explicitly specified, it's up to the implementer.

I get that, but surely somewhere there was room for a hint? In any case, it seems a rather unuseful thing. And...

It serves the usual purpose of a KEY constraint, doesn't it?

And thus should exhibit the usual behaviour, no?

Well, maybe not quite. There is a broad expectation that updates to database relvars are bundled into transactions and a constraint violation of any kind might well not happen until the transaction is committed. This makes for a simple exception-free mechanism for detecting and coding to the possibility of errors.

But if an assignment to a local variable can trigger an immediate error, then the language implementor is forced to introduce an error-handling mechanism just to handle that case. Or alternatively, the assignment might fail silently. I'm just surprised this was included 'with the rations' but with no hint as to the intended purpose or behaviour.

It never came up as an example, and it's language-specific behaviour rather than conceptual or impactful on the general model, so from a TTM point of view, it's implementation-dependent.
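For contrast with an immediate check on assignment, the commit-time style described in the quoted text might look like the following Python sketch. It is an assumption-laden illustration, not how Rel behaves: `Transaction`, `key_holds`, and the dict-of-lists database are all hypothetical, and failure is reported through a return value rather than an exception.

```python
def key_holds(tuples, key):
    """True if no two distinct tuples agree on all key attributes."""
    body = {frozenset(t.items()) for t in tuples}
    key_values = {frozenset((a, dict(t)[a]) for a in key) for t in body}
    return len(key_values) == len(body)


class Transaction:
    """Hypothetical transaction: updates are buffered, constraints checked at commit."""

    def __init__(self, database, keys):
        self.database = database  # relvar name -> list of tuples (dicts)
        self.keys = keys          # relvar name -> set of key attribute names
        self.pending = {}

    def assign(self, name, tuples):
        self.pending[name] = list(tuples)  # nothing is checked here

    def commit(self):
        """Apply all pending updates or none; report success as a bool."""
        ok = all(key_holds(ts, self.keys[n]) for n, ts in self.pending.items())
        if ok:
            self.database.update(self.pending)
        self.pending.clear()
        return ok


db = {"S": []}
tx = Transaction(db, {"S": {"S#"}})
tx.assign("S", [{"S#": "S1", "CITY": "London"}, {"S#": "S1", "CITY": "Athens"}])
first_commit = tx.commit()   # key violated, database unchanged
tx.assign("S", [{"S#": "S1", "CITY": "London"}])
second_commit = tx.commit()  # key holds, update applied
```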

Now that you mention it, I dimly recall that at some point in the evolution of Tutorial D it changed from a mandatory KEY to an optional KEY where its absence is equivalent to KEY {ALL BUT}. I'm sure I had some good reason for not implementing it in Rel at the time but I don't recall what that might be.

I like to break up complex multi-line expressions (or those that use a value twice) into assignments to throw-away local variables, to aid readability. I think mandatory 'PRIVATE' and 'KEY' detract from that.

PRIVATE and KEY are notionally akin to modifiers. Assigning them to throw-away variables would be more like attempting to assign 'private' or 'static' to a C# variable to aid readability than assigning values to variables for later reference.

I'm only concerned with headings and the implications of RM Pre 18. AFAICT, headings are mostly constructed from names and values, not types. Although RM Pre 9 (and other places) mentions declared attribute type, the types in question are mostly not declared, they are inferred from values as supplied. Two exceptions I can see: empty relations and as part of an operator declaration.

Headings turn up a lot in the RA, but as (attribute) names, not types. Think rename, project, join, union, etc. But again, where a new attribute is added (extend, summarize), the type is inferred from the value, not set by a declaration.

I'm not sure whether this leads anywhere, just thinking out loud.

Types appear to be inferred where possible; types are manifest where not.

Which sounds like a good general principle, but again I'm just focused on headings. When you drop the focus on 'declared type' you get something like:

  • A heading is a set of attribute names.
  • A tuple value has a heading; each (named) attribute has a value of some type.
  • A relation value has a tuple heading and a body, a set of tuples all matching the heading
  • Every relational operator has zero, one or two relation value arguments, zero or one heading arguments, and returns a relation value.
  • Each operator has its own rules for argument compatibility and returned heading inference.

So in this light the RENAME and PROJECT operators have a heading argument, being the attribute names mentioned in the syntactical form. EXTEND and WHERE likewise have a heading argument comprising the attributes mentioned in the computation (the actual computation is moved out into a relcon). And so on. The syntactical form disguises the headings, but if you simply take all the attribute names mentioned in some relational operator invocation, they comprise the heading argument for that operator.

The syntax could be simplified a lot if all the operators were put into the same consistent form

I'm sure there are ways Tutorial D syntax could be simplified -- which we've discussed before, and some of which are implemented in Rel -- but each operator is quite different. PROJECT expects just an attribute name list. RENAME expects a specification of attribute name 'old name' / 'new name' pairs with wildcards. EXTEND expects a set of name / expression pairs. WHERE expects a boolean expression. Each makes implicit reference to the heading of its relation (or tuple) operand, but the operand isn't a heading -- the heading is part of the operand tuple or relation (type).

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from dandl on June 4, 2020, 1:49 am

The syntactical form disguises the headings, but if you simply take all the attribute names mentioned in some relational operator invocation, they comprise the heading argument for that operator.

The syntax could be simplified a lot if all the operators were put into the same consistent form

The only thing that allows you to keep spouting such nonsense is your willful denial/persistent ignorance of the fact that headings are ***MORE*** than just a set of attribute names.

The ***only*** universe in which your notions are valid is the one in which it is made ***IMPOSSIBLE*** for ***ANY USER ANYWHERE ANYTIME*** to introduce an attribute, say while using an EXTEND, if any other user in the same universe has introduced that same attribute name before with some other type. Consider what it would take to achieve that. I've considered going there and very quickly decided I wouldn't.

Quote from Erwin on June 4, 2020, 7:20 pm
Quote from dandl on June 4, 2020, 1:49 am

The syntactical form disguises the headings, but if you simply take all the attribute names mentioned in some relational operator invocation, they comprise the heading argument for that operator.

The syntax could be simplified a lot if all the operators were put into the same consistent form

The only thing that allows you to keep spouting such nonsense is your willful denial/persistent ignorance of the fact that headings are ***MORE*** than just a set of attribute names.

[By my reading, language such as this should attract a warning from our friendly moderator.]

So you really did miss the point, and why I came at it this way. The entire thrust of my argument is that headings should not be 'more than a set of attribute names.' Headings should be the way to define the set of names, but attribute types are almost always inferred from where and how the heading is used.

The ***only*** universe in which your notions are valid is the one in which it is made ***IMPOSSIBLE*** for ***ANY USER ANYWHERE ANYTIME*** to introduce an attribute, say while using an EXTEND, if any other user in the same universe has introduced that same attribute name before with some other type. Consider what it would take to achieve that. I've considered going there and very quickly decided I wouldn't.

Not at all. Values are always of some type, so a relation with a heading must also have a type for each attribute, usually inferred from the literal or EXTEND that created it. I'm only talking about a heading entity in isolation, which can be used for various purposes, including as an argument to an RA operator.
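One way to picture the two positions side by side is a UNION that compares names first and then the types inferred from the values actually present. This is a hedged Python sketch: `inferred_types` and `union` are hypothetical names, relations are lists of dicts, and Python type names stand in for attribute types. Two relations can share an attribute name and still be rejected because the inferred types differ.

```python
def inferred_types(tuples):
    """Map each attribute name to the set of type names seen among its values."""
    types = {}
    for t in tuples:
        for name, value in t.items():
            types.setdefault(name, set()).add(type(value).__name__)
    return types


def union(r1, r2):
    """UNION that checks attribute names, then the types inferred from the values."""
    t1, t2 = inferred_types(r1), inferred_types(r2)
    if t1.keys() != t2.keys():
        raise TypeError("headings differ")
    for name in t1:
        if t1[name] != t2[name]:
            raise TypeError(f"attribute {name}: {sorted(t1[name])} vs {sorted(t2[name])}")
    seen, out = set(), []
    for t in r1 + r2:
        k = frozenset(t.items())
        if k not in seen:  # a relation holds no duplicate tuples
            seen.add(k)
            out.append(t)
    return out


a = [{"S#": "S1", "STATUS": 20}]
b = [{"S#": "S2", "STATUS": 10}, {"S#": "S1", "STATUS": 20}]
c = [{"S#": "S3", "STATUS": "thirty"}]  # same names, different inferred type

ab = union(a, b)
try:
    union(a, c)
    mismatch_detected = False
except TypeError:
    mismatch_detected = True
```

Note that an empty operand gives the sketch nothing to infer from, so it is rejected as a heading mismatch; that case would need types from somewhere else.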

Andl - A New Database Language - andl.org

Which sounds like a good general principle, but again I'm just focused on headings. When you drop the focus on 'declared type' you get something like:

  • A heading is a set of attribute names.
  • A tuple value has a heading; each (named) attribute has a value of some type.
  • A relation value has a tuple heading and a body, a set of tuples all matching the heading
  • Every relational operator has zero, one or two relation value arguments, zero or one heading arguments, and returns a relation value.
  • Each operator has its own rules for argument compatibility and returned heading inference.

So in this light the RENAME and PROJECT operators have a heading argument, being the attribute names mentioned in the syntactical form. EXTEND and WHERE likewise have a heading argument comprising the attributes mentioned in the computation (the actual computation is moved out into a relcon). And so on. The syntactical form disguises the headings, but if you simply take all the attribute names mentioned in some relational operator invocation, they comprise the heading argument for that operator.

The syntax could be simplified a lot if all the operators were put into the same consistent form

I'm sure there are ways Tutorial D syntax could be simplified -- which we've discussed before, and some of which are implemented in Rel -- but each operator is quite different. PROJECT expects just an attribute name list. RENAME expects a specification of attribute name 'old name' / 'new name' pairs with wildcards. EXTEND expects a set of name / expression pairs. WHERE expects a boolean expression. Each makes implicit reference to the heading of its relation (or tuple) operand, but the operand isn't a heading -- the heading is part of the operand tuple or relation (type).

(a) this is not about syntax of one language (b) we haven't discussed this approach before (c) actually, they're all much the same if you do it this way.

Project takes a heading argument of attributes you want to keep; its parallel REMOVE takes a heading argument of those you want to discard.

S { S#, SNAME, STATUS }

PROJECT(S, {S#,SNAME,STATUS})

REMOVE(S, {CITY})
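A rough Python rendering of the function-call forms above, assuming relations are lists of dicts and headings are sets of names. The functions `project` and `remove` are illustrative only; the sample rows use the familiar suppliers data.

```python
def _dedupe(tuples):
    """Drop duplicate tuples while keeping first-seen order."""
    seen, out = set(), []
    for t in tuples:
        k = frozenset(t.items())
        if k not in seen:
            seen.add(k)
            out.append(t)
    return out


def _check(rel, heading):
    """Runtime check: every named attribute must exist in every tuple."""
    for t in rel:
        unknown = heading.difference(t)
        if unknown:
            raise KeyError(f"unknown attribute(s): {sorted(unknown)}")


def project(rel, heading):
    """Keep only the attributes named in heading."""
    heading = frozenset(heading)
    _check(rel, heading)
    return _dedupe([{a: t[a] for a in heading} for t in rel])


def remove(rel, heading):
    """Discard the attributes named in heading (project on ALL BUT heading)."""
    heading = frozenset(heading)
    _check(rel, heading)
    return _dedupe([{a: v for a, v in t.items() if a not in heading} for t in rel])


S = [
    {"S#": "S1", "SNAME": "Smith", "STATUS": 20, "CITY": "London"},
    {"S#": "S2", "SNAME": "Jones", "STATUS": 10, "CITY": "Paris"},
]
kept = project(S, {"S#", "SNAME", "STATUS"})
dropped = remove(S, {"CITY"})
cities = project(S + [{"S#": "S4", "SNAME": "Clark", "STATUS": 20, "CITY": "London"}], {"CITY"})
```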

RENAME takes a heading argument of old names and new names, and some way of telling which is which, or two headings (old and new). Say:

a RENAME ( X1 AS Y1 , X2 AS Y2)

RENAME(a, {X1,Y1}, {X2,Y2})
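In Python the pair form might be sketched as below. Because sets are unordered, each old/new pair is written as a tuple, which is one possible answer to "some way of telling which is which". The function `rename` is hypothetical, relations are lists of dicts, and wildcards are not attempted.

```python
def rename(rel, *pairs):
    """Rename attributes; each pair is (old_name, new_name)."""
    mapping = dict(pairs)
    out = []
    for t in rel:
        unknown = set(mapping).difference(t)
        if unknown:
            raise KeyError(f"unknown attribute(s): {sorted(unknown)}")
        renamed = {mapping.get(a, a): v for a, v in t.items()}
        if len(renamed) != len(t):
            raise ValueError("rename would produce duplicate attribute names")
        out.append(renamed)
    return out


a = [{"X1": 1, "X2": 2, "Z": 3}]
b = rename(a, ("X1", "Y1"), ("X2", "Y2"))
```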

EXTEND takes a heading argument of the arguments and return value of some function (or relcon if you prefer). It might look like this:

EXTEND S ADD ( 3 * STATUS AS TRIPLE ) // old syntax? (taken from p40 of DTATRM)

EXTEND(S, {STATUS,TRIPLE}, s => 3 * s)

EXTEND(S, {STATUS,TRIPLE}, triple) // triple is a unary function defined elsewhere
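As a sketch of the same idea in Python, assume the heading is an ordered list whose last name is the new attribute and whose earlier names are the function's arguments. That ordering convention is an assumption made here for illustration; the post itself does not say how arguments are told apart from the result. The function `extend` is hypothetical.

```python
def extend(rel, heading, fn):
    """Add one attribute; heading lists the argument names, then the result name."""
    *args, result = heading
    out = []
    for t in rel:
        if result in t:
            raise ValueError(f"attribute {result} already exists")
        out.append({**t, result: fn(*(t[a] for a in args))})
    return out


def triple(s):
    """A unary function defined elsewhere, as in the second form above."""
    return 3 * s


S = [{"S#": "S1", "STATUS": 20}, {"S#": "S2", "STATUS": 10}]
with_lambda = extend(S, ["STATUS", "TRIPLE"], lambda s: 3 * s)
with_named = extend(S, ["STATUS", "TRIPLE"], triple)
```

The type of TRIPLE is whatever the function returns, so nothing about it is declared in the call.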

WHERE likewise:

S := S WHERE NOT ( CITY = 'Athens' ) ;

S := WHERE(S, {CITY}, c => c != 'Athens')

S := S.WHERE({CITY}, c => c != 'Athens') // dotted syntax stacks better

var S = S.WHERE("CITY", c => c != "Athens") // C#
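The remark that the dotted form stacks better can be illustrated with a small Python wrapper whose operators each return a new relation, so calls chain left to right. The class `Rel` and its methods are hypothetical and are not the Andl or Rel API; tuples are dicts and headings are lists of names.

```python
class Rel:
    """Hypothetical relation wrapper; each operator returns a new Rel."""

    def __init__(self, tuples):
        self.tuples = list(tuples)

    def where(self, heading, pred):
        """Keep tuples for which pred, applied to the named attributes, is true."""
        names = list(heading)
        return Rel(t for t in self.tuples if pred(*(t[a] for a in names)))

    def remove(self, heading):
        """Discard the named attributes, then drop duplicate tuples."""
        drop = set(heading)
        kept = [{a: v for a, v in t.items() if a not in drop} for t in self.tuples]
        unique = {frozenset(t.items()): t for t in kept}
        return Rel(unique.values())


S = Rel([
    {"S#": "S1", "STATUS": 20, "CITY": "London"},
    {"S#": "S3", "STATUS": 30, "CITY": "Paris"},
    {"S#": "S5", "STATUS": 30, "CITY": "Athens"},
])
result = S.where(["CITY"], lambda c: c != "Athens").remove(["S#", "CITY"])
```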

Instead of having a hotch-potch of different forms and reserved words (which I for one can never remember), every operator in the RA has exactly the same form, with relation value and/or heading and/or function args.

I'm not planning on rewriting Andl (or Rel) to use this syntax. Rather, it shows how similar this can be to a conventional programming language such as C#/Java. With this syntax you can implement a D in any such language, using strings as headings and runtime checking. With a simple pre-processor you can do the compile-time checking and inference too.

Andl - A New Database Language - andl.org
Quote from dandl on June 5, 2020, 6:03 am

I must be missing some essential point, because (a) increasingly I realise I don't understand what you're suggesting or why it's a good thing, and (b) each of your examples seems to have gone from simple to peculiar. E.g., why does the C# 'WHERE' example need a string argument of "CITY"? Etc.

Also, RENAME should support wildcards. Don't forget the wildcards.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on June 5, 2020, 8:00 am
Quote from dandl on June 5, 2020, 6:03 am

I can only show you regular simplified syntax and how it maps to C#. I can't make you see it's a good thing.

The C# example expresses headings as strings because they're a convenient data type for the purpose. The example shows an anonymous function in lambda form acting as a relcon; the heading names the attributes. Here is a longer one.

var S = S.WHERE("CITY,STATUS", (a1,a2) => a1 != "Athens" && a2 >= 20 && a2 <= 30); // C#
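A Python sketch of the string-heading form with the runtime checking mentioned earlier: the heading string is split into names, the lambda's arity is compared with the number of names, and each name is checked against the tuples. The function `where` and the comma-separated format are assumptions based on the example above, not a published API; `inspect.signature` is the standard-library call used for the arity check.

```python
import inspect


def where(rel, heading, pred):
    """Filter rel; heading is a comma-separated string of attribute names."""
    names = [n.strip() for n in heading.split(",")]
    arity = len(inspect.signature(pred).parameters)
    if arity != len(names):
        raise TypeError(f"predicate takes {arity} argument(s), heading names {len(names)}")
    out = []
    for t in rel:
        missing = [n for n in names if n not in t]
        if missing:
            raise KeyError(f"unknown attribute(s): {missing}")
        if pred(*(t[n] for n in names)):
            out.append(t)
    return out


S = [
    {"S#": "S1", "STATUS": 20, "CITY": "London"},
    {"S#": "S2", "STATUS": 10, "CITY": "Paris"},
    {"S#": "S3", "STATUS": 30, "CITY": "Paris"},
    {"S#": "S5", "STATUS": 30, "CITY": "Athens"},
]
result = where(S, "CITY,STATUS", lambda a1, a2: a1 != "Athens" and 20 <= a2 <= 30)
```

The lambda never sees the tuple, only the values of the attributes the heading names, which is the relcon role described above.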

Andl - A New Database Language - andl.org
Quote from dandl on June 5, 2020, 2:33 pm
Quote from Dave Voorhis on June 5, 2020, 8:00 am
Quote from dandl on June 5, 2020, 6:03 am

I still don't get it.

Can't you make it something like this, assuming S has a static tuple structure?

var result = S.WHERE(t => t.City != "Athens" && t.Status >= 20 && t.Status <= 30)

And something like this if S has a dynamic (i.e., runtime loaded) tuple structure?

var result = S.WHERE(t => t.getString("City") != "Athens" && t.getInt("Status") >= 20 && t.getInt("Status") <= 30)
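For comparison, the two tuple-lambda variants suggested here could be sketched in Python as follows. `Supplier`, `DynTuple`, `get_string`, `get_int` and `where` are hypothetical names chosen to mirror the C# snippets; the static case uses a declared `typing.NamedTuple`, the dynamic case a wrapper with typed getters.

```python
from typing import NamedTuple


class Supplier(NamedTuple):
    """Static tuple structure: attribute names and types declared up front."""
    City: str
    Status: int


class DynTuple:
    """Dynamic tuple structure: attributes looked up by name at run time."""

    def __init__(self, values):
        self._values = dict(values)

    def get_string(self, name):
        value = self._values[name]
        if not isinstance(value, str):
            raise TypeError(f"{name} is not a string")
        return value

    def get_int(self, name):
        value = self._values[name]
        if not isinstance(value, int):
            raise TypeError(f"{name} is not an integer")
        return value


def where(rel, pred):
    """The predicate receives the whole tuple; no heading argument is passed."""
    return [t for t in rel if pred(t)]


rows = [("London", 20), ("Paris", 10), ("Paris", 30), ("Athens", 30)]
static_S = [Supplier(c, s) for c, s in rows]
dynamic_S = [DynTuple({"City": c, "Status": s}) for c, s in rows]

static_result = where(static_S, lambda t: t.City != "Athens" and 20 <= t.Status <= 30)
dynamic_result = where(
    dynamic_S,
    lambda t: t.get_string("City") != "Athens" and 20 <= t.get_int("Status") <= 30,
)
```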

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on June 5, 2020, 3:45 pm
Quote from dandl on June 5, 2020, 2:33 pm
Quote from Dave Voorhis on June 5, 2020, 8:00 am
Quote from dandl on June 5, 2020, 6:03 am

Which sounds like a good general principle, but again I'm just focused on headings. When you drop the focus on 'declared type' you get something like:

  • A heading is a set of attribute names.
  • A tuple value has a heading; each (named) attribute has a value of some type.
  • A relation value has a tuple heading and a body, a set of tuples all matching the heading
  • Every relational operator has zero, one or two relation value arguments, zero or one heading arguments, and returns a relation value.
  • Each operator has its own rules for argument compatibility and returned heading inference.

So on this light the RENAME and PROJECT operators have a heading argument, being the attribute names mentioned in the syntactical form. EXTEND and WHERE likewise have a heading argument comprising the attributes mentioned in the computation (the actual computation is moved out into a relcon). And so on. The syntactical form disguises the headings, but if you simply take all the attribute names mentioned in some relational operator invocation, they comprise the heading argument for that operator.

The syntax could be simplified a lot if all the operators were put into the same consistent form

I'm sure there are ways Tutorial D syntax could be simplified -- which we've discussed before, and some of which are implemented in Rel -- but each operator is quite different. PROJECT expects just an attribute name list. RENAME expects a specification of attribute name 'old name' / 'new name' pairs with wildcards. EXTEND expects a set of name / expression pairs. WHERE expects a boolean expression. Each makes implicit reference to the heading of its relation (or tuple) operand, but the operand isn't a heading -- the heading is part of the operand tuple or relation (type).

(a) this is not about syntax of one language (b) we haven't discussed this approach before (c) actually, they're all much the same if you do it this way.

Project takes a heading argument of attributes you want to keep; its parallel REMOVE takes a heading argument of those you want to discard.

S { S#, SNAME, STATUS }

PROJECT(S, {S#,SNAME,STATUS})

REMOVE(S, {CITY})

RENAME takes a heading argument of old names and new names, and some way of telling which is which, or two headings (old and new). Say:

a RENAME ( X1 AS Y1 , X2 AS Y2)

RENAME(a, {X1,Y1}, {X2,Y2})

EXTEND takes a heading argument of the arguments and return value of some function (or relcon if you prefer). It might look like this:

EXTEND S ADD ( 3 * STATUS AS TRIPLE ) // old syntax? (taken from p40 of DTATRM)

EXTEND(S, {STATUS,TRIPLE}, s => 3 * s)

EXTEND(S, {STATUS,TRIPLE}, triple) // triple is a unary function defined elsewhere

WHERE likewise:

S := S WHERE NOT ( CITY = 'Athens' ) ;

S := WHERE(S, {CITY}, c => c != 'Athens')

S := S.WHERE({CITY}, c => c != 'Athens') // dotted syntax stacks better

var S = S.WHERE("CITY", c => c != "Athens") // C#

Instead of having a hotch-potch of different forms and reserved words (which I for one can never remember), every operator in the RA has exactly the same form, with  relation value and/or heading and/or function args.

I'm not planning on rewriting Andl (or Rel) to use this syntax. Rather, it shows how similar this can be to a conventional programming language such as C#/Java. With this syntax you can implement a D in any such language, using strings as headings and runtime checking. With a simple pre-processor it can do the compile-time checking and inference too.
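To make "strings as headings and runtime checking" concrete, here is a minimal sketch in Java rather than C#. It is not Andl or RelNode code: the class `HeadingWhere`, the method `where`, and the representation of a relation as a list of maps are all assumptions made up for this illustration. The function argument receives only the values of the attributes named in the heading, in heading order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch, not Andl or RelNode code.
// A relation is a list of tuples; a tuple maps attribute names to values.
final class HeadingWhere {

    // Restriction driven by a heading string such as "CITY,STATUS".
    // The predicate receives only the named attribute values, in heading order.
    static List<Map<String, Object>> where(List<Map<String, Object>> relation,
                                           String heading,
                                           Predicate<Object[]> relcon) {
        String[] names = heading.trim().split("\\s*,\\s*");
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> tuple : relation) {
            Object[] args = new Object[names.length];
            for (int i = 0; i < names.length; i++) {
                // Runtime heading check. It is done per tuple here, so an empty
                // relation is never checked; a real library would store the heading.
                if (!tuple.containsKey(names[i])) {
                    throw new IllegalArgumentException("No attribute " + names[i]);
                }
                args[i] = tuple.get(names[i]);
            }
            if (relcon.test(args)) {
                result.add(tuple);
            }
        }
        return result;
    }

    public static void main(String[] unused) {
        List<Map<String, Object>> s = List.of(
            Map.of("S#", "S1", "CITY", "London", "STATUS", 20),
            Map.of("S#", "S2", "CITY", "Athens", "STATUS", 30));
        // Keeps suppliers outside Athens with STATUS between 20 and 30.
        System.out.println(where(s, "CITY,STATUS",
            v -> !v[0].equals("Athens") && (Integer) v[1] >= 20 && (Integer) v[1] <= 30));
    }
}
```

A misspelled attribute name in the heading only shows up when the call runs, which is the gap the pre-processor mentioned above would be meant to close.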

I must be missing some essential point, because (a) increasingly I realise I don't understand what you're suggesting or why it's a good thing, and (b) each of your examples seems to have gone from simple to peculiar. E.g., why does the C# 'WHERE' example need a string argument of "CITY"? Etc.

Also, RENAME should support wildcards. Don't forget the wildcards.

I can only show you regular simplified syntax and how it maps to C#. I can't make you see it's a good thing.

The C# example expresses headings as strings because they're a convenient data type for the purpose. The example shows an anonymous function in lambda form acting as a relcon; the heading names the attributes. Here is a longer one.

S = S.WHERE("CITY,STATUS", (a1, a2) => a1 != "Athens" && a2 >= 20 && a2 <= 30); // C#

I still don't get it.

Can't you make it something like this, assuming S has a static tuple structure?

var result = S.WHERE(t => t.City != "Athens" && t.Status >= 20 && t.Status <= 30)

And something like this if S has a dynamic (i.e., runtime loaded) tuple structure?

var result = S.WHERE(t => t.getString("City") != "Athens" && t.getInt("Status") >= 20 && t.getInt("Status") <= 30)

No. The whole point is: this implementation relies on headings and not types. The first implementation I did (RelValue) was strongly typed and relied on declared tuple and relation types. You had to declare all the tuple types with attribute names and types, and then provided the heading matched the declaration it was fully type safe. However, it turns out that (a) it's a real pain to have to declare all those types and (b) you have to declare extra 'phantom' types (and headings) if you want to do things like REMOVE. Yes, with this implementation you can write something exactly like your suggestions.

This second implementation (RelNode) is weakly typed. Every tuple value is the same type, you don't have to declare anything (for the RA). All the attribute values are strongly typed of course, but there are no individual tuple types. Heading inference is done and checked at runtime. There are no 'open expressions' in restrict, extend and aggregate; they are replaced by App-A relcons defined by headings.
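As a rough illustration of heading inference done and checked at runtime, the following Java sketch computes the result heading for REMOVE and RENAME from the operand heading. It is only a guess at the general idea, not how RelNode does it: `HeadingInference`, its methods, and the choice of a name-to-type map for a heading are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not RelNode code.
// A heading is a map from attribute name to attribute type.
final class HeadingInference {

    // REMOVE: result heading is the operand heading minus the named attributes.
    static Map<String, Class<?>> remove(Map<String, Class<?>> operand, String... drop) {
        Map<String, Class<?>> result = new LinkedHashMap<>(operand);
        for (String name : drop) {
            if (result.remove(name) == null) {
                throw new IllegalArgumentException("Cannot remove missing attribute " + name);
            }
        }
        return result;
    }

    // RENAME: the old name must exist and the new name must not clash.
    static Map<String, Class<?>> rename(Map<String, Class<?>> operand,
                                        String oldName, String newName) {
        if (!operand.containsKey(oldName)) {
            throw new IllegalArgumentException("No attribute " + oldName);
        }
        if (operand.containsKey(newName)) {
            throw new IllegalArgumentException("Attribute already exists: " + newName);
        }
        Map<String, Class<?>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Class<?>> e : operand.entrySet()) {
            // The attribute keeps its type; only the name changes.
            result.put(e.getKey().equals(oldName) ? newName : e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] unused) {
        Map<String, Class<?>> s = new LinkedHashMap<>();
        s.put("S#", String.class);
        s.put("STATUS", Integer.class);
        s.put("CITY", String.class);
        System.out.println(remove(s, "CITY").keySet());
        System.out.println(rename(s, "CITY", "TOWN").keySet());
    }
}
```

No per-relation tuple type is declared anywhere in this sketch; the result heading of each operator is derived from its operand heading when the call runs.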

Yes, I'm still working on how best to implement the anonymous functions that represent relcons. Your suggestions are not possible, even in principle. The code you see in the example below uses casts, but please note: this is defining a relcon, not an 'open expression'. The supplied function sees the heading tuple, not the raw tuple from the outer scope. It's all about the headings.
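The difference between a relcon defined by a heading and an open expression can be sketched in Java for an EXTEND of the form EXTEND(S, {STATUS,TRIPLE}, s => 3 * s). This is an illustration only and not RelNode code: `HeadingExtend` and `extend` are made-up names, and treating the last name in the heading as the result attribute is an assumption. The supplied function sees just the argument values named in the heading, positionally, which is why it needs casts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not RelNode code.
final class HeadingExtend {

    // Heading "A1,...,An,R": A1..An are the function's arguments and R names
    // the attribute that receives the result (assumed convention).
    static List<Map<String, Object>> extend(List<Map<String, Object>> relation,
                                            String heading,
                                            Function<Object[], Object> relcon) {
        String[] names = heading.trim().split("\\s*,\\s*");
        int argCount = names.length - 1;
        String target = names[argCount];
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> tuple : relation) {
            Object[] args = new Object[argCount];
            for (int i = 0; i < argCount; i++) {
                if (!tuple.containsKey(names[i])) {
                    throw new IllegalArgumentException("No attribute " + names[i]);
                }
                args[i] = tuple.get(names[i]);
            }
            // The function never sees the tuple itself, only the heading values.
            Map<String, Object> extended = new LinkedHashMap<>(tuple);
            // put() overwrites, so the target may also be an existing attribute.
            extended.put(target, relcon.apply(args));
            result.add(extended);
        }
        return result;
    }

    public static void main(String[] unused) {
        List<Map<String, Object>> s = List.of(
            Map.of("S#", "S1", "STATUS", 20),
            Map.of("S#", "S2", "STATUS", 10));
        System.out.println(extend(s, "STATUS,TRIPLE", v -> 3 * (Integer) v[0]));
    }
}
```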

There is a near-perfect mapping from TD as shown above, based on using headings as operator arguments. This is a full D (as per the Pre requirements), but is poor at picking up errors at compile time. For that, it needs a pre-processor. The nice thing is that it would output high quality debuggable C# (or Java, or anything you like).

I don't have a good sample right now, but this example of GTC gives the flavour. Please note that this is a complete sample. It will compile and run exactly as shown here, with the RelNode library. It is type safe, but it will throw runtime errors if the headings are wrong. You can see the direct correspondence with the code in DTATRM.

var mmqi = RelNode.Import(SourceKind.Csv, ".", "MMQ", "MajorPNo:text,MinorPNo:text,Qty:number");
var mmexp = mmqi
  .Rename("Qty,ExpQty")
  .While(TupWhile.F(tw => tw
    .Rename("MinorPNo,zmatch")
    .Compose(mmqi.Rename("MajorPNo,zmatch"))
    .Extend("Qty,ExpQty,ExpQty", TupExtend.F(v => (decimal)v[0] * (decimal)v[1]))
    .Remove("Qty")));
var mmagg = mmexp
  .Aggregate("ExpQty,TotQty", TupAggregate.F((v, a) => (decimal)v + (decimal)a));

 

Andl - A New Database Language - andl.org