The Forum for Discussion about The Third Manifesto and Related Matters

Please or Register to create posts and topics.

Update to Tutorial D, July 2020

PreviousPage 5 of 9Next
Quote from Dave Voorhis on July 17, 2020, 5:30 pm
Quote from Hugh on July 17, 2020, 1:44 pm
Quote from Dave Voorhis on July 16, 2020, 1:31 pm
Quote from Hugh on July 16, 2020, 11:56 am

Somewhere in the plethora of recent posts in this thread Dave Voorhis said that he removed WITH support on statements in Rel when we deleted it from Tutorial D.  But that deletion took place the day I started this thread and I had previously found that Rel lacked it.  That's why I said I had been encouraged by Rel's lack of support for it.

I beg to differ. It's at the top of page 122 of my paper copy of DTATRM, and roughly the middle of page 127 in DTATRM.pdf:

  • The <with> statement has been dropped and WITH expressions have been clarified.

I remember being a tad disappointed, and there might have been discussion about it on the forum at the time, but I removed it in accordance with DTATRM.

That <with> statement was merely a statement defining one or more local constants, as opposed to a WITH clause that appears as part of a statement.

Support for a WITH statement that defines one or more local constants is certainly implementable. Rel used to have one. I've made a note to put it back in.

And I would have thought its semantics are clear too:

WITH { <assigns> } : <statements>   // desugars to

BEGIN <assigns> ; <statements> END     // BEGIN/END needed to limit scope of introduced names

Except that those <assigns> targets are constants, not variables. In fact I liked Dave's putative CONSTANT v := e; earlier in this thread, as a more generally useful feature. (CONSTANT would also be useful to declare DEE, DUM, just in case anybody was in danger of treating those as relvars.)

Quote from Hugh on July 17, 2020, 1:54 pm
Quote from AntC on July 16, 2020, 12:20 pm
Quote from Hugh on July 16, 2020, 11:33 am
Quote from AntC on July 16, 2020, 6:17 am
Quote from Hugh on July 9, 2020, 11:22 am

Here's a response from Chris Date:

Here's a trivial example of the kind of thing I had in mind: 

EXTEND S :

{ WITH ( temp := ( SP MATCHING S ) WHERE QTY > 500 AND NOT ( PNO = 'P1' )  ) :

A := SUM ( temp , QTY ) ,

B := MAX ( temp , QTY ) ,

C := MIN ( temp , QTY ) ,

D := COUNT ( temp ) }

Obviously the definition of temp could be arbitrarily complex in practice.

Obviously, temp is in scope for each of the assigns, overriding any existing definition under normal block-scoping rules.

 

Ah, here's a maybe better example/reworking of Chris's:

EXTEND S :
{ WITH ( temp := ( SP MATCHING REL{ TUP{ S# S# } } ) WHERE QTY > 500 AND NOT ( PNO = 'P1' ) ) :
A := SUM ( temp , QTY ) ,
B := MAX ( temp , QTY ) ,
C := MIN ( temp , QTY ) ,
D := COUNT ( temp ) }
EXTEND S : { WITH ( temp := ( SP MATCHING REL{ TUP{ S# S# } } ) WHERE QTY > 500 AND NOT ( PNO = 'P1' ) ) : A := SUM ( temp , QTY ) , B := MAX ( temp , QTY ) , C := MIN ( temp , QTY ) , D := COUNT ( temp ) }
[snip]

( SP MATCHING REL{ TUP{ S# S# } } ) isn't a valid expression.  The second appearance of S# is undefined (unless a variable of that name is in scope).  You seem to be imagining that the second operand of a dyadic relational operator invocation can reference attributes of the first (a mistake I have made myself before now).  Try putting those operands the other way around if you don't see what I mean.

 

> You seem to be imagining that the second operand of a dyadic relational operator invocation can reference attributes of the first

Thanks Hugh but no, that wouldn't have achieved the logic I was trying for. (What seems to you would have been an identity operation on SP.) I want/expect the free S# to denote the attribute name in the outer S. I expected (didn't imagine) that in an 'open expression' within the body of EXTEND S, free names (i.e. S#) would get bound to attribute names introduced by the target of the EXTEND. That is, even if there were a variable of that name in scope, it would get shadowed by the attribute names introduced via S. Similarly in SUM( temp, QTY), apparently, Tutorial D expects that free names in the exp (second argument to SUM, and note that it can be an exp, not just an <attribute ref>) bind to attribute names introduced by the relexp (first argument).

The r.h. operand of MATCHING (or other dyadic relops) isn't bound by attribute names from the l.h. operand. If that MATCHING were a JOIN, I'd expect I could happily put those operands the other way round.

OK If what I showed doesn't work, how do I achieve the intent of 'recalculating' temp for each tuple in S that's getting EXTENDed? What Dave shows is perhaps the only way? That is

EXTEND S :
{ temp := ( SP MATCHING REL{ TUP{ S# S# } } ) WHERE QTY > 500 AND NOT ( PNO = 'P1' ) ,
A := SUM ( temp , QTY ) ,
B := MAX ( temp , QTY ) ,
C := MIN ( temp , QTY ) ,
D := COUNT ( temp )
} {ALL BUT temp}
EXTEND S : { temp := ( SP MATCHING REL{ TUP{ S# S# } } ) WHERE QTY > 500 AND NOT ( PNO = 'P1' ) , A := SUM ( temp , QTY ) , B := MAX ( temp , QTY ) , C := MIN ( temp , QTY ) , D := COUNT ( temp ) } {ALL BUT temp}
EXTEND S :
{ temp := ( SP MATCHING REL{ TUP{ S# S# } } ) WHERE QTY > 500 AND NOT ( PNO = 'P1' ) ,
  A := SUM ( temp , QTY ) ,
  B := MAX ( temp , QTY ) ,
  C := MIN ( temp , QTY ) ,
  D := COUNT ( temp ) 
} {ALL BUT temp}

Or will that not work either? Why not?

Where does the Tutorial D spec tell us which syntactic constructs introduce attribute names into scope; and what is the textual extent of that scope in the 'body' of the construct?

See the definition of <extend> on page 24 of the latest version at the TTM website.

Your revised example looks valid to me.  The second appearance of S# is an attribute reference denoting that value of that attribute in the "current tuple" of S, the relation being extended.  Such attribute references are also permitted in SUMMARIZE and WHERE.

Thanks Hugh but I'm now thoroughly confused. If my revised example is valid, why isn't the form I gave using WITH? I can't see anything on page 24 (in the new or the non-revised text) legislates against it. Let me put it again:

EXTEND S :
{ WITH ( temp := ( SP MATCHING REL{ TUP{ S# S# } } ) WHERE QTY > 500 AND NOT ( PNO = 'P1' ) ) :
  A := SUM ( temp , QTY ) ,
  B := MAX ( temp , QTY ) ,
  C := MIN ( temp , QTY ) ,
  D := COUNT ( temp ) }
Why does the presence of the WITH within the body of EXTEND make it invalid on rhs of the WITH-bound assignment to refer to attribute names coming from the relexp that the expression is extending?
Page 24 that you reference says "The individual <name intro>s in NIC [the <name intro commalist>] are evaluated in sequence as written. "
By my reading, that means evaluated before carrying out any of the attribute-assigns in the body of the EXTEND following the WITH prefix. Therefore any attribute names appearing in the NIC must be referencing the relexp-to-extend. And that seems sensible to notionally refer to 'the current tuple' from the relexp-to-extend.
(To contrast vs how page 24 doesn't specify it, and for obvious reasons: it might say don't evaluate the assigns within the WITH but substitute the rhs expression into the attribute-assigns whenever the <introduced name> appears. That would indeed risk name capture.)
Page 24 goes on to say "The individual <attribute assign>s are effectively executed in parallel. " (that text hasn't changed). And that's fine: per tuple being extended, evaluate the WITH-constants first, then execute the attribute-assigns. So the "in parallel" means this is not like a Multiple Assignment executed in sequence of appearance.
Indeed having now seen that "executed in parallel" I'm wondering if that semantics invalidates Dave's suggestion under some circumstances: suppose temp is an existing attribute; then the attribute-assigns to A, B, C, D can execute in parallel possibly using the 'old' value of temp (or some using the old, some using the new). At some point (not specificied) the 'old' value of temp gets overwritten by Dave's attribute-assign. Dave's code than projects it away, leaving no explanation for why the values assigned to A, B, C, D are now inconsistent.
Quote from dandl on July 17, 2020, 3:12 pm

Your revised example looks valid to me.  The second appearance of S# is an attribute reference denoting that value of that attribute in the "current tuple" of S, the relation being extended.  Such attribute references are also permitted in SUMMARIZE and WHERE.

This concept of 'current tuple' I find highly suspect.

That Hugh puts the term in scare quotes shows that D&D aren't very happy with it. It does smell too much of procedural/algorithmic/implementation concerns. Then chiefly I think we need some other term -- perhaps you could suggest one.

All we are supposed to know is that the attribute names actually appearing in the open expression are bound to certain values, and the result of the expression will be used for some purpose. If we have an open expression (for the S relation) such as STATUS*10, then the computation need only be done 3 times (there are only 3 unique values), so which is the 'current tuple' for these 3 occasions?

I think the cardinality of the type or of the actually appearing distinct values is not relevant. "computation need only be done 3 times" is surely even worse of an implementation/optimisation concern than the smell of 'current tuple'. In general there's some calculation to perform for each tuple, which might reference any/all of the attributes.

I much prefer a set-oriented view in which the entire calculation is performed simultaneously, rather than the iteration implied by 'current tuple'.

Not "simultaneously" surely. I think you mean potentially applying to each tuple in parallel; and atomically wrt the visible semantics in Tutorial D. That is, no expression can get to see a state of some relvar in which some tuples have been processed, some not. So the semantics is to apply the 'open expression' per tuple via map/reduce/filter/fold semantics -- note I'm not prescribing an algorithm there.

So instead of 'open expression': 'per-tuple mapping expression'? 'anonymous tuple function'?

RM Pre 6 prescribes tuple operators "analogous to ... [various] ... operators of the relational algebra". Perhaps we should turn that round to prescribe a D shall be able to apply tuple operators atomically to all the tuples of some relation, returning a new relation value. Hmm hmm attribute remove /{ALL BUT ...} might yield identical tuples, that need to get rationalised, in the reduce step of map-reduce.

Quote from AntC on July 18, 2020, 6:33 am

 

That Hugh puts the term in scare quotes shows that D&D aren't very happy with it. It does smell too much of procedural/algorithmic/implementation concerns. Then chiefly I think we need some other term -- perhaps you could suggest one.

 

I think the cardinality of the type or of the actually appearing distinct values is not relevant. "computation need only be done 3 times" is surely even worse of an implementation/optimisation concern than the smell of 'current tuple'. In general there's some calculation to perform for each tuple, which might reference any/all of the attributes.

 

 

Indeed.  That some kind of "relation.tuples.foreach(...)" must necessarily be in play is just inevitable *because* a relation is a set.  And what "comes out" of the "foreach" is precisely what "current tuple" refers to.

That some people, after yrs of being around, still fail to see that despite how reminiscent such a spec is of algorithmics, it is *not* a prescription of an algortihm, merely a prescription of which results must be obtained, is regrettable.

The stuff wrt "computation must only be done three times" reminds me of those people who claim there's redundancy in a design "because the number three occurs more than once".  The cost of determining the optimisation must be balanced against the savings to be had from finding it.

Quote from dandl on July 18, 2020, 12:55 am

I can't find mention of WITH clauses on statements in DTATRM, though I admit I gave up searching through instances of 'with' because my annoying PDF viewer ignores case distinctions and punctuation. I couldn't find it in the DTATRM grammar either, though I might be staring right at it and can't see it for looking.

Adobe Acrobat does a perfectly fine job of searching.

Of searching, it did.

My complaint was an inside wry quip to myself. I had Adobe Acrobat until an update made it consume vast amounts of CPU, spool up the fans and gobble battery. I uninstalled it and unless someone convinces me that it's all better now -- and that someone probably ought to be an Adobe employee, and as I don't know anyone at Adobe... Though now that I think about it, my other half Nikki has a friend whose husband works at Adobe, and I have a former student who (last I heard) worked at Adobe, so I do know people at Adobe, but...

...I digress.

There are 59 occurrences of WITH, and a reference to <with> statement on p127, nothing else I could find. In fact, I couldn't find any overview explanation of what it does, but the low level spec is on p101.

Yes, p101 gives the semantics of name introduction and p127 notes the removal of the <with> statement, but I couldn't find anything about a WITH clause on statements. But, I did ask if it was added later -- not that it would matter, because it was subsequently removed as was announced in this thread.

And so, yes, I find this on page 156 (PDF version; per the printed page number) of DBE:

<with statement body> ::= WITH ( <name intro commalist> ) : <statement body>

Let WSB be a <with statement body>, and let NIC and SB be the <name intro commalist> and the <statement body>, respectively, immediately contained in WSB. The individual <name intro>s in NIC are evaluated in sequence as written. By definition, each such <name intro> immediately contains an <introduced name> and an <exp>. Let NI be one of those <name intro>s, and let the <introduced name> and the <exp> immediately contained in NI be N and X, respectively. Then N denotes the value obtained by evaluating X, and it can appear subsequently in WSB wherever the expression (X)—i.e., X in parentheses—would be allowed.

The grammar rule also appears on page 168.

I couldn't find any examples using it, but I didn't look very hard.

I'm not sure why I didn't implement it, other than simply forgetting to do so. I'm sure I would have intended at the time to ask if this was simply bringing back <with> statements (which had been removed in DTATRM) and if not, was there a semantic difference -- and why couldn't <statement body> include a BEGIN ... END block, and where is a <previously defined statement body commalist> defined (it's referenced by <nonwith statement body>, referenced by <statement body>), and so forth -- but got distracted by something and didn't.

Anyway, no matter now, because it's been removed. I will implement the simple WITH statement (which, as I recall, could have a BEGIN ... END block) as a Rel extension and see if there are any issues. I don't recall that there were any when I first implemented it.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from AntC on July 18, 2020, 2:28 am
Quote from Dave Voorhis on July 17, 2020, 5:30 pm
Quote from Hugh on July 17, 2020, 1:44 pm
Quote from Dave Voorhis on July 16, 2020, 1:31 pm
Quote from Hugh on July 16, 2020, 11:56 am

Somewhere in the plethora of recent posts in this thread Dave Voorhis said that he removed WITH support on statements in Rel when we deleted it from Tutorial D.  But that deletion took place the day I started this thread and I had previously found that Rel lacked it.  That's why I said I had been encouraged by Rel's lack of support for it.

I beg to differ. It's at the top of page 122 of my paper copy of DTATRM, and roughly the middle of page 127 in DTATRM.pdf:

  • The <with> statement has been dropped and WITH expressions have been clarified.

I remember being a tad disappointed, and there might have been discussion about it on the forum at the time, but I removed it in accordance with DTATRM.

That <with> statement was merely a statement defining one or more local constants, as opposed to a WITH clause that appears as part of a statement.

Support for a WITH statement that defines one or more local constants is certainly implementable. Rel used to have one. I've made a note to put it back in.

And I would have thought its semantics are clear too:

WITH { <assigns> } : <statements>   // desugars to

BEGIN <assigns> ; <statements> END     // BEGIN/END needed to limit scope of introduced names

Except that those <assigns> targets are constants, not variables. In fact I liked Dave's putative CONSTANT v := e; earlier in this thread, as a more generally useful feature. (CONSTANT would also be useful to declare DEE, DUM, just in case anybody was in danger of treating those as relvars.)

Yes. A CONSTANT definition -- equivalent to VAR but immutable -- makes sense.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from dandl on July 18, 2020, 12:33 am
Quote from Dave Voorhis on July 17, 2020, 6:44 pm
Quote from dandl on July 17, 2020, 11:30 am

Where does the Tutorial D spec tell us which syntactic constructs introduce attribute names into scope; and what is the textual extent of that scope in the 'body' of the construct?

The spec might have benefited from a phrase along the lines of, "Unless otherwise specified, semantics and scope should be assumed to essentially be those of well-known, conventional imperative programming languages such as C, PL/I, and Pascal."

I've picked languages that were well-known at the time the initial Tutorial D specs were written, of course. Java was brand new or still soon-to-be, C# didn't exist yet, Python still too obscure to mention, JavaScript should never be mentioned except disparagingly, and so forth.

The "unless otherwise specified" means "open expressions", for which some description of lambda expressions -- which they effectively are -- might be helpful now, though at the time the first Tutorial D specs were written probably wouldn't generally have been any more familiar than "open expressions."

There was (and is) absolutely nothing on the planet quite like 'open expressions'. You can model your general rules for operators and blocks on Algol 60 or any of its descendants, but when you implemented open expressions you were Robinson Crusoe, first man on the island. Yes, Lisp had lambda functions and various languages since have appropriated that term (I dislike it, but we're stuck with it), but they have little in common. It's not a block, it has no local scope, it's not a closure, it has no well-defined arguments, it has no well-defined surrounding scope. It's a bit of a bastard.

This is one of the reasons university computer science departments love running a final-year "comparative paradigms" or other programming language comparison course/module: you can make one reasonably contentious statement like, "the <x> feature in language <y> is actually lambda expressions disguised" in the first ten minutes of the first class of week 1 and be guaranteed to get a full term's worth of student/teacher debate out of it. No other preparation needed.

So, yes, "open expressions" are and aren't lambda expressions. In Rel, they are lambda expressions internally, plus some magic to avoid requiring explicit parameter declarations and the like. It works a bit like this:

Imagine a typical WHERE -- like S WHERE STATUS = 20 -- implemented in a conventional modern imperative programming language like Java, C#, whatever. Assuming s is a language variable containing a relation value from the canonical 'S' relvar, it would probably be something like this:

s.where(tuple => tuple.STATUS == 20)

In other words, the variable s is an instance of a class (probably Relation) with a method where that accepts a lambda expression that returns a boolean and defines a single parameter called tuple that represents the current tuple (cue reference to, but not discussion about, another post in this thread about "current tuples".)

The where method iterates the tuples in s, passing each to the specified lambda function for evaluation. For each tuple passed in via tuple, it tests whether tuple's STATUS attribute is 20 or not, returning true if it is and false if not. The where method emits a new Relation instance containing tuples where the lambda function returned true.

Now we just do syntactic, but not semantic, tweakery. Remove the dot after s, eliminate the tuple parameter definition because it's assumed, eliminate the tuple parameter reference because it's also assumed, remove needless parens, and use a single = to represent an equality test. We get:

s where STATUS = 20

Employ Tutorial D case conventions:

S WHERE STATUS = 20

Oh, look, we're right where we started. It was a lambda expression all along. (A lambda in open expression's clothing, so to speak.)

Yes, I get that you can approach it that way. Yes, anonymous functions are sufficiently powerful that, coupled with a suitable tuple iterator, and some assumptions about scope they can be used to implement open expressions in a language that uses them. Just because you can rewrite a Java lambda as an OE doesn't mean that that the reverse holds and that open expressions are lambdas. In fact the equivalent construct in LINQ and Andl targeting an SQL back end does not use anything like them.

Assume you started from a different place. Perhaps you have a language with operations as per the RA and an underlying engine that does operations on relations (possibly in parallel) but does not give you an iterator. Perhaps the scalar computation is very expensive and you need to minimise its use, or do it in batches. Perhaps your backend is SQL. As long as you are fixated on open expressions and iterators, you close off possibilities.

Those are implementation concerns. There is nothing that requires an implementation to invoke the lambda expression via a tuple iterator, only that it must semantically treat the expression as being evaluated for each tuple.

Regardless how you actually implement them, as they appear in Tutorial D the semantics are lambda expressions, and the syntax is pared-down conventional lambda expressions.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from dandl on July 17, 2020, 3:12 pm

Your revised example looks valid to me.  The second appearance of S# is an attribute reference denoting that value of that attribute in the "current tuple" of S, the relation being extended.  Such attribute references are also permitted in SUMMARIZE and WHERE.

This concept of 'current tuple' I find highly suspect. All we are supposed to know is that the attribute names actually appearing in the open expression are bound to certain values, and the result of the expression will be used for some purpose. If we have an open expression (for the S relation) such as STATUS*10, then the computation need only be done 3 times (there are only 3 unique values), so which is the 'current tuple' for these 3 occasions?

I much prefer a set-oriented view in which the entire calculation is performed simultaneously, rather than the iteration implied by 'current tuple'.

Thanks, dandl.  I completely agree, and that's why I put the ugly term in quotes.  I've never worked out a decent way of expressing the construct, apart perhaps from using the Appendix A method (via JOIN).

I like your observation concerning the number of distinct results.

Hugh

Coauthor of The Third Manifesto and related books.
Quote from Dave Voorhis on July 17, 2020, 5:30 pm
Quote from Hugh on July 17, 2020, 1:44 pm
Quote from Dave Voorhis on July 16, 2020, 1:31 pm
Quote from Hugh on July 16, 2020, 11:56 am

Somewhere in the plethora of recent posts in this thread Dave Voorhis said that he removed WITH support on statements in Rel when we deleted it from Tutorial D.  But that deletion took place the day I started this thread and I had previously found that Rel lacked it.  That's why I said I had been encouraged by Rel's lack of support for it.

I beg to differ. It's at the top of page 122 of my paper copy of DTATRM, and roughly the middle of page 127 in DTATRM.pdf:

  • The <with> statement has been dropped and WITH expressions have been clarified.

I remember being a tad disappointed, and there might have been discussion about it on the forum at the time, but I removed it in accordance with DTATRM.

That <with> statement was merely a statement defining one or more local constants, as opposed to a WITH clause that appears as part of a statement.

The <with> statement that defined one or more local constants was what I thought you meant by "WITH clauses on statements."

I didn't know there was a WITH clause that appears as part of a statement, which would account for why I didn't implement it.

That statements with WITH clauses -- as opposed to a WITH statement -- ever existed comes as a surprise to me. It explains why the wording of the first post did seem a tad odd. I thought a WITH statement was what you meant.

Where was it defined and/or demonstrated?

I can't find mention of WITH clauses on statements in DTATRM, though I admit I gave up searching through instances of 'with' because my annoying PDF viewer ignores case distinctions and punctuation. I couldn't find it in the DTATRM grammar either, though I might be staring right at it and can't see it for looking.

Unless it was added later?

Perhaps we have been over-cautious on both counts.  If either or both of these constructs are successfully implemented (in Rel for example), I suppose we could reconsider.  I encourage you to have a go!

Support for a WITH statement that defines one or more local constants is certainly implementable. Rel used to have one. I've made a note to put it back in.

Sorry for lacking the energy to attempt an answer to your historical questions.

The effect of a standalone WITH statement can be obtained by using a WITH clause on a compound statement, clearly defining the scope of the introduced names.  I think I prefer that to a standalone WITH statement, where the scope is presumably the same as that of a local variable, and I assume that such a scope is the remaining portion of the block in which it appears (i.e., an introduced name isn't retrospectively in scope).  I suppose it could be argued that it's neat for a name not to be defined until it's really needed, but then the system should disallow a name that clashes with a name already defined in the same block.

Hugh

Hugh

Coauthor of The Third Manifesto and related books.
Quote from AntC on July 18, 2020, 2:59 am
Quote from Hugh on July 17, 2020, 1:54 pm
Quote from AntC on July 16, 2020, 12:20 pm
Quote from Hugh on July 16, 2020, 11:33 am
Quote from AntC on July 16, 2020, 6:17 am
Quote from Hugh on July 9, 2020, 11:22 am

Here's a response from Chris Date:

Here's a trivial example of the kind of thing I had in mind: 

EXTEND S :

{ WITH ( temp := ( SP MATCHING S ) WHERE QTY > 500 AND NOT ( PNO = 'P1' )  ) :

A := SUM ( temp , QTY ) ,

B := MAX ( temp , QTY ) ,

C := MIN ( temp , QTY ) ,

D := COUNT ( temp ) }

Obviously the definition of temp could be arbitrarily complex in practice.

Obviously, temp is in scope for each of the assigns, overriding any existing definition under normal block-scoping rules.

 

Ah, here's a maybe better example/reworking of Chris's:

EXTEND S :
{ WITH ( temp := ( SP MATCHING REL{ TUP{ S# S# } } ) WHERE QTY > 500 AND NOT ( PNO = 'P1' ) ) :
A := SUM ( temp , QTY ) ,
B := MAX ( temp , QTY ) ,
C := MIN ( temp , QTY ) ,
D := COUNT ( temp ) }
EXTEND S : { WITH ( temp := ( SP MATCHING REL{ TUP{ S# S# } } ) WHERE QTY > 500 AND NOT ( PNO = 'P1' ) ) : A := SUM ( temp , QTY ) , B := MAX ( temp , QTY ) , C := MIN ( temp , QTY ) , D := COUNT ( temp ) }
[snip]

( SP MATCHING REL{ TUP{ S# S# } } ) isn't a valid expression.  The second appearance of S# is undefined (unless a variable of that name is in scope).  You seem to be imagining that the second operand of a dyadic relational operator invocation can reference attributes of the first (a mistake I have made myself before now).  Try putting those operands the other way around if you don't see what I mean.

 

> You seem to be imagining that the second operand of a dyadic relational operator invocation can reference attributes of the first

Thanks Hugh but no, that wouldn't have achieved the logic I was trying for. (What seems to you would have been an identity operation on SP.) I want/expect the free S# to denote the attribute name in the outer S. I expected (didn't imagine) that in an 'open expression' within the body of EXTEND S, free names (i.e. S#) would get bound to attribute names introduced by the target of the EXTEND. That is, even if there were a variable of that name in scope, it would get shadowed by the attribute names introduced via S. Similarly in SUM( temp, QTY), apparently, Tutorial D expects that free names in the exp (second argument to SUM, and note that it can be an exp, not just an <attribute ref>) bind to attribute names introduced by the relexp (first argument).

The r.h. operand of MATCHING (or other dyadic relops) isn't bound by attribute names from the l.h. operand. If that MATCHING were a JOIN, I'd expect I could happily put those operands the other way round.

OK If what I showed doesn't work, how do I achieve the intent of 'recalculating' temp for each tuple in S that's getting EXTENDed? What Dave shows is perhaps the only way? That is

EXTEND S :
{ temp := ( SP MATCHING REL{ TUP{ S# S# } } ) WHERE QTY > 500 AND NOT ( PNO = 'P1' ) ,
A := SUM ( temp , QTY ) ,
B := MAX ( temp , QTY ) ,
C := MIN ( temp , QTY ) ,
D := COUNT ( temp )
} {ALL BUT temp}
EXTEND S : { temp := ( SP MATCHING REL{ TUP{ S# S# } } ) WHERE QTY > 500 AND NOT ( PNO = 'P1' ) , A := SUM ( temp , QTY ) , B := MAX ( temp , QTY ) , C := MIN ( temp , QTY ) , D := COUNT ( temp ) } {ALL BUT temp}
EXTEND S :
{ temp := ( SP MATCHING REL{ TUP{ S# S# } } ) WHERE QTY > 500 AND NOT ( PNO = 'P1' ) ,
  A := SUM ( temp , QTY ) ,
  B := MAX ( temp , QTY ) ,
  C := MIN ( temp , QTY ) ,
  D := COUNT ( temp ) 
} {ALL BUT temp}

Or will that not work either? Why not?

Where does the Tutorial D spec tell us which syntactic constructs introduce attribute names into scope; and what is the textual extent of that scope in the 'body' of the construct?

See the definition of <extend> on page 24 of the latest version at the TTM website.

Your revised example looks valid to me.  The second appearance of S# is an attribute reference denoting that value of that attribute in the "current tuple" of S, the relation being extended.  Such attribute references are also permitted in SUMMARIZE and WHERE.

Thanks Hugh but I'm now thoroughly confused. If my revised example is valid, why isn't the form I gave using WITH? I can't see anything on page 24 (in the new or the non-revised text) legislates against it. Let me put it again:

EXTEND S :
{ WITH ( temp := ( SP MATCHING REL{ TUP{ S# S# } } ) WHERE QTY > 500 AND NOT ( PNO = 'P1' ) ) :
  A := SUM ( temp , QTY ) ,
  B := MAX ( temp , QTY ) ,
  C := MIN ( temp , QTY ) ,
  D := COUNT ( temp ) }
Why does the presence of the WITH within the body of EXTEND make it invalid on rhs of the WITH-bound assignment to refer to attribute names coming from the relexp that the expression is extending?
Page 24 that you reference says "The individual <name intro>s in NIC [the <name intro commalist>] are evaluated in sequence as written. "
By my reading, that means evaluated before carrying out any of the attribute-assigns in the body of the EXTEND following the WITH prefix. Therefore any attribute names appearing in the NIC must be referencing the relexp-to-extend. And that seems sensible to notionally refer to 'the current tuple' from the relexp-to-extend.
(To contrast vs how page 24 doesn't specify it, and for obvious reasons: it might say don't evaluate the assigns within the WITH but substitute the rhs expression into the attribute-assigns whenever the <introduced name> appears. That would indeed risk name capture.)
Page 24 goes on to say "The individual <attribute assign>s are effectively executed in parallel. " (that text hasn't changed). And that's fine: per tuple being extended, evaluate the WITH-constants first, then execute the attribute-assigns. So the "in parallel" means this is not like a Multiple Assignment executed in sequence of appearance.
Indeed having now seen that "executed in parallel" I'm wondering if that semantics invalidates Dave's suggestion under some circumstances: suppose temp is an existing attribute; then the attribute-assigns to A, B, C, D can execute in parallel possibly using the 'old' value of temp (or some using the old, some using the new). At some point (not specificied) the 'old' value of temp gets overwritten by Dave's attribute-assign. Dave's code than projects it away, leaving no explanation for why the values assigned to A, B, C, D are now inconsistent.

Sorry AntC, my mistake, which I realised overnight and was going to admit to it today anyway.

Hugh

Coauthor of The Third Manifesto and related books.
PreviousPage 5 of 9Next