Correlated subqueries

#51 · October 29, 2020, 1:49 pm

Quote from dandl on October 29, 2020, 1:49 pm

I don't see it that way. There is absolutely no point, internally or externally, where a sorted version of the relation comes into existence. Conceptually, the only thing added to the data model is an ordering function. The expression 'tree' is decorated by the addition of an ordering function, which various other parts of the tree can call upon for their own purposes. You can't save an ordered relation, pass it as an argument or distinguish it in any way. In TTM terms R .order(a) = R. They are the very same value.

So in the case of

S .order(SNAME) .select{ SNAME, Running := fold(+,STATUS) }

You might find it easier to understand if it was syntactically rewritten as

S .select{ SNAME, Running := fold(+,STATUS,{SNAME}) }

The ordering has its effect on the function, not on the data. The fold function, under the influence of an ordering argument, will sort its input privately in computing a result.

Here's another. This function adds a sequence number based on an ordering by name:

S .order(SNAME) .select{ Sequence := ORD() }

Your syntax implies a chain of operator invocations -- value of S passed to .order(), .order() evaluated and its result value passed to .select(), .select() invoking a lambda expression fold(...) -- such that the result value of the .order() evaluation must be a relation + order value -- or an "ordered relation" or non-relation -- and can't be just a relation value.

It provides a convenient mental model. I've explained what's really going on; if you were the intended customer I would listen carefully and perhaps experiment with a different syntax. The end result and the capabilities of the feature would be identical.

Your description implies that the whole expression isn't a chain of operator invocations; it's actually a calculus that looks like a chain of operator invocations.

I'm afraid I find it all rather baffling.

Willfully, I suspect. It really isn't complicated.

You might rewrite it as:

S .select{ Sequence := ORD({SNAME}) }

You could add this to Rel easily. This is simply EXTEND, with an the ORD() function that takes one or more attribute names as its argument.

Sorry, I don't understand that at all.

Trivial, to coin a phrase. ORD(attribute-list) is a function invoked for each tuple. It returns the ordinal of that tuple with respect to the ordering on attribute-list. In the case of the standard supplier data it will extend the relation by the values 1,2,3,4,5 for Adams,Blake,Clarke,Jones,Smith respectively. This syntax is not implemented: in Andl the syntax is:

S .order(SNAME) .select{ Sequence := ORD() }

I'm short of time right now, but if you like I can run some samples to show the output.

I don't see it that way. There is absolutely no point, internally or externally, where a sorted version of the relation comes into existence. Conceptually, the only thing added to the data model is an ordering function. The expression 'tree' is decorated by the addition of an ordering function, which various other parts of the tree can call upon for their own purposes. You can't save an ordered relation, pass it as an argument or distinguish it in any way. In TTM terms R .order(a) = R. They are the very same value.

So in the case of

S .order(SNAME) .select{ SNAME, Running := fold(+,STATUS) }

You might find it easier to understand if it was syntactically rewritten as

S .select{ SNAME, Running := fold(+,STATUS,{SNAME}) }

The ordering has its effect on the function, not on the data. The fold function, under the influence of an ordering argument, will sort its input privately in computing a result.

Here's another. This function adds a sequence number based on an ordering by name:

S .order(SNAME) .select{ Sequence := ORD() }

Your syntax implies a chain of operator invocations -- value of S passed to .order(), .order() evaluated and its result value passed to .select(), .select() invoking a lambda expression fold(...) -- such that the result value of the .order() evaluation must be a relation + order value -- or an "ordered relation" or non-relation -- and can't be just a relation value.

It provides a convenient mental model. I've explained what's really going on; if you were the intended customer I would listen carefully and perhaps experiment with a different syntax. The end result and the capabilities of the feature would be identical.

Your description implies that the whole expression isn't a chain of operator invocations; it's actually a calculus that looks like a chain of operator invocations.

I'm afraid I find it all rather baffling.

Willfully, I suspect. It really isn't complicated.

You might rewrite it as:

S .select{ Sequence := ORD({SNAME}) }

You could add this to Rel easily. This is simply EXTEND, with an the ORD() function that takes one or more attribute names as its argument.

Sorry, I don't understand that at all.

Trivial, to coin a phrase. ORD(attribute-list) is a function invoked for each tuple. It returns the ordinal of that tuple with respect to the ordering on attribute-list. In the case of the standard supplier data it will extend the relation by the values 1,2,3,4,5 for Adams,Blake,Clarke,Jones,Smith respectively. This syntax is not implemented: in Andl the syntax is:

S .order(SNAME) .select{ Sequence := ORD() }

I'm short of time right now, but if you like I can run some samples to show the output.

Andl - A New Database Language - andl.org

#52 · October 29, 2020, 2:09 pm

Quote from Dave Voorhis on October 29, 2020, 2:09 pm

Quote from dandl on October 29, 2020, 1:49 pm

I don't see it that way. There is absolutely no point, internally or externally, where a sorted version of the relation comes into existence. Conceptually, the only thing added to the data model is an ordering function. The expression 'tree' is decorated by the addition of an ordering function, which various other parts of the tree can call upon for their own purposes. You can't save an ordered relation, pass it as an argument or distinguish it in any way. In TTM terms R .order(a) = R. They are the very same value.

So in the case of

S .order(SNAME) .select{ SNAME, Running := fold(+,STATUS) }

You might find it easier to understand if it was syntactically rewritten as

S .select{ SNAME, Running := fold(+,STATUS,{SNAME}) }

The ordering has its effect on the function, not on the data. The fold function, under the influence of an ordering argument, will sort its input privately in computing a result.

Here's another. This function adds a sequence number based on an ordering by name:

S .order(SNAME) .select{ Sequence := ORD() }

Your syntax implies a chain of operator invocations -- value of S passed to .order(), .order() evaluated and its result value passed to .select(), .select() invoking a lambda expression fold(...) -- such that the result value of the .order() evaluation must be a relation + order value -- or an "ordered relation" or non-relation -- and can't be just a relation value.

It provides a convenient mental model. I've explained what's really going on; if you were the intended customer I would listen carefully and perhaps experiment with a different syntax. The end result and the capabilities of the feature would be identical.

Your description implies that the whole expression isn't a chain of operator invocations; it's actually a calculus that looks like a chain of operator invocations.

I'm afraid I find it all rather baffling.

Willfully, I suspect. It really isn't complicated.

No, not willfully at all. I'm afraid I can't make heads or tails of this. Perhaps there's some key element that you understand so intuitively that it hasn't been mentioned, but is necessary to appreciate what's going on.

To me, it looks like .order() emits an ordered relation-like set of tuples that other operators can either treat as a relation or as an ordered relation-like set of tuples.

You might rewrite it as:

S .select{ Sequence := ORD({SNAME}) }

You could add this to Rel easily. This is simply EXTEND, with an the ORD() function that takes one or more attribute names as its argument.

Sorry, I don't understand that at all.

Trivial, to coin a phrase. ORD(attribute-list) is a function invoked for each tuple. It returns the ordinal of that tuple with respect to the ordering on attribute-list. In the case of the standard supplier data it will extend the relation by the values 1,2,3,4,5 for Adams,Blake,Clarke,Jones,Smith respectively. This syntax is not implemented: in Andl the syntax is:

S .order(SNAME) .select{ Sequence := ORD() }

I'm short of time right now, but if you like I can run some samples to show the output.

Ah! You mean RANK:

S RANK (ASC SNAME AS Ranking)

S#
S# SNAME
NAME STATUS
INTEGER CITY
CHARACTER Ranking
INTEGER

S#("S5") NAME("Adams") 30 Athens 1

S#("S3") NAME("Blake") 30 Paris 2

S#("S4") NAME("Clark") 20 London 3

S#("S2") NAME("Jones") 10 Paris 4

S#("S1") NAME("Smith") 20 London 5

Quote from dandl on October 29, 2020, 1:49 pm

I don't see it that way. There is absolutely no point, internally or externally, where a sorted version of the relation comes into existence. Conceptually, the only thing added to the data model is an ordering function. The expression 'tree' is decorated by the addition of an ordering function, which various other parts of the tree can call upon for their own purposes. You can't save an ordered relation, pass it as an argument or distinguish it in any way. In TTM terms R .order(a) = R. They are the very same value.

So in the case of

S .order(SNAME) .select{ SNAME, Running := fold(+,STATUS) }

You might find it easier to understand if it was syntactically rewritten as

S .select{ SNAME, Running := fold(+,STATUS,{SNAME}) }

The ordering has its effect on the function, not on the data. The fold function, under the influence of an ordering argument, will sort its input privately in computing a result.

Here's another. This function adds a sequence number based on an ordering by name:

S .order(SNAME) .select{ Sequence := ORD() }

Your syntax implies a chain of operator invocations -- value of S passed to .order(), .order() evaluated and its result value passed to .select(), .select() invoking a lambda expression fold(...) -- such that the result value of the .order() evaluation must be a relation + order value -- or an "ordered relation" or non-relation -- and can't be just a relation value.

It provides a convenient mental model. I've explained what's really going on; if you were the intended customer I would listen carefully and perhaps experiment with a different syntax. The end result and the capabilities of the feature would be identical.

Your description implies that the whole expression isn't a chain of operator invocations; it's actually a calculus that looks like a chain of operator invocations.

I'm afraid I find it all rather baffling.

Willfully, I suspect. It really isn't complicated.

No, not willfully at all. I'm afraid I can't make heads or tails of this. Perhaps there's some key element that you understand so intuitively that it hasn't been mentioned, but is necessary to appreciate what's going on.

To me, it looks like .order() emits an ordered relation-like set of tuples that other operators can either treat as a relation or as an ordered relation-like set of tuples.

You might rewrite it as:

S .select{ Sequence := ORD({SNAME}) }

You could add this to Rel easily. This is simply EXTEND, with an the ORD() function that takes one or more attribute names as its argument.

Sorry, I don't understand that at all.

Trivial, to coin a phrase. ORD(attribute-list) is a function invoked for each tuple. It returns the ordinal of that tuple with respect to the ordering on attribute-list. In the case of the standard supplier data it will extend the relation by the values 1,2,3,4,5 for Adams,Blake,Clarke,Jones,Smith respectively. This syntax is not implemented: in Andl the syntax is:

S .order(SNAME) .select{ Sequence := ORD() }

I'm short of time right now, but if you like I can run some samples to show the output.

Ah! You mean RANK:

S RANK (ASC SNAME AS Ranking)

S# S#	SNAME NAME	STATUS INTEGER	CITY CHARACTER	Ranking INTEGER
S#("S5")	NAME("Adams")	30	Athens	1
S#("S3")	NAME("Blake")	30	Paris	2
S#("S4")	NAME("Clark")	20	London	3
S#("S2")	NAME("Jones")	10	Paris	4
S#("S1")	NAME("Smith")	20	London	5

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#53 · October 29, 2020, 9:11 pm

Quote from dandl on October 29, 2020, 5:09 am

BTW SQL has no sorted tables either.

I believe that it does. This is the result type of a query that uses ORDER BY.

Although I would call it Tuple_Array.

#54 · October 29, 2020, 9:28 pm

Quote from Erwin on October 29, 2020, 9:28 pm

Quote from Dave Voorhis on October 29, 2020, 2:09 pm

Ah! You mean RANK:

S RANK (ASC SNAME AS Ranking)

S#
S# SNAME
NAME STATUS
INTEGER CITY
CHARACTER Ranking
INTEGER

S#("S5") NAME("Adams") 30 Athens 1

S#("S3") NAME("Blake") 30 Paris 2

S#("S4") NAME("Clark") 20 London 3

S#("S2") NAME("Jones") 10 Paris 4

S#("S1") NAME("Smith") 20 London 5

And FWIW, here's how RANK is a shorthand for a combo of basic constructs already present in any language that can validly claim to be TTM-conformant :

EXTEND R with RVA1 : R WHERE [for the example at hand] NAME < NAME [there's a disambiguation problem to be resolved between the NAME of the "outer" R and that of the "inner" R but RENAME can handle that]. The point is, the tuple with S#("S5") gets an empty relation value for RVA1.

EXTEND that with RANK := COUNT(RVA1) + 1 and project away RVA1. Yields 1 for the tuple with S#("S5").

If ties are possible and present, then both(/all) of the tuples with NAME("Adams") will get RANK 1, and there will be no tuple with RANK 2.

If ties are possible and present but both(/all) must be assigned RANK 2 (/or the number that represents however many of them there are) then replace "R WHERE NAME < NAME" with "R WHERE NAME <= NAME" and replace "COUNT(RVA1) + 1" with just "COUNT(RVA1) ". First step will get an RVA with all the NAME("ADAMS") there are and second step will yield RANK 2 if there are 2, etc. etc.

Combos of ascending-on-this-and-then-descending-on-that are addressed by the obvious conjuncts of the ilk "TOPMOST_ATTR < TOPMOST_ATTR OR (TOPMOST_ATTR = TOPMOST_ATTR AND NEXTMOST_ATTR > NEXTMOST_ATTR)". Same remark as formerly made about addressing ambiguity in the names here still applies, the issue is still disregarded because that's not the main point of this post.

And to head off at the pass silly remarks like "that's not how I would implement it" : that's not how I would implement it either. This explanation merely serves to explain/define ***SEMANTICS***.

Quote from Dave Voorhis on October 29, 2020, 2:09 pm

Ah! You mean RANK:

S RANK (ASC SNAME AS Ranking)

S#
S# SNAME
NAME STATUS
INTEGER CITY
CHARACTER Ranking
INTEGER

S#("S5") NAME("Adams") 30 Athens 1

S#("S3") NAME("Blake") 30 Paris 2

S#("S4") NAME("Clark") 20 London 3

S#("S2") NAME("Jones") 10 Paris 4

S#("S1") NAME("Smith") 20 London 5

And FWIW, here's how RANK is a shorthand for a combo of basic constructs already present in any language that can validly claim to be TTM-conformant :

EXTEND R with RVA1 : R WHERE [for the example at hand] NAME < NAME [there's a disambiguation problem to be resolved between the NAME of the "outer" R and that of the "inner" R but RENAME can handle that]. The point is, the tuple with S#("S5") gets an empty relation value for RVA1.
EXTEND that with RANK := COUNT(RVA1) + 1 and project away RVA1. Yields 1 for the tuple with S#("S5").

If ties are possible and present, then both(/all) of the tuples with NAME("Adams") will get RANK 1, and there will be no tuple with RANK 2.

If ties are possible and present but both(/all) must be assigned RANK 2 (/or the number that represents however many of them there are) then replace "R WHERE NAME < NAME" with "R WHERE NAME <= NAME" and replace "COUNT(RVA1) + 1" with just "COUNT(RVA1) ". First step will get an RVA with all the NAME("ADAMS") there are and second step will yield RANK 2 if there are 2, etc. etc.

Combos of ascending-on-this-and-then-descending-on-that are addressed by the obvious conjuncts of the ilk "TOPMOST_ATTR < TOPMOST_ATTR OR (TOPMOST_ATTR = TOPMOST_ATTR AND NEXTMOST_ATTR > NEXTMOST_ATTR)". Same remark as formerly made about addressing ambiguity in the names here still applies, the issue is still disregarded because that's not the main point of this post.

And to head off at the pass silly remarks like "that's not how I would implement it" : that's not how I would implement it either. This explanation merely serves to explain/define ***SEMANTICS***.

Author of SIRA_PRISE

#55 · October 29, 2020, 9:35 pm

Quote from Darren Duncan on October 29, 2020, 9:11 pm

Quote from dandl on October 29, 2020, 5:09 am

BTW SQL has no sorted tables either.

I believe that it does. This is the result type of a query that uses ORDER BY.

Although I would call it Tuple_Array.

Hmmmmmmmmmmm. ORDER BY in SQL is available only on the outermost SELECT. Inner SELECTs (which are basically just queries in their own right even if correlated to attribute values of some outer SELECT) cannot have ORDER BY. (They ***can*** have a notion of ordering if they involve windowing functions but that is not the same thing as ORDER BY which applies to the result of the SELECT itself.) That at least supports the claim that "SQL has no sorted tables ***in general***".

Author of SIRA_PRISE

#56 · October 29, 2020, 10:31 pm

Quote from Erwin on October 29, 2020, 9:35 pm

Quote from Darren Duncan on October 29, 2020, 9:11 pm

Quote from dandl on October 29, 2020, 5:09 am

BTW SQL has no sorted tables either.

I believe that it does. This is the result type of a query that uses ORDER BY.

Although I would call it Tuple_Array.

Hmmmmmmmmmmm. ORDER BY in SQL is available only on the outermost SELECT. Inner SELECTs (which are basically just queries in their own right even if correlated to attribute values of some outer SELECT) cannot have ORDER BY. (They ***can*** have a notion of ordering if they involve windowing functions but that is not the same thing as ORDER BY which applies to the result of the SELECT itself.) That at least supports the claim that "SQL has no sorted tables ***in general***".

Just because ORDER BY isn't allowed in all parts of a query doesn't mean sorted tuples don't exist AT ALL which is what "has no" means.

A query with an ORDER BY in the outermost SELECT is logically distinct from one that doesn't, because the results are considered to be ordered versus not ordered, and so the results have different types.

#57 · October 30, 2020, 1:49 am

Trivial, to coin a phrase. ORD(attribute-list) is a function invoked for each tuple. It returns the ordinal of that tuple with respect to the ordering on attribute-list. In the case of the standard supplier data it will extend the relation by the values 1,2,3,4,5 for Adams,Blake,Clarke,Jones,Smith respectively. This syntax is not implemented: in Andl the syntax is:

S .order(SNAME) .select{ Sequence := ORD() }

I'm short of time right now, but if you like I can run some samples to show the output.

Ah! You mean RANK:

S RANK (ASC SNAME AS Ranking)

S#
S# SNAME
NAME STATUS
INTEGER CITY
CHARACTER Ranking
INTEGER

S#("S5") NAME("Adams") 30 Athens 1

S#("S3") NAME("Blake") 30 Paris 2

S#("S4") NAME("Clark") 20 London 3

S#("S2") NAME("Jones") 10 Paris 4

S#("S1") NAME("Smith") 20 London 5

Now we're getting somewhere. Yes, in effect you have EXTEND with a function RANK(asc_or_desc, attribute). This is a kind of ordered query, exactly like Andl, with a function that does some kind of internal sort on relational data. You can apply exactly the same principle to create other ordered query functions: skip/take, lead/lag and so on. The principle is the same, only the syntax is rather different.

Andl - A New Database Language - andl.org

#58 · October 30, 2020, 1:53 am

Quote from Darren Duncan on October 29, 2020, 9:11 pm

Quote from dandl on October 29, 2020, 5:09 am

BTW SQL has no sorted tables either.

I believe that it does. This is the result type of a query that uses ORDER BY.

Although I would call it Tuple_Array.

The result of an SQL query is not a table, it's a result set. Tables in SQL are not sorted.

Andl - A New Database Language - andl.org

#59 · October 30, 2020, 5:00 am

Quote from dandl on October 29, 2020, 1:49 pm

Trivial, to coin a phrase. ORD(attribute-list) is a function invoked for each tuple. It returns the ordinal of that tuple with respect to the ordering on attribute-list. In the case of the standard supplier data it will extend the relation by the values 1,2,3,4,5 for Adams,Blake,Clarke,Jones,Smith respectively. This syntax is not implemented: in Andl the syntax is:

S .order(SNAME) .select{ Sequence := ORD() }

I'm short of time right now, but if you like I can run some samples to show the output.

I think it's just a syntactic issue you have, for instance, if

S .order(SNAME) == S

then most people would conclude that

S .order(SNAME) .select{ Sequence := ORD() } == S .select{ Sequence := ORD() }

If you tweak the syntax to not imply this is the case, then I think some of the objections go away.

#60 · October 30, 2020, 5:02 am

Quote from dandl on October 30, 2020, 1:49 am

Trivial, to coin a phrase. ORD(attribute-list) is a function invoked for each tuple. It returns the ordinal of that tuple with respect to the ordering on attribute-list. In the case of the standard supplier data it will extend the relation by the values 1,2,3,4,5 for Adams,Blake,Clarke,Jones,Smith respectively. This syntax is not implemented: in Andl the syntax is:

S .order(SNAME) .select{ Sequence := ORD() }

I'm short of time right now, but if you like I can run some samples to show the output.

Ah! You mean RANK:

S RANK (ASC SNAME AS Ranking)

S#
S# SNAME
NAME STATUS
INTEGER CITY
CHARACTER Ranking
INTEGER

S#("S5") NAME("Adams") 30 Athens 1

S#("S3") NAME("Blake") 30 Paris 2

S#("S4") NAME("Clark") 20 London 3

S#("S2") NAME("Jones") 10 Paris 4

S#("S1") NAME("Smith") 20 London 5

Now we're getting somewhere. Yes, in effect you have EXTEND with a function RANK(asc_or_desc, attribute). This is a kind of ordered query, exactly like Andl, with a function that does some kind of internal sort on relational data. You can apply exactly the same principle to create other ordered query functions: skip/take, lead/lag and so on. The principle is the same, only the syntax is rather different.

What is being said here is that you can describe all window functions using a extend with rank feature, without needing the concept of a sorted list of tuples (presumably without needing a 'streaming' concept either?

The Forum for Discussion about The Third Manifesto and Related Matters