The Forum for Discussion about The Third Manifesto and Related Matters

Please or Register to create posts and topics.

Why not relational ordering operators?

Page 1 of 3Next

RM Pro 2 says:

D shall include no concept of a “relation” whose tuples are distinguishable by ordinal position. Instead, for
every relation r expressible in D, the tuples of r shall be distinguishable by value.

A relational ordering operator means the definition of a less function. As per RM Pro 2 brief discussion in DTTRM, the operator is not relational, it is a generic operator with two tuple arguments of the same type and returning a Boolean result. There is no concept of 'distinguished by ordinal position', but it would enable some useful queries, such as first, last, skip, take, lead, lag and rank.

I find the idea mooted in DTATRM and found in TD of having a separate ARRAY type absolutely pointless and unnecessary; the ordering operator solution is simple and self-evident. Without this, SQL does the job better.

Andl - A New Database Language - andl.org
Quote from dandl on May 17, 2020, 2:16 am

RM Pro 2 says:

D shall include no concept of a “relation” whose tuples are distinguishable by ordinal position. Instead, for
every relation r expressible in D, the tuples of r shall be distinguishable by value.

A relational ordering operator means the definition of a less function.

There are relational comparison operators in Tutorial D: a relation is a set, so the comparison is subset-of. Do you want to call that less?

As per RM Pro 2 brief discussion in DTTRM, the operator is not relational, it is a generic operator with two tuple arguments of the same type and returning a Boolean result. There is no concept of 'distinguished by ordinal position', but it would enable some useful queries, such as first, last, skip, take, lead, lag and rank.

A tuple is a set; so I suppose there could be a subset-of comparison between tuples. I don't see that's useful, because it would need the tuples to have different Headings. And it seems that's not what you mean. Then take these two tuples with the same Heading:

  • TUP{P# P2, S# S1, QTY 50}
  • TUP{S# S4, P# P3, QTY 50}

Which way do you want to rank/order them? Why? They have the same Heading as relvar SP, whose key is {P#, S#}{S#, P#}. Keys are sets not sequences; what reason would there be to order by S# within P# or vice versa?

I find the idea mooted in DTATRM and found in TD of having a separate ARRAY type absolutely pointless and unnecessary; the ordering operator solution is simple and self-evident. Without this, SQL does the job better.

Not at all evident to me. For SQL ORDER BY you specify a sort-ordering of column names. ORDER BY S#, P# gives a different result/different ranking vs ORDER BY P#, S#. So that's not distinguishing rows by ordinal position; nor columns by ordinal position; it's ad-hoc ordering in the SELECT statement. IIRC Tutorial D has some equivalent. I seldom used it in SQL in recent decades -- ever since Indexed-sequential file organisation died out. It's handy in alphabetical search by name, I suppose.

TTM requires support for quota queries, so it must require some ability to order/rank results.

Addit: Ah, Tutorial D's "equivalent" is probably me half-remembering LOAD ... ORDER( ); I think Rel has a PRINT ... ORDER( ); and I think there's been previous discussion on the forum. The 'result' from PRINT is not a value that can be manipulated, it can only be ... printed. It's not a relation (because it's ordered), so what would it mean to JOIN to it or EXTEND/Project/etc? Note that SQL ORDER BY can only appear at the outermost SELECT, not sub-SELECTs.

There are occasional questions on StackOverflow about how to extend a table with a ranking ordinal. They all seem to stem from textbooks that haven't been revised for 30 years. Hmm I suppose bank/account statements still like to show running balances.

Quote from AntC on May 17, 2020, 5:57 am
Quote from dandl on May 17, 2020, 2:16 am

RM Pro 2 says:

D shall include no concept of a “relation” whose tuples are distinguishable by ordinal position. Instead, for
every relation r expressible in D, the tuples of r shall be distinguishable by value.

A relational ordering operator means the definition of a less function.

There are relational comparison operators in Tutorial D: a relation is a set, so the comparison is subset-of. Do you want to call that less?

No, less is an ordering operator, to be supplied by the programmer. For example, for the purpose of ranking the S relation by CITY the operator might define that a tuple is less if the CITY name is shorter. Then the ranking will be by length of city name.

Not at all evident to me. For SQL ORDER BY you specify a sort-ordering of column names. ORDER BY S#, P# gives a different result/different ranking vs ORDER BY P#, S#. So that's not distinguishing rows by ordinal position; nor columns by ordinal position; it's ad-hoc ordering in the SELECT statement. IIRC Tutorial D has some equivalent. I seldom used it in SQL in recent decades -- ever since Indexed-sequential file organisation died out. It's handy in alphabetical search by name, I suppose.

Correct. It's a vital feature if you want queries with paging (skip, take) but it also enables queries with lag, lead, rank, running sum and the like. It's a gap in TTM that's really cumbersome to fill any other way.

TTM requires support for quota queries, so it must require some ability to order/rank results.

Yes. I dislike the term because (like GTC) it picks out one part of a more general problem. The example in DTATRM shows ranking by WEIGHT the hard way.

But yes, I guess you could say this is an exploration of a more general view of RM VSS 5.

Andl - A New Database Language - andl.org

Conceptually tuples don't have a natural sorting order in the general case so applying "less" etc to them is not useful for a normal user query.

What is actually useful is having a user-defined mapping function that takes a tuple as input and results in something which does have a natural sort order, and then when a user wants to sort tuples they say use this function to facilitate it.

This is a more general form of how SQL does it, where the SQL form has that mapping function extract one or more attribute in an ordered sequence.

The only value I see in a generic system-defined tuple "less" function is if one wants to put a relation in AN canonical order for the purpose of hashing the entire relation for indexing it as an RVA or such thing.

Quote from dandl on May 17, 2020, 2:16 am

RM Pro 2 says:

D shall include no concept of a “relation” whose tuples are distinguishable by ordinal position. Instead, for
every relation r expressible in D, the tuples of r shall be distinguishable by value.

A relational ordering operator means the definition of a less function. As per RM Pro 2 brief discussion in DTTRM, the operator is not relational, it is a generic operator with two tuple arguments of the same type and returning a Boolean result. There is no concept of 'distinguished by ordinal position', but it would enable some useful queries, such as first, last, skip, take, lead, lag and rank.

I find the idea mooted in DTATRM and found in TD of having a separate ARRAY type absolutely pointless and unnecessary; the ordering operator solution is simple and self-evident. Without this, SQL does the job better.

What job ?

"No concept of tuples distinguished by ordinal position" does ***not*** mean that "some useful queries" are ruled out.  They are not ruled out.  Re-read the information principle.  If information is relevant to the user, then that information shall be present as ***values of attributes in tuples in relations***.  The only thing forbidden is that the system starts offering facilities that go beyond this rule, such as, e.g., "ordered" relations where "position of a tuple" has a material effect on, e.g., the results of the relation equality operator, that is to say, where "equality of ordered relations" is like java's LinkedList.equals() and not Set.equals().  What is manifestly not forbidden, is relations whose heading includes a RANK or ORDINAL_POS attribute.  Fits the IP perfectly, and is therefore the way to go for your "some useful queries".

Quote from AntC on May 17, 2020, 5:57 am
Quote from dandl on May 17, 2020, 2:16 am

RM Pro 2 says:

D shall include no concept of a “relation” whose tuples are distinguishable by ordinal position. Instead, for
every relation r expressible in D, the tuples of r shall be distinguishable by value.

A relational ordering operator means the definition of a less function.

There are relational comparison operators in Tutorial D: a relation is a set, so the comparison is subset-of. Do you want to call that less?

As per RM Pro 2 brief discussion in DTTRM, the operator is not relational, it is a generic operator with two tuple arguments of the same type and returning a Boolean result. There is no concept of 'distinguished by ordinal position', but it would enable some useful queries, such as first, last, skip, take, lead, lag and rank.

A tuple is a set; so I suppose there could be a subset-of comparison between tuples. I don't see that's useful, because it would need the tuples to have different Headings. And it seems that's not what you mean. Then take these two tuples with the same Heading:

  • TUP{P# P2, S# S1, QTY 50}
  • TUP{S# S4, P# P3, QTY 50}

Which way do you want to rank/order them? Why? They have the same Heading as relvar SP, whose key is {P#, S#}{S#, P#}. Keys are sets not sequences; what reason would there be to order by S# within P# or vice versa?

I find the idea mooted in DTATRM and found in TD of having a separate ARRAY type absolutely pointless and unnecessary; the ordering operator solution is simple and self-evident. Without this, SQL does the job better.

Not at all evident to me. For SQL ORDER BY you specify a sort-ordering of column names. ORDER BY S#, P# gives a different result/different ranking vs ORDER BY P#, S#. So that's not distinguishing rows by ordinal position; nor columns by ordinal position; it's ad-hoc ordering in the SELECT statement. IIRC Tutorial D has some equivalent. I seldom used it in SQL in recent decades -- ever since Indexed-sequential file organisation died out. It's handy in alphabetical search by name, I suppose.

TTM requires support for quota queries, so it must require some ability to order/rank results.

Addit: Ah, Tutorial D's "equivalent" is probably me half-remembering LOAD ... ORDER( ); I think Rel has a PRINT ... ORDER( ); and I think there's been previous discussion on the forum. The 'result' from PRINT is not a value that can be manipulated, it can only be ... printed. It's not a relation (because it's ordered), so what would it mean to JOIN to it or EXTEND/Project/etc? Note that SQL ORDER BY can only appear at the outermost SELECT, not sub-SELECTs.

There are occasional questions on StackOverflow about how to extend a table with a ranking ordinal. They all seem to stem from textbooks that haven't been revised for 30 years. Hmm I suppose bank/account statements still like to show running balances.

Rel defines an operator called ORDER(...) that returns an ARRAY (an ordered, random-access bag of tuples) for a relation operand. It's shorthand for declaring an ARRAY and using LOAD to populate it. There is no PRINT.

Rel also defines its inverse called UNORDER() that returns a relation given an ARRAY operand.

You can certainly define a system inspired by the relational model that doesn't use relations, so that you can specify row/tuple orders (and have duplicate rows?) should you want them. That's what SQL does. A correspondent of mine described such systems (including SQL) as implementing the "bagatational" model because such row-ordered, duplicate-allowing containers are bags, not relations.

In the relational model, tuples appearing in a relation in some specified ordering is meaningless, as that's not a property of a relation. By definition, a specific tuple ordering can only exist in some non-relation construct. As already noted, the only way to "order" tuples in a relation is to represent the tuple order with a rank attribute.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on May 17, 2020, 9:22 am

You can certainly define a system inspired by the relational model that doesn't use relations, so that you can specify row/tuple orders (and have duplicate rows?) should you want them. That's what SQL does. A correspondent of mine described such systems (including SQL) as implementing the "bagatational" model because such row-ordered, duplicate-allowing containers are bags, not relations.

In the relational model, tuples appearing in a relation in some specified ordering is meaningless, as that's not a property of a relation. By definition, a specific tuple ordering can only exist in some non-relation construct. As already noted, the only way to "order" tuples in a relation is to represent the tuple order with a rank attribute.

I would say there are actually THREE interesting variants of a tuple collection here, which are defined over sets, bags, and arrays respectively.

  • A normal relation is based on a set of tuples, being unordered and without duplicates.
  • A bag of tuples like SQL's TABLE is ALSO unordered, and its properties are a fairly straightforward generalization of a relation.
  • An array of tuples is the only option that is ordered, and it also allows duplicates, this is what the result of ORDER BY is.

All 3 of the above cases can be relation-like in having the concept of a heading and a tuple-composed body with generalizations of the same behaviors.

 

Quote from Darren Duncan on May 17, 2020, 7:38 am

Conceptually tuples don't have a natural sorting order in the general case so applying "less" etc to them is not useful for a normal user query.

What is actually useful is having a user-defined mapping function that takes a tuple as input and results in something which does have a natural sort order, and then when a user wants to sort tuples they say use this function to facilitate it.

No, that's the wrong concept. My less function ranks two tuples for the purpose of a query, such as ranking parts by weight.

This is a more general form of how SQL does it, where the SQL form has that mapping function extract one or more attribute in an ordered sequence.

If you mean ORDER BY, it's not easy to use that for ranking.

The only value I see in a generic system-defined tuple "less" function is if one wants to put a relation in AN canonical order for the purpose of hashing the entire relation for indexing it as an RVA or such thing.

The only value? I listed a series of queries that depend on it: rank, skip/take, lag/lead, running sum etc.

Andl - A New Database Language - andl.org
Quote from Darren Duncan on May 17, 2020, 11:00 pm
Quote from Dave Voorhis on May 17, 2020, 9:22 am

You can certainly define a system inspired by the relational model that doesn't use relations, so that you can specify row/tuple orders (and have duplicate rows?) should you want them. That's what SQL does. A correspondent of mine described such systems (including SQL) as implementing the "bagatational" model because such row-ordered, duplicate-allowing containers are bags, not relations.

In the relational model, tuples appearing in a relation in some specified ordering is meaningless, as that's not a property of a relation. By definition, a specific tuple ordering can only exist in some non-relation construct. As already noted, the only way to "order" tuples in a relation is to represent the tuple order with a rank attribute.

I would say there are actually THREE interesting variants of a tuple collection here, which are defined over sets, bags, and arrays respectively.

  • A normal relation is based on a set of tuples, being unordered and without duplicates.
  • A bag of tuples like SQL's TABLE is ALSO unordered, and its properties are a fairly straightforward generalization of a relation.
  • An array of tuples is the only option that is ordered, and it also allows duplicates, this is what the result of ORDER BY is.

All 3 of the above cases can be relation-like in having the concept of a heading and a tuple-composed body with generalizations of the same behaviors.

"Relation-like"?

That borders on a 4 being 3-like because they only differ by 1.

:-)

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Erwin on May 17, 2020, 8:47 am
Quote from dandl on May 17, 2020, 2:16 am

RM Pro 2 says:

D shall include no concept of a “relation” whose tuples are distinguishable by ordinal position. Instead, for
every relation r expressible in D, the tuples of r shall be distinguishable by value.

A relational ordering operator means the definition of a less function. As per RM Pro 2 brief discussion in DTTRM, the operator is not relational, it is a generic operator with two tuple arguments of the same type and returning a Boolean result. There is no concept of 'distinguished by ordinal position', but it would enable some useful queries, such as first, last, skip, take, lead, lag and rank.

I find the idea mooted in DTATRM and found in TD of having a separate ARRAY type absolutely pointless and unnecessary; the ordering operator solution is simple and self-evident. Without this, SQL does the job better.

What job ?

"No concept of tuples distinguished by ordinal position" does ***not*** mean that "some useful queries" are ruled out.  They are not ruled out.  Re-read the information principle.  If information is relevant to the user, then that information shall be present as ***values of attributes in tuples in relations***.  The only thing forbidden is that the system starts offering facilities that go beyond this rule, such as, e.g., "ordered" relations where "position of a tuple" has a material effect on, e.g., the results of the relation equality operator, that is to say, where "equality of ordered relations" is like java's LinkedList.equals() and not Set.equals().  What is manifestly not forbidden, is relations whose heading includes a RANK or ORDINAL_POS attribute.  Fits the IP perfectly, and is therefore the way to go for your "some useful queries".

I think you're agreeing with me. TTM does not forbid attributes of that kind, and presumably the operators required to compute them. The functionless(tuple,tuple) is such an operator.

Andl - A New Database Language - andl.org
Page 1 of 3Next