ANNOUNCE: Rel version 3.013 now available!

#11 · December 10, 2018, 11:08 am

Quote from Erwin on December 10, 2018, 11:04 am

Quote from Dave Voorhis on December 10, 2018, 10:56 am

[1] Not only NULL-laden, but ambiguous -- was the NULL in the original data, or a result of the outer join?

Surely, "NULL in the original data" cannot occur if we're doing this in Rel ?

Indeed, SQL NULL cannot occur in Rel, but the use of ZOO attributes as a pseudo-NULL brings back the same problems.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#12 · December 12, 2018, 3:54 am

Quote from Brian S on December 12, 2018, 3:54 am

Quote from Dave Voorhis on December 10, 2018, 10:56 am

Quote from Brian S on December 10, 2018, 2:03 am

Quote from Dave Voorhis on December 9, 2018, 11:48 am

Quote from Brian S on December 9, 2018, 8:58 am

I would have preferred to see a single relation having ZOO attributes instead of a tuple containing multiple relations.

What would be the benefit(s) of your ZOO attribute approach, compared to the tuple-of-RVAs approach I've used?

Multiple joins

A LEFT OUTER JOIN B LEFT OUTER JOIN C LEFT OUTER JOIN D, etc.

I'm sure you've seen SQL queries with several outer joins targeting multiple tables which yield a single result set. How would you make that happen with your solution that yields a tuple containing multiple relations for each outer join? While I'm sure it can be done, it certainly wouldn't be intuitive for practitioners with even a cursory understanding of SQL.

A chain of multiple LEFT OUTER joins all resulting in a single NULL-laden¹ result is very much a SQLism. How likely is a practitioner with even a cursory understanding of a D likely to want to implement it?

--
[1] Not only NULL-laden, but ambiguous -- was the NULL in the original data, or a result of the outer join?

I prefer the flexibility of returning a single result set from a series of inner and outer joins--especially when the database is highly normalized. Sometimes it makes sense to restrict before a join, but other times it makes better sense to restrict after or to inject default values via COALESCE.

For example,

Warehouse {WhseKey, WhseID, ...}

Item {ItemKey, ItemID, ...}

Inventory {WhseKey, ItemKey, QtyOnSO, QtyOnPO, QtyReqForXO, QtyOnXO, ...}

InvtQOH {WhseKey, ItemKey, QtyPendDecrease, QtyPendIncrease, QtyOnHand} KEY {WhseKey, ItemKey}

Inventory only contains a tuple for a particular WhseKey, ItemKey combination when a particular item can be located in a particular warehouse.

InvtQOH only contains a tuple for a particular WhseKey, ItemKey combination when any of QtyPendDecrease, QtyPendIncrease or QtyOnHand is greater than zero.

Now find out how many of a particular Item are on-hand in each warehouse:

SELECT WhseId, ItemId, COALESCE(QtyOnHand, 0) AS QtyOnHand FROM Warehouse JOIN Item JOIN Inventory LEFT OUTER JOIN InvtQOH

[1] Does it matter? If it does, change the query to reflect that it does.

Quote from Dave Voorhis on December 10, 2018, 10:56 am

Quote from Brian S on December 10, 2018, 2:03 am

Quote from Dave Voorhis on December 9, 2018, 11:48 am

Quote from Brian S on December 9, 2018, 8:58 am

I would have preferred to see a single relation having ZOO attributes instead of a tuple containing multiple relations.

What would be the benefit(s) of your ZOO attribute approach, compared to the tuple-of-RVAs approach I've used?

Multiple joins

A LEFT OUTER JOIN B LEFT OUTER JOIN C LEFT OUTER JOIN D, etc.

I'm sure you've seen SQL queries with several outer joins targeting multiple tables which yield a single result set. How would you make that happen with your solution that yields a tuple containing multiple relations for each outer join? While I'm sure it can be done, it certainly wouldn't be intuitive for practitioners with even a cursory understanding of SQL.

A chain of multiple LEFT OUTER joins all resulting in a single NULL-laden¹ result is very much a SQLism. How likely is a practitioner with even a cursory understanding of a D likely to want to implement it?

--
[1] Not only NULL-laden, but ambiguous -- was the NULL in the original data, or a result of the outer join?

I prefer the flexibility of returning a single result set from a series of inner and outer joins--especially when the database is highly normalized. Sometimes it makes sense to restrict before a join, but other times it makes better sense to restrict after or to inject default values via COALESCE.

For example,

Warehouse {WhseKey, WhseID, ...}

Item {ItemKey, ItemID, ...}

Inventory {WhseKey, ItemKey, QtyOnSO, QtyOnPO, QtyReqForXO, QtyOnXO, ...}

InvtQOH {WhseKey, ItemKey, QtyPendDecrease, QtyPendIncrease, QtyOnHand} KEY {WhseKey, ItemKey}

Inventory only contains a tuple for a particular WhseKey, ItemKey combination when a particular item can be located in a particular warehouse.

InvtQOH only contains a tuple for a particular WhseKey, ItemKey combination when any of QtyPendDecrease, QtyPendIncrease or QtyOnHand is greater than zero.

Now find out how many of a particular Item are on-hand in each warehouse:

SELECT WhseId, ItemId, COALESCE(QtyOnHand, 0) AS QtyOnHand FROM Warehouse JOIN Item JOIN Inventory LEFT OUTER JOIN InvtQOH

[1] Does it matter? If it does, change the query to reflect that it does.

#13 · December 13, 2018, 11:21 am

Quote from Dave Voorhis on December 10, 2018, 11:08 am

Quote from Erwin on December 10, 2018, 11:04 am

Quote from Dave Voorhis on December 10, 2018, 10:56 am

[1] Not only NULL-laden, but ambiguous -- was the NULL in the original data, or a result of the outer join?

Surely, "NULL in the original data" cannot occur if we're doing this in Rel ?

Indeed, SQL NULL cannot occur in Rel, but the use of ZOO attributes as a pseudo-NULL brings back the same problems.

What I meant was, in the relational version, if the input has X:INT then the question of whether an empty X:REL{X:INT} arose from the join or from the input itself, will surely not arise.

Author of SIRA_PRISE

#14 · December 13, 2018, 11:25 am

Quote from Erwin on December 13, 2018, 11:21 am

Quote from Dave Voorhis on December 10, 2018, 11:08 am

Quote from Erwin on December 10, 2018, 11:04 am

Quote from Dave Voorhis on December 10, 2018, 10:56 am

[1] Not only NULL-laden, but ambiguous -- was the NULL in the original data, or a result of the outer join?

Surely, "NULL in the original data" cannot occur if we're doing this in Rel ?

Indeed, SQL NULL cannot occur in Rel, but the use of ZOO attributes as a pseudo-NULL brings back the same problems.

What I meant was, in the relational version, if the input has X:INT then the question of whether an empty X:REL{X:INT} arose from the join or from the input itself, will surely not arise.

In Rel, no. In SQL, certainly.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#15 · December 16, 2018, 5:27 am

Are you saying that REL doesn't support relation-valued attributes?

The logic of missing information is actually quite clear: (1) there is an n-ary predicate; (2) it has been asserted that there is an instantiation of that n-ary predicate for a particular key value; (3) it is not clear precisely which instantiation satisfies that n-ary predicate for that particular key value.

There is an existentially-quantified sentence that is true for each and every dependent component: there is one and only one x such that P(k, x). This is the case regardless of whether it is known what that dependent component actually is.

#16 · December 16, 2018, 10:37 am

Quote from Brian S on December 16, 2018, 5:27 am

Are you saying that REL doesn't support relation-valued attributes?

The logic of missing information is actually quite clear: (1) there is an n-ary predicate; (2) it has been asserted that there is an instantiation of that n-ary predicate for a particular key value; (3) it is not clear precisely which instantiation satisfies that n-ary predicate for that particular key value.

There is an existentially-quantified sentence that is true for each and every dependent component: there is one and only one x such that P(k, x). This is the case regardless of whether it is known what that dependent component actually is.

Rel supports relation-valued attributes. They should not be used to re-invent SQL NULL.

I reject the notion that "missing information" needs to be recorded in a relational database. What does need to be recorded, in some cases, is the entirely non-missing fact that a particular attribute value is not within its usual domain.

For example, we may need to record the fact that a temperature reading did not find a certain value in degrees Celsius, and instead found a broken thermometer or defective sensor. What we actually record in a relational database are readings about the real (or a fictional, but notionally veridical) world, the domains of which can be wholly described by appropriate types.

A truly "missing" reading is not recorded, because by definition, it is missing. If we can record it, then by definition it is not missing.

Thus, "missing" can only be the absence of any record (a notion covered under the Closed World Assumption). Any present record is not missing.

Indeed, a record with "missing" information is a contradiction.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#17 · December 18, 2018, 10:34 am

It's not only about recording, it's also about the ability to show query results to a user.

Users sometimes want/need to be shown the missingness of something as a relevant fact. Empty relations are the only relational construct I know that are capable of achieving that. Apart from constructs such as M() which may be accused of being invented just for the purpose and therefore can arguably be regarded an abuse of the type system. (Not that I would agree lightly/easily but that position certainly is defensible to some extent.)

Author of SIRA_PRISE

#18 · December 18, 2018, 10:41 am

Quote from Erwin on December 18, 2018, 10:34 am

It's not only about recording, it's also about the ability to show query results to a user.

Users sometimes want/need to be shown the missingness of something as a relevant fact. Empty relations are the only relational construct I know that are capable of achieving that. Apart from constructs such as M() which may be accused of being invented just for the purpose and therefore can arguably be regarded an abuse of the type system. (Not that I would agree lightly/easily but that position certainly is defensible to some extent.)

You mean query results that generate missingness, like LEFT JOIN?

Empty relations can show it, but so can an identified value of a tagged union, and so can distinct attributes (as is done with LEFT|RIGHT|FULL JOIN in Rel). They're all conceptually equivalent, differing only in ergonomics. There is no logical reason to prefer one over the other.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#19 · December 18, 2018, 12:03 pm

Quote from Dave Voorhis on December 18, 2018, 10:41 am

Quote from Erwin on December 18, 2018, 10:34 am

It's not only about recording, it's also about the ability to show query results to a user.

Users sometimes want/need to be shown the missingness of something as a relevant fact. Empty relations are the only relational construct I know that are capable of achieving that. Apart from constructs such as M() which may be accused of being invented just for the purpose and therefore can arguably be regarded an abuse of the type system. (Not that I would agree lightly/easily but that position certainly is defensible to some extent.)

You mean query results that generate missingness, like LEFT JOIN?

Empty relations can show it, but so can an identified value of a tagged union, and so can distinct attributes (as is done with LEFT|RIGHT|FULL JOIN in Rel). They're all conceptually equivalent, differing only in ergonomics. There is no logical reason to prefer one over the other.

I agree with those last two sentences but it seems to clash at least a little bit with earlier things you said like "relation-valued attributes. They should not be used to re-invent SQL NULL."

Author of SIRA_PRISE

#20 · December 18, 2018, 12:27 pm

Quote from Erwin on December 18, 2018, 12:03 pm

Quote from Dave Voorhis on December 18, 2018, 10:41 am

Quote from Erwin on December 18, 2018, 10:34 am

It's not only about recording, it's also about the ability to show query results to a user.

Users sometimes want/need to be shown the missingness of something as a relevant fact. Empty relations are the only relational construct I know that are capable of achieving that. Apart from constructs such as M() which may be accused of being invented just for the purpose and therefore can arguably be regarded an abuse of the type system. (Not that I would agree lightly/easily but that position certainly is defensible to some extent.)

You mean query results that generate missingness, like LEFT JOIN?

Empty relations can show it, but so can an identified value of a tagged union, and so can distinct attributes (as is done with LEFT|RIGHT|FULL JOIN in Rel). They're all conceptually equivalent, differing only in ergonomics. There is no logical reason to prefer one over the other.

I agree with those last two sentences but it seems to clash at least a little bit with earlier things you said like "relation-valued attributes. They should not be used to re-invent SQL NULL."

Indeed, relation-valued or ZOO attributes shouldn't be used to re-invent SQL NULL in general, and particularly not for record-keeping.

I have less objection to use of ZOO attributes to represent generated missing values as a result of a LEFT|RIGHT|FULL JOIN, though my preference is not to do so.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

The Forum for Discussion about The Third Manifesto and Related Matters