ANNOUNCE: Rel version 3.013 now available!

#41 · February 25, 2019, 9:48 am

Quote from AntC on February 25, 2019, 9:48 am

Quote from David Livingstone on February 22, 2019, 12:27 pm

Quote from Erwin on February 19, 2019, 1:56 pm

As for "surprisingly more than I expected"

The first example of an outer join I came across was before Outer Joins were invented.

Hugh has told us about BS12's form of Outer Join (perhaps it wasn't called "Outer Join" at the time). I think you can't be talking about that long ago.

Does anyone know why the bizarre design decision to ENFORCE NULLs in SQL outer joins was taken ?

Hmm. Given that all of SQL is bizarre, especially regarding its treatment of nulls, I don't think that nulls in the result of an Outer Join is any more bizarre.

SQL was totally locked into the SELECT ... FROM ... WHERE ... template for queries. At the time you couldn't put sub-Selects in the FROM; you could only put a commalist of tables, to make a Cartesian product. (I can't remember whether INNER JOIN ... ON ... was before or after Outer Joins.) That template is why there's the dratted GROUP BY ... HAVING ... nonsense.

I can appreciate that a language that incorporates NULLs should allow the possibility of NULLs in outer join results. But to enforce them to the exclusion of a real value ??!!!!

Where are you going to get the 'real value' from? And what syntax are you going to use to do that? You could have an option saying: use default/sentinel values for the type of the result column. But could the query engine rely on there being defaults for every column?

You could supply a tuple literal containing the 'real values', except: SQL does not have tuple literals; the compiler would have to match columns from the tuple to columns from the Outer Join result, allowing for duplicate or anonymous column names and/or the foibles of SQL's column position in query results.

You could refer to another table in the database holding those defaults. (IIRC that's what BS12 did.) You'd still have all the problems with column name/position. You'd also have to throw an error if the table had more than (or less than) a single row. And how would naming that table fit into the commalist syntax for Outer Join? Note you can Outer Join to several tables in a single FROM clause; they can have intricately interlocked sets of columns in common; you'd need a separate supply of default values for each grouping of the possibly-'missing' columns.

Perhaps this is the origin of some people having concerns about outer joins if they think (understandably) that NULLs are an integral part of them, when they're not.

The idea of 'faking' an Outer Join as a UNION of the tuples that join with the tuples that don't match joined: to some default values (as does BS12 or SIRA_PRISE), or to some alternative relation (as your student did), arguably isn't an Outer Join as such, because it's expressible using operations readily available in a D. Then my take is that involving NULLs is an "integral part" of Outer Joins.

Then what's been released in Rel is not an Outer Join in any sense that any database professional would recognise. (As I said up-thread.) And to call it such in an allegedly Tutorial language is downright misleading for learners.

The alternative I proposed, using RVAs with an empty RVA in case of not matching, I think would be closer to industry norms, but I wouldn't die in a ditch over calling it an Outer Join.

Quote from David Livingstone on February 22, 2019, 12:27 pm

Quote from Erwin on February 19, 2019, 1:56 pm

As for "surprisingly more than I expected"

The first example of an outer join I came across was before Outer Joins were invented.

Hugh has told us about BS12's form of Outer Join (perhaps it wasn't called "Outer Join" at the time). I think you can't be talking about that long ago.

Does anyone know why the bizarre design decision to ENFORCE NULLs in SQL outer joins was taken ?

Hmm. Given that all of SQL is bizarre, especially regarding its treatment of nulls, I don't think that nulls in the result of an Outer Join is any more bizarre.

SQL was totally locked into the SELECT ... FROM ... WHERE ... template for queries. At the time you couldn't put sub-Selects in the FROM; you could only put a commalist of tables, to make a Cartesian product. (I can't remember whether INNER JOIN ... ON ... was before or after Outer Joins.) That template is why there's the dratted GROUP BY ... HAVING ... nonsense.

I can appreciate that a language that incorporates NULLs should allow the possibility of NULLs in outer join results. But to enforce them to the exclusion of a real value ??!!!!

Where are you going to get the 'real value' from? And what syntax are you going to use to do that? You could have an option saying: use default/sentinel values for the type of the result column. But could the query engine rely on there being defaults for every column?

You could supply a tuple literal containing the 'real values', except: SQL does not have tuple literals; the compiler would have to match columns from the tuple to columns from the Outer Join result, allowing for duplicate or anonymous column names and/or the foibles of SQL's column position in query results.

You could refer to another table in the database holding those defaults. (IIRC that's what BS12 did.) You'd still have all the problems with column name/position. You'd also have to throw an error if the table had more than (or less than) a single row. And how would naming that table fit into the commalist syntax for Outer Join? Note you can Outer Join to several tables in a single FROM clause; they can have intricately interlocked sets of columns in common; you'd need a separate supply of default values for each grouping of the possibly-'missing' columns.

Perhaps this is the origin of some people having concerns about outer joins if they think (understandably) that NULLs are an integral part of them, when they're not.

The idea of 'faking' an Outer Join as a UNION of the tuples that join with the tuples that don't match joined: to some default values (as does BS12 or SIRA_PRISE), or to some alternative relation (as your student did), arguably isn't an Outer Join as such, because it's expressible using operations readily available in a D. Then my take is that involving NULLs is an "integral part" of Outer Joins.

Then what's been released in Rel is not an Outer Join in any sense that any database professional would recognise. (As I said up-thread.) And to call it such in an allegedly Tutorial language is downright misleading for learners.

The alternative I proposed, using RVAs with an empty RVA in case of not matching, I think would be closer to industry norms, but I wouldn't die in a ditch over calling it an Outer Join.

#42 · February 25, 2019, 10:18 am

Quote from AntC on February 25, 2019, 9:48 am

Then what's been released in Rel is not an Outer Join in any sense that any database professional would recognise. (As I said up-thread.) And to call it such in an allegedly Tutorial language is downright misleading for learners.

The DBA I showed it to recognised it immediately as very effectively meeting the intent of an Outer Join, and being practical and easy-to-use in the cases where an Outer Join seems warranted. Thus, I think not only is it an effective solution to what is occasionally pointed to being an omission in Tutorial D -- and therefore a limitation compared to SQL -- it does so in a manner that is both practically useful and in keeping with the general essence of Tutorial D.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#43 · February 25, 2019, 10:01 pm

"arguably isn't an Outer Join as such, because it's expressible using operations readily available"

Hmmmmmmmmmmmm. And an SQL FULL OUTER JOIN isn't expressible as a DISTINCT UNION of a LEFT OUTER JOIN and a RIGHT OUTER JOIN ?

I didn't bother to figure it out in detail, but if it is, then by the same token even an SQL FULL OUTER JOIN "isn't an outer join as such". And then there goes your argument.

Besides I think SQL's LEFT/RIGHT OUTER JOIN are just SIRA_PRISE's LEFTJOIN with the "fill-in-the-gaps-value" being SQL's NULL, which truly relational systems don't have the, eurhm, "luxury" of being able to resort to. So if SIRA_PRISE is (and BS12 was) "faking", then it seems to me the very same must be said about SQL. Unless you think there is more to SQL NULL than being just another value, of course, and this even from SQL's viewpoint.

Author of SIRA_PRISE

#44 · February 26, 2019, 1:06 am

Quote from AntC on February 26, 2019, 1:06 am

Quote from Erwin on February 25, 2019, 10:01 pm

"arguably isn't an Outer Join as such, because it's expressible using operations readily available"

Hmmmmmmmmmmmm. And an SQL FULL OUTER JOIN isn't expressible as a DISTINCT UNION of a LEFT OUTER JOIN and a RIGHT OUTER JOIN ?

You elided what you quoted from me. I said " operations readily available in a D." LEFT/RIGHT OUTER JOIN are not readily available in a D: they are available in SQL. (And what's just gone into Rel does not produce a single relation as does SQL.)

I didn't bother to figure it out in detail, but if it is, then by the same token even an SQL FULL OUTER JOIN "isn't an outer join as such". And then there goes your argument.

This is easily resolved, you don't need to get Hmmmm-y: go ahead and write some SQL to produce FULL OUTER JOIN using SQL operations but avoiding Outer Joins. (See my attempt below.) Now: were all those operations available in SQL at the time David B-L is talking about?

Besides I think SQL's LEFT/RIGHT OUTER JOIN are just SIRA_PRISE's LEFTJOIN with the "fill-in-the-gaps-value" being SQL's NULL,

I hardly need to point out that SQL's Null is not a " value", it's a 'mark' (not that saying that helps much). So if you were trying to express in SQL a LEFT/RIGHT OUTER JOIN using other operations, you could do the UNION and its JOIN operand OK. But what would you say for the operand of the UNION with the non-matching rows?

which truly relational systems don't have the, eurhm, "luxury" of being able to resort to. So if SIRA_PRISE is (and BS12 was) "faking", then it seems to me the very same must be said about SQL. Unless you think there is more to SQL NULL than being just another value, of course, and this even from SQL's viewpoint.

Again: SQL's Null is not " just another value". What there is to it I don't think I'd want to call "more" than a value.

IIRC, the SQL of the vintage David B-L is talking about you couldn't explicitly set a column as Null: the query result had a null-marked column as some sort of by-default or by-accident or by-we-can't-think-of-anything-to-put-here mechanism, as a consequence of not directly mentioning that column in the SELECT list, but indirectly as *. My claim is that a Null-marked field in a query output is hard to do unless:

It's just a null-marked field from a base table; or

It arises from an explicit Outer Join; or

It's a result from an aggregate operator trying to aggregate over an empty sub-query result -- and even then I think COUNT( ) or SUM( ) will return zero rather than Null-marking.
And again I don't think you could do sub-queries like that at that vintage of SQL (it would have to be a correlated sub-query).

Here's an unpleasant experience:

( SELECT * FROM S NATURAL JOIN SP )

UNION

( SELECT S.*, MAX(SELECT P# FROM (SP WHERE SP.S# = SP.S#) AS TEETH_GNASHER ) AS P#,

MAX(SELECT QTY FROM (SP WHERE SP.S# = SP.S#) AS TEETH_GNASHER2 ) AS QTY

FROM S

WHERE NOT EXISTS (SELECT * FROM SP WHERE S.S# = SP.S#) );

Does that MAX( ) return Null-marking or blank/zero? Is it the same behaviour for a CHAR field vs an INT? I just don't know, because I'd always code that as an Outer Join (whilst holding my nose, of course).

Quote from Erwin on February 25, 2019, 10:01 pm

"arguably isn't an Outer Join as such, because it's expressible using operations readily available"

Hmmmmmmmmmmmm. And an SQL FULL OUTER JOIN isn't expressible as a DISTINCT UNION of a LEFT OUTER JOIN and a RIGHT OUTER JOIN ?

You elided what you quoted from me. I said " operations readily available in a D." LEFT/RIGHT OUTER JOIN are not readily available in a D: they are available in SQL. (And what's just gone into Rel does not produce a single relation as does SQL.)

I didn't bother to figure it out in detail, but if it is, then by the same token even an SQL FULL OUTER JOIN "isn't an outer join as such". And then there goes your argument.

This is easily resolved, you don't need to get Hmmmm-y: go ahead and write some SQL to produce FULL OUTER JOIN using SQL operations but avoiding Outer Joins. (See my attempt below.) Now: were all those operations available in SQL at the time David B-L is talking about?

Besides I think SQL's LEFT/RIGHT OUTER JOIN are just SIRA_PRISE's LEFTJOIN with the "fill-in-the-gaps-value" being SQL's NULL,

I hardly need to point out that SQL's Null is not a " value", it's a 'mark' (not that saying that helps much). So if you were trying to express in SQL a LEFT/RIGHT OUTER JOIN using other operations, you could do the UNION and its JOIN operand OK. But what would you say for the operand of the UNION with the non-matching rows?

which truly relational systems don't have the, eurhm, "luxury" of being able to resort to. So if SIRA_PRISE is (and BS12 was) "faking", then it seems to me the very same must be said about SQL. Unless you think there is more to SQL NULL than being just another value, of course, and this even from SQL's viewpoint.

Again: SQL's Null is not " just another value". What there is to it I don't think I'd want to call "more" than a value.

IIRC, the SQL of the vintage David B-L is talking about you couldn't explicitly set a column as Null: the query result had a null-marked column as some sort of by-default or by-accident or by-we-can't-think-of-anything-to-put-here mechanism, as a consequence of not directly mentioning that column in the SELECT list, but indirectly as *. My claim is that a Null-marked field in a query output is hard to do unless:

It's just a null-marked field from a base table; or
It arises from an explicit Outer Join; or
It's a result from an aggregate operator trying to aggregate over an empty sub-query result -- and even then I think COUNT( ) or SUM( ) will return zero rather than Null-marking.
And again I don't think you could do sub-queries like that at that vintage of SQL (it would have to be a correlated sub-query).

Here's an unpleasant experience:

( SELECT * FROM S NATURAL JOIN SP )

UNION

( SELECT S.*, MAX(SELECT P# FROM (SP WHERE SP.S# = SP.S#) AS TEETH_GNASHER ) AS P#,

MAX(SELECT QTY FROM (SP WHERE SP.S# = SP.S#) AS TEETH_GNASHER2 ) AS QTY

FROM S

WHERE NOT EXISTS (SELECT * FROM SP WHERE S.S# = SP.S#) );

Does that MAX( ) return Null-marking or blank/zero? Is it the same behaviour for a CHAR field vs an INT? I just don't know, because I'd always code that as an Outer Join (whilst holding my nose, of course).

The Forum for Discussion about The Third Manifesto and Related Matters