# The intellectual curiosity of `NatJoin`, `InnerUnion`

Quote from AntC on May 16, 2021, 7:46 amQuote from Vadim on May 14, 2021, 6:38 pm... Regarding your critique of Anthony's motivation, you shall not dismiss curiosity factor.

Thanks Vadim ...

Looking at Appendix A's 'FORMAL DEFINITIONS' for

`<AND>`

, it's a strange beast. Judging by Stackoverflow, most SQL'ers just don't get it -- neither questioners nor answerers, even those with a considerable reputation.

`<AND>/ Natural Join`

is not the mathematicians' Cartesian Product, because of (what wikipedia calls) the "flattened" tuples in the result.- That is, by comparison it erases information: you can't tell from the result what were the headings of the operands.
- It's perhaps easier to see as (set)
`INTERSECTION`

, but that only works if the operands have same heading.- How to explain if the operands have partially-overlapping heading?
- We might say: project each operand onto the attributes in common; get the
`INTERSECTION`

of those (because that's easy to explain); extend each intersecting tuple back to that/those in the source operands.- That explanation becomes hard work if the operands have no attributes in common: project each operand on to
noattributes? What?- Perhaps that's why SQL explains
`FROM table1, table2, ...`

as more like a flattened Cartesian Product? Even if it risks duplicate columns.What's remarkable about the semantics for

`<AND>`

is that it can take any two relation values as operands, and is guaranteed to produce just another relation value as result (subject to the 'join-compatibility' of the headings).Curiosity upon curiosity, this

`<AND>`

is commutative, associative, idempotent! Relational Algebra can be like set operations.

`<AND>`

also (because it's defined across all relation values) provides a way to compare relations:

- if they have the same heading and one's body is a subset of the other's, we find
`bigger <AND> smaller`

yields`smaller`

-- and that works even if the`smaller`

is empty.- If they have same heading but only a partial overlap of tuples in their bodies, we find
`r1 <AND> r2`

yields a relation that is smaller than either -- it's body is exactly the`INTERSECT`

ion.- The real curiosity, is when we get to relation values with differing headings, 'though now we need a bit of sideways thinking:
- If one relation has a smaller heading, but at least all the tuples of the other considering only the attributes in common, we still get

`bigger-body-smaller-heading <AND> smaller-body-bigger-heading`

yields`smaller-body-...`

.- If they have a partial overlap of headings and/or a partial overlap of tuples (going by attributes in common) we still find
`r1 <AND> r2`

yields a relation that is smaller than either -- going by the smaller-body-bigger-heading ordering.So if

`<AND> / Natural Join`

is alleged to be defined across all relation values, a few of questions naturally arise:

- There must be a bigger-than-anything-else value(?)

Yes, going by our bigger-body-smaller-heading ordering, that'll be the value with minimal heading -- that is, no attributes -- and maximum body -- that is,`DEE`

.- There must be a smaller-than-anything-else value(?)

Yes, going by our smaller-body-bigger-heading ordering, that'll be the value with minimal body -- that is, empty -- and maximum heading -- that is, a rather ineffable 'all possible attributes' -- which might be a way to represent what's sometimes called the (or a) 'Universal Relation'. (Or perhaps the 'Dictionary' of a recent thread.)- Is there an operation that'll return the bigger of two relations, going by this bigger-body-smaller-heading ordering? Yes.
- If so, what does it produce a) when the bodies partially overlap (with same heading)?; b) when the headings partially overlap (with same tuples projected on the attributes in common)?; c) when there's partial overlap of headings and bodies?; d) in the degenerate cases of disjoint headings and/or disjoint bodies -- a kind of Cartesian what?
That there is a well-defined answer to each of a), b), c), d);

andthat it's provided by a single operator, is an intellectual surprise from which I still haven't recovered. Let's go through the cases, because although this seems weird, we'll bump into some familiar operations:

- Same heading, and one's body is a subset of the other's, we find
`bigger <?> smaller`

yields`bigger`

-- and that works even if the`smaller`

is empty.- a) If they have same heading but only a partial overlap of tuples in their bodies, we find
`r1 <?> r2`

yields a relation that is bigger than either -- it's body is exactly the union. I'll say again: it is exactly Codd's familiar`UNION`

.- b) If one relation has a smaller heading, but at least all the tuples of the other considering only the attributes in common, we still get

`bigger-body-smaller-heading <?> smaller-body-bigger-heading`

yields`bigger-body-...`

.- c) If they have a partial overlap of headings and/or a partial overlap of tuples (going by attributes in common) we still find
`r1 <?> r2`

yields a relation that is bigger than either -- going by the bigger-body-smaller-heading ordering.- c) supplemental: If one has a strictly smaller heading, and empty body (so also strictly smaller), we still conform to the bigger-body-smaller-heading ordering: take the smaller heading, project the bigger's body on to it. This is exactly Codd's/
Tutorial D's projection operation -- taking the attribute-comma-list from the smaller heading.

Did I say "strictly smaller heading"? Of course I meant strictly bigger in the ordering, except that a strictly smaller (empty) body means not directly comparable.- d) If the headings are disjoint, we still conform to the bigger-body-smaller-heading ordering: take the attributes in common (empty heading), project both bodies on it, form the
`UNION`

. That'll yield`DUM`

just in case both bodies are empty, otherwise`DEE`

. That the bodies are disjoint is not a problem. This is Boolean algebra's familiar`OR`

operation, treating`DEE/DUM`

as truth values.What to call this operation

`<?>`

? Appendix A takes`Natural Join`

as a generalisation of`<AND>`

. Since we've seen`<?>`

has behaviour that's a generalisation of`UNION`

, we could call it`<UNION>`

-- which is what ISBL did, or SQL has`UNION CORRESPONDING`

. We've seen its behaviour also is a generalisation of Boolean disjunction -- but`<OR>`

is already taken -- and there's grounds to allow that's just as valid as a generalisation.So I use

`InnerUnion`

; with the`Inner`

denoting to project each operand on attributes in common;`Union`

ing the result. Just asA's`<OR>`

can implementTutorial D's`UNION`

of relations with same heading, so can`InnerUnion`

. Furthermore, it can implementTutorial D's projection, taking a empty relation to denote the heading of the desired result.Perhaps not so curious by now, this

`InnerUnion`

is commutative, associative, idempotent. Its result reverses the ordering as compared with`NaturalJoin / <AND>`

, so the technical characterisation is as a lattice 'dual' operation. (But you didn't need to know anything about lattices.)

AntC

Quote from Vadim on May 14, 2021, 6:38 pm... Regarding your critique of Anthony's motivation, you shall not dismiss curiosity factor.

Thanks Vadim ...

Looking at Appendix A's 'FORMAL DEFINITIONS' for `<AND>`

, it's a strange beast. Judging by Stackoverflow, most SQL'ers just don't get it -- neither questioners nor answerers, even those with a considerable reputation.

`<AND>/ Natural Join`

is not the mathematicians' Cartesian Product, because of (what wikipedia calls) the "flattened" tuples in the result.- That is, by comparison it erases information: you can't tell from the result what were the headings of the operands.
- It's perhaps easier to see as (set)
`INTERSECTION`

, but that only works if the operands have same heading. - How to explain if the operands have partially-overlapping heading?
- We might say: project each operand onto the attributes in common; get the
`INTERSECTION`

of those (because that's easy to explain); extend each intersecting tuple back to that/those in the source operands. - That explanation becomes hard work if the operands have no attributes in common: project each operand on to
*no*attributes? What? - Perhaps that's why SQL explains
`FROM table1, table2, ...`

as more like a flattened Cartesian Product? Even if it risks duplicate columns.

What's remarkable about the semantics for `<AND>`

is that it can take any two relation values as operands, and is guaranteed to produce just another relation value as result (subject to the 'join-compatibility' of the headings).

Curiosity upon curiosity, this `<AND>`

is commutative, associative, idempotent! Relational Algebra can be like set operations.

`<AND>`

also (because it's defined across all relation values) provides a way to compare relations:

- if they have the same heading and one's body is a subset of the other's, we find
`bigger <AND> smaller`

yields`smaller`

-- and that works even if the`smaller`

is empty. - If they have same heading but only a partial overlap of tuples in their bodies, we find
`r1 <AND> r2`

yields a relation that is smaller than either -- it's body is exactly the`INTERSECT`

ion. - The real curiosity, is when we get to relation values with differing headings, 'though now we need a bit of sideways thinking:
- If one relation has a smaller heading, but at least all the tuples of the other considering only the attributes in common, we still get

`bigger-body-smaller-heading <AND> smaller-body-bigger-heading`

yields`smaller-body-...`

. - If they have a partial overlap of headings and/or a partial overlap of tuples (going by attributes in common) we still find
`r1 <AND> r2`

yields a relation that is smaller than either -- going by the smaller-body-bigger-heading ordering.

So if `<AND> / Natural Join`

is alleged to be defined across all relation values, a few of questions naturally arise:

- There must be a bigger-than-anything-else value(?)

Yes, going by our bigger-body-smaller-heading ordering, that'll be the value with minimal heading -- that is, no attributes -- and maximum body -- that is,`DEE`

. - There must be a smaller-than-anything-else value(?)

Yes, going by our smaller-body-bigger-heading ordering, that'll be the value with minimal body -- that is, empty -- and maximum heading -- that is, a rather ineffable 'all possible attributes' -- which might be a way to represent what's sometimes called the (or a) 'Universal Relation'. (Or perhaps the 'Dictionary' of a recent thread.) - Is there an operation that'll return the bigger of two relations, going by this bigger-body-smaller-heading ordering? Yes.
- If so, what does it produce a) when the bodies partially overlap (with same heading)?; b) when the headings partially overlap (with same tuples projected on the attributes in common)?; c) when there's partial overlap of headings and bodies?; d) in the degenerate cases of disjoint headings and/or disjoint bodies -- a kind of Cartesian what?

That there is a well-defined answer to each of a), b), c), d); *and* that it's provided by a single operator, is an intellectual surprise from which I still haven't recovered. Let's go through the cases, because although this seems weird, we'll bump into some familiar operations:

- Same heading, and one's body is a subset of the other's, we find
`bigger <?> smaller`

yields`bigger`

-- and that works even if the`smaller`

is empty. - a) If they have same heading but only a partial overlap of tuples in their bodies, we find
`r1 <?> r2`

yields a relation that is bigger than either -- it's body is exactly the union. I'll say again: it is exactly Codd's familiar`UNION`

. - b) If one relation has a smaller heading, but at least all the tuples of the other considering only the attributes in common, we still get

`bigger-body-smaller-heading <?> smaller-body-bigger-heading`

yields`bigger-body-...`

. - c) If they have a partial overlap of headings and/or a partial overlap of tuples (going by attributes in common) we still find
`r1 <?> r2`

yields a relation that is bigger than either -- going by the bigger-body-smaller-heading ordering. - c) supplemental: If one has a strictly smaller heading, and empty body (so also strictly smaller), we still conform to the bigger-body-smaller-heading ordering: take the smaller heading, project the bigger's body on to it. This is exactly Codd's/
**Tutorial D**'s projection operation -- taking the attribute-comma-list from the smaller heading.

Did I say "strictly smaller heading"? Of course I meant strictly bigger in the ordering, except that a strictly smaller (empty) body means not directly comparable. - d) If the headings are disjoint, we still conform to the bigger-body-smaller-heading ordering: take the attributes in common (empty heading), project both bodies on it, form the
`UNION`

. That'll yield`DUM`

just in case both bodies are empty, otherwise`DEE`

. That the bodies are disjoint is not a problem. This is Boolean algebra's familiar`OR`

operation, treating`DEE/DUM`

as truth values.

What to call this operation `<?>`

? Appendix A takes `Natural Join`

as a generalisation of `<AND>`

. Since we've seen `<?>`

has behaviour that's a generalisation of `UNION`

, we could call it `<UNION>`

-- which is what ISBL did, or SQL has `UNION CORRESPONDING`

. We've seen its behaviour also is a generalisation of Boolean disjunction -- but `<OR>`

is already taken -- and there's grounds to allow that's just as valid as a generalisation.

So I use `InnerUnion`

; with the `Inner`

denoting to project each operand on attributes in common; `Union`

ing the result. Just as **A**'s `<OR>`

can implement **Tutorial D**'s `UNION`

of relations with same heading, so can `InnerUnion`

. Furthermore, it can implement **Tutorial D**'s projection, taking a empty relation to denote the heading of the desired result.

Perhaps not so curious by now, this `InnerUnion`

is commutative, associative, idempotent. Its result reverses the ordering as compared with `NaturalJoin / <AND>`

, so the technical characterisation is as a lattice 'dual' operation. (But you didn't need to know anything about lattices.)

AntC