The Forum for Discussion about The Third Manifesto and Related Matters

Please or Register to create posts and topics.

Why 'Join'?

Why is (natural) join called 'Join'? Codd uses the term without explaining it [1970, wording much the same as 1969]. But Codd seems to use (Natural) 'Join' only for what today we'd call 'lossless Join'. Childs [1968] seems not to use the term -- although it's hard work wading through his paper.

Appendix A says its <AND> "is relational conjunction", "might logically be called conjoin". Etymologically, the Latin 'iugare' from which English 'junction' is cognate with French 'joindre' from which English 'join'.

When I started programming (late 1970's, on 'mid-range systems' with Indexed-Sequential files, when 'relational' was only available on 'big iron'), we didn't use the term. I.S. files used operations similar to Excel 'lookup' via (what amounted to) Foreign Keys -- say from Invoices to Customer Master. RPG was innovative with automatic lookup provided by the program cycle.

So (if anybody here can remember the mid-'70's), what would 'Join' mean to you before you learnt about the RM?

I presume Codd took the term from Maths. The most likely candidate would seem to be lattice theory. But ... latticists use 'meet' for what corresponds most closely to conjunction. That is, considering the simple examples of a lattice structure over the powerset of some set. Pic 1 here. And indeed Tropashko models (Natural) Join as lattice 'meet'; with his 'Inner Union' modelled as lattice 'join'. Really any other way of modelling would be very confusing in latticeland.

I did a lot of ISAM through the 70s and 80s, and it was very much about 'lookup'. Start with the customer, build a hierarchical tree by iterating over invoices and then over line items, pull in other files and fields as needed. Or start from a line item, lookup the invoice and the customer.

My early memories of 'relational' were about how you don't navigate from customer to invoice or from invoice to customer but you 'join' them into a single table with information drawn from both. It was very much an informal use of the term, nothing to do with Codd or RA or SQL and no explanation how it worked. That came much later.

Andl - A New Database Language - andl.org
Quote from AntC on May 12, 2021, 7:47 am

Etymologically, the Latin 'iugare' from which English 'junction' is cognate with French 'joindre' from which English 'join'.

I thought it was iungere (3d class, 1st 'e' mute) (to connect, to bind together), not iugare.  That would explain the 'n' in 'junction' :-).  It even explains the 'n' in 'join' :-) :-)  It also explains the explanation in my dead trees for the verb 'join' : to set or bring together, connect, combine.  I suspect he just wanted a one-syllable word (easier to remember even for the bloody foreigners, you know) for the notion I emphasized in bold.

Quote from Erwin on May 12, 2021, 5:43 pm
Quote from AntC on May 12, 2021, 7:47 am

Etymologically, the Latin 'iugare' from which English 'junction' is cognate with French 'joindre' from which English 'join'.

I thought it was iungere (3d class, 1st 'e' mute) (to connect, to bind together), not iugare.  That would explain the 'n' in 'junction' :-).  It even explains the 'n' in 'join' :-) :-)  It also explains the explanation in my dead trees for the verb 'join' : to set or bring together, connect, combine.  I suspect he just wanted a one-syllable word (easier to remember even for the bloody foreigners, you know) for the notion I emphasized in bold.

Ha, I got 'iugare' from conjunct, here. Anyhoo they're all cognate. Funny things dead languages.

But (arguably) Natural Join doesn't (only) combine: it also restricts, which is why S WHERE STATUS = 10; ≡ S <AND> REL{TUP{ STATUS 10}};. Unless (as I said) Codd intended Join to mean only 'lossless join'.

But (arguably) Natural Join doesn't (only) combine: it also restricts, which is why S WHERE STATUS = 10; ≡ S <AND> REL{TUP{ STATUS 10}};. Unless (as I said) Codd intended Join to mean only 'lossless join'.

No argument about it. Select/restrict is a join to a relcon that contains only true statements.

I believe you could replace all of SPJRUN by just semijoin/antijoin (or AND/NAND), and express semijoin in terms of antijoin. But I haven't worked through the formalism.

Don't forget that Codd described multiple kinds of join: theta, equi and natural, as well as 'join terms'. I think his use of 'join' was somewhat informal.

 

Andl - A New Database Language - andl.org
Quote from AntC on May 13, 2021, 12:45 am
Quote from Erwin on May 12, 2021, 5:43 pm
Quote from AntC on May 12, 2021, 7:47 am

Etymologically, the Latin 'iugare' from which English 'junction' is cognate with French 'joindre' from which English 'join'.

I thought it was iungere (3d class, 1st 'e' mute) (to connect, to bind together), not iugare.  That would explain the 'n' in 'junction' :-).  It even explains the 'n' in 'join' :-) :-)  It also explains the explanation in my dead trees for the verb 'join' : to set or bring together, connect, combine.  I suspect he just wanted a one-syllable word (easier to remember even for the bloody foreigners, you know) for the notion I emphasized in bold.

Ha, I got 'iugare' from conjunct, here. Anyhoo they're all cognate. Funny things dead languages.

But (arguably) Natural Join doesn't (only) combine: it also restricts, which is why S WHERE STATUS = 10; ≡ S <AND> REL{TUP{ STATUS 10}};. Unless (as I said) Codd intended Join to mean only 'lossless join'.

That same etymological dictionary says for 'join' : "from Latin iungere "to join together, unite, yoke,"" :-) :-)

It is probably too late, but if I were naming those operations today they would be "wedge" and "vee". Add a back slash before them, and you'll have a nice rendering of your formulas/queries.