The Forum for Discussion about The Third Manifesto and Related Matters

Please or Register to create posts and topics.

Why do we need natural joins?

PreviousPage 5 of 5
Quote from Erwin on November 6, 2019, 1:31 pm
Quote from dandl on November 4, 2019, 6:23 am

A natural join on the D&D supplier database looks like this:

S join SP

The Codd theta join looks like this:

S[Sid=Sid]SP(Sid SNAME STATUS CITY Pid QTY) // projects away the duplicate Sid

The SQL version ...

 

Please, please, PLEASE.

Codd Theta-join also supported the 5 other comparison operators (<   >   <=   >=   <>) for matching tuples.  So Codd theta-join could not possibly have room for "projects away the duplicate Sid" if no information was to be lost.  If you are thinking of equi-join then please use the word equi-join.

Codd promoted (well, no, lip-serviced is a far better term for what he did) TTM's concept of attribute names [as the SOLE identifiers for attributes in a tuple/relation] because he probably had some awareness of the practical/pragmatic utility, but for the rest he was neck-deep into cartesian products with numbered attributes, at least in the earliest writings.  It rippled through to (not to say poisoned) SQL and since whatever comes out of the industry is neck-deep into SQL, whatever comes out of the industry suffers exactly as much from the exact same trouble.

In 1972 paper Codd uses the term theta join (or the symbolic equivalent) frequently, and equi-join just once. But don't read too much into it.

Whatever the kind of join it is trivially easy to construct examples in which the names given to input or output attributes/domains/columns/fields conflict.There are cases in which "projecting away" does the trick and natural join avoids that step, but it's only part of the problem.

Codd mostly avoided these by either using index numbers or by taking tiny examples in which conflicts did not arise. A real language using names would need to deal with that regardless, either by selective RENAME or explicit naming of inputs and outputs. SQL does that, but not well. Any alternative would have to do it better.

Andl - A New Database Language - andl.org
PreviousPage 5 of 5