Why do we need natural joins?
Quote from dandl on November 6, 2019, 10:42 pmQuote from Erwin on November 6, 2019, 1:31 pmQuote from dandl on November 4, 2019, 6:23 amA natural join on the D&D supplier database looks like this:
S join SP
The Codd theta join looks like this:
S[Sid=Sid]SP(Sid SNAME STATUS CITY Pid QTY) // projects away the duplicate Sid
The SQL version ...
Please, please, PLEASE.
Codd Theta-join also supported the 5 other comparison operators (< > <= >= <>) for matching tuples. So Codd theta-join could not possibly have room for "projects away the duplicate Sid" if no information was to be lost. If you are thinking of equi-join then please use the word equi-join.
Codd promoted (well, no, lip-serviced is a far better term for what he did) TTM's concept of attribute names [as the SOLE identifiers for attributes in a tuple/relation] because he probably had some awareness of the practical/pragmatic utility, but for the rest he was neck-deep into cartesian products with numbered attributes, at least in the earliest writings. It rippled through to (not to say poisoned) SQL and since whatever comes out of the industry is neck-deep into SQL, whatever comes out of the industry suffers exactly as much from the exact same trouble.
In 1972 paper Codd uses the term theta join (or the symbolic equivalent) frequently, and equi-join just once. But don't read too much into it.
Whatever the kind of join it is trivially easy to construct examples in which the names given to input or output attributes/domains/columns/fields conflict.There are cases in which "projecting away" does the trick and natural join avoids that step, but it's only part of the problem.
Codd mostly avoided these by either using index numbers or by taking tiny examples in which conflicts did not arise. A real language using names would need to deal with that regardless, either by selective RENAME or explicit naming of inputs and outputs. SQL does that, but not well. Any alternative would have to do it better.
Quote from Erwin on November 6, 2019, 1:31 pmQuote from dandl on November 4, 2019, 6:23 amA natural join on the D&D supplier database looks like this:
S join SP
The Codd theta join looks like this:
S[Sid=Sid]SP(Sid SNAME STATUS CITY Pid QTY) // projects away the duplicate Sid
The SQL version ...
Please, please, PLEASE.
Codd Theta-join also supported the 5 other comparison operators (< > <= >= <>) for matching tuples. So Codd theta-join could not possibly have room for "projects away the duplicate Sid" if no information was to be lost. If you are thinking of equi-join then please use the word equi-join.
Codd promoted (well, no, lip-serviced is a far better term for what he did) TTM's concept of attribute names [as the SOLE identifiers for attributes in a tuple/relation] because he probably had some awareness of the practical/pragmatic utility, but for the rest he was neck-deep into cartesian products with numbered attributes, at least in the earliest writings. It rippled through to (not to say poisoned) SQL and since whatever comes out of the industry is neck-deep into SQL, whatever comes out of the industry suffers exactly as much from the exact same trouble.
In 1972 paper Codd uses the term theta join (or the symbolic equivalent) frequently, and equi-join just once. But don't read too much into it.
Whatever the kind of join it is trivially easy to construct examples in which the names given to input or output attributes/domains/columns/fields conflict.There are cases in which "projecting away" does the trick and natural join avoids that step, but it's only part of the problem.
Codd mostly avoided these by either using index numbers or by taking tiny examples in which conflicts did not arise. A real language using names would need to deal with that regardless, either by selective RENAME or explicit naming of inputs and outputs. SQL does that, but not well. Any alternative would have to do it better.