The Forum for Discussion about The Third Manifesto and Related Matters


A question after re-reading Codd

My question relates to RELATIONAL COMPLETENESS OF DATA BASE SUBLANGUAGES Codd 1972. I'm hoping that erudite persons on this list can assist.

There are many links on the Web to examples of 'relational algebra', but those expressions differ considerably in both form and principle from Codd's (see the algebraic expressions given on p13, his numbering). Why is that, and where is the original source for the form now used?

[I would include examples, but in this editor it's just too hard. I can spell it out if it's not instantly obvious.]

I'm also curious whether there is any corresponding modern equivalent of Codd's relational calculus.

Andl - A New Database Language - andl.org

There is probably not even any such thing as "the" form now used, let alone a single original source for it. I've seen textbooks (or rather students' questions on SO copying fragments of their textbook) that included an RA notation for aggregations and the like. All it takes to come up with such a notation is understanding aggregations and some unused Greek letter.

Codd's writings use the domain-ordered approach "for notational and expository convenience" (the convenience was mostly his, we could say with the luxury of hindsight). Probably all of the later works recognized the relevance of role names (which Codd anticipated and are now known as attribute names) and the relevance of incorporating them in the RA notation for increased readability (that is, less decrypting to do for the reader), and effectively started doing that. It makes for a great difference in notation indeed, but the most likely answer to your "and where is ..." is that it just grew organically over time.

Author of SIRA_PRISE
Quote from dandl on October 26, 2019, 1:48 pm

My question relates to RELATIONAL COMPLETENESS OF DATA BASE SUBLANGUAGES Codd 1972. I'm hoping that erudite persons on this list can assist.

There are many links on the Web to examples of 'relational algebra', but those expressions differ considerably in both form and principle from Codd's (see the algebraic expressions given on p13, his numbering). Why is that, and where is the original source for the form now used?

As Erwin says, the 'original sources' are textbooks. There are also a few 'try it online' engines with some form of (what they call) RA. The Relax calculator is quite a popular one; I point it out because you can see the mess resulting from trying to compromise between an Algebra and SQL.

Nevertheless, I don't see anything "quite different" from Codd's: mostly we get Codd's operators plus extra ones, in the context of the 'named perspective' rather than domain-ordered, as Erwin says.

[I would include examples, but in this editor it's just too hard. I can spell it out if it's not instantly obvious.]

I'm also curious whether there is any corresponding modern equivalent of Codd's relational calculus.

I think there's an implementation of the Domain Relational Calculus. But really there are a gazillion query tools/languages. Most of the tools are grid-based (as in MS Access or Rel), and generate SQL or some form of RA expressions behind the scenes. I'm not seeing why these days you'd want a language/algebra/calculus when you can make queries graphically.

Quote from AntC on October 26, 2019, 10:14 pm

But really there are a gazillion query tools/languages. Most of the tools are grid-based (as in MS Access or Rel), and generate SQL or some form of RA expressions behind the scenes. I'm not seeing why these days you'd want a language/algebra/calculus when you can make queries graphically.

If I may clarify and expand:

  • Rel uses grids for data entry and to show query results. It generates Tutorial D behind the scenes.
  • MS Access uses grids for data entry and to show query results. It generates SQL behind the scenes.
  • Rel uses an "icons on strings" metaphor in a graphical query language (which I call Rev) to create queries, or you can write them as Tutorial D text.
  • MS Access uses its own dialect of Query By Example (see https://en.wikipedia.org/wiki/Query_by_Example) to create queries "visually" -- which is rather grid-like -- or you can write them as SQL text.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from dandl on October 26, 2019, 1:48 pm

My question relates to RELATIONAL COMPLETENESS OF DATA BASE SUBLANGUAGES Codd 1972. I'm hoping that erudite persons on this list can assist.

There are many links on the Web to examples of 'relational algebra', but those expressions differ considerably in both form and principle from Codd's (see the algebraic expressions given on p13, his numbering). Why is that, and where is the original source for the form now used?

[I would include examples, but in this editor it's just too hard. I can spell it out if it's not instantly obvious.]

I'm also curious whether there is any corresponding modern equivalent of Codd's relational calculus.

I don't see much (any?) difference in semantics, but there's certainly some difference in choice of symbols. Convention around use of symbols in every symbol-using subject tends to evolve one academic paper and/or popular textbook at a time, but in the early 1970s it was often heavily shaped by whatever you could bang out on a typewriter or (if you were suitably patient and capable) legibly draw freehand.

There's also tended to be at least an informal recognition that specific operator semantics are less important than the general idea, i.e., there are relations and operators that emit relations, and the important -- and thus fairly consistent across the literature -- operators are restrict, project and (natural) join, and maybe times, intersect, minus and union. Add others to taste, something something something, and rename, you'll need that too.

Though I wouldn't be surprised to find there's a specific source that nails down the "modern" set of operators and associated conventions. I don't know what it is.
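
For readers who want something concrete, the core operators named above (restrict, project, natural join, rename) can be sketched in a few lines of Python. This is only an illustration of the general idea that operators take relations and emit relations; the function names, the helper rel(), and the supplier/shipment data are all assumptions made up for the example, not notation from Codd or from any textbook.

```python
# A relation is modelled as a set of tuples; each tuple is a frozenset of
# (attribute name, value) pairs, so attribute order is irrelevant.

def rel(*rows):
    """Build a relation from dicts (hypothetical helper for this sketch)."""
    return {frozenset(row.items()) for row in rows}

def restrict(r, predicate):
    """Keep the tuples for which predicate(tuple-as-dict) is true."""
    return {t for t in r if predicate(dict(t))}

def project(r, names):
    """Keep only the named attributes; duplicates vanish because r is a set."""
    return {frozenset((k, v) for k, v in t if k in names) for t in r}

def rename(r, old, new):
    """Rename attribute old to new in every tuple."""
    return {frozenset((new if k == old else k, v) for k, v in t) for t in r}

def join(r, s):
    """Natural join: combine tuples that agree on all common attributes."""
    out = set()
    for a in r:
        da = dict(a)
        for b in s:
            db = dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out

# Made-up sample data.
S = rel({"sno": 1, "city": "London"}, {"sno": 2, "city": "Paris"})
SP = rel({"sno": 1, "pno": "P1"}, {"sno": 1, "pno": "P2"}, {"sno": 2, "pno": "P1"})

# Parts shipped by London suppliers: join, then restrict, then project.
london_parts = project(
    restrict(join(S, SP), lambda t: t["city"] == "London"), {"pno"}
)
```

Every function returns a relation, so the calls nest freely; that closure property is the part that stays constant across the literature whatever symbols are used.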

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Erwin on October 26, 2019, 6:27 pm

There is probably not even any such thing as "the" form now used, let alone a single original source for it. I've seen textbooks (or rather students' questions on SO copying fragments of their textbook) that included an RA notation for aggregations and the like. All it takes to come up with such a notation is understanding aggregations and some unused Greek letter.

Codd's writings use the domain-ordered approach "for notational and expository convenience" (the convenience was mostly his, we could say with the luxury of hindsight). Probably all of the later works recognized the relevance of role names (which Codd anticipated and are now known as attribute names) and the relevance of incorporating them in the RA notation for increased readability (that is, less decrypting to do for the reader), and effectively started doing that. It makes for a great difference in notation indeed, but the most likely answer to your "and where is ..." is that it just grew organically over time.

The differences are much deeper, if you go back and compare closely. In particular, Codd has:

  • optional use of either domain names or an ordinal index to refer to attributes, e.g. supplier[1] or supplier[A]
  • use of [] and (), for example selection supplier[1] or projection supplier(1,2,3)
  • nothing of the form attribute="literal"; restriction is based on [attribute=attribute]
  • a compact form of anonymous relation literal, {14}
  • theta joins, which mention attributes explicitly

I confess I quite like it. Why did no-one use it? Who changed it?

None of this looks remotely like https://en.wikipedia.org/wiki/Relational_algebra.
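
To make the contrast concrete, here is a small Python sketch of the same projection written two ways: once with positional (domain-ordered) references in the spirit of supplier[1], and once in the 'named perspective' used by most current presentations. The sample rows, the attribute names in HEADING, and both function names are assumptions invented for this illustration; nothing here reproduces Codd's actual syntax.

```python
# Domain-ordered view: a relation is a set of plain tuples, and columns
# are referred to by 1-based position.
supplier = {
    (1, "Smith", "London"),
    (2, "Jones", "Paris"),
    (3, "Blake", "Paris"),
}

def project_by_position(relation, positions):
    """Project onto 1-based column positions."""
    return {tuple(row[p - 1] for p in positions) for row in relation}

# Named perspective: each tuple is a set of (attribute name, value) pairs,
# so column order carries no meaning.
HEADING = ("sno", "sname", "city")  # assumed attribute names
supplier_named = {frozenset(zip(HEADING, row)) for row in supplier}

def project_by_name(relation, names):
    """Project onto a set of attribute names."""
    return {frozenset((k, v) for k, v in row if k in names) for row in relation}

cities_by_position = project_by_position(supplier, [3])
cities_by_name = project_by_name(supplier_named, {"city"})
```

Both calls yield the two cities; the only difference is whether the reader has to know that position 3 means the city.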

Andl - A New Database Language - andl.org
Quote from AntC on October 26, 2019, 10:14 pm

I think there's an implementation of the Domain Relational Calculus. But really there are a gazillion query tools/languages. Most of the tools are grid-based (as in MS Access or Rel), and generate SQL or some form of RA expressions behind the scenes. I'm not seeing why these days you'd want a language/algebra/calculus when you can make queries graphically.

If you want to just query existing data, a graphical tool (or even Excel) is better than typing in the SQL query.

But if you want to develop application programs with your own UI and database storage, a visual tool is of no earthly use. That's the niche.

Andl - A New Database Language - andl.org
Quote from dandl on October 27, 2019, 12:28 am
Quote from Erwin on October 26, 2019, 6:27 pm

There is probably not even any such thing as "the" form now used, let alone a single original source for it. I've seen textbooks (or rather students' questions on SO copying fragments of their textbook) that included an RA notation for aggregations and the like. All it takes to come up with such a notation is understanding aggregations and some unused Greek letter.

Codd's writings use the domain-ordered approach "for notational and expository convenience" (the convenience was mostly his, we could say with the luxury of hindsight). Probably all of the later works recognized the relevance of role names (which Codd anticipated and are now known as attribute names) and the relevance of incorporating them in the RA notation for increased readability (that is, less decrypting to do for the reader), and effectively started doing that. It makes for a great difference in notation indeed, but the most likely answer to your "and where is ..." is that it just grew organically over time.

The differences are much deeper, if you go back and compare closely. In particular, Codd has:

  • optional use of either domain names or an ordinal index to refer to attributes, e.g. supplier[1] or supplier[A]
  • use of [] and (), for example selection supplier[1] or projection supplier(1,2,3)
  • nothing of the form attribute="literal"; restriction is based on [attribute=attribute]
  • a compact form of anonymous relation literal, {14}
  • theta joins, which mention attributes explicitly

I confess I quite like it. Why did no-one use it? Who changed it?

a) Whether or not these are "deeper" differences depends on one's definition of "deeper". They look superficial and mainly syntactic to me.

b) If all restriction is [attribute = attribute] and nothing like attribute = "literal", that seems like an awkward limitation.

c) Theta join frequently pops up in the literature, particularly textbooks. Whether or not a system provides it seems to be a matter of personal (designer's) preference.

d) Referencing attributes by numeric index is almost universally considered bad. Fine in a client-side library, maybe, but in a query language? No.

Etc. In academic circles it's considered archaic and quaint.
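
On point (b), one way an attribute = attribute restriction can stand in for attribute = "literal" is to pair the relation with a singleton relation literal and compare columns. The Python sketch below shows that workaround under stated assumptions: the sample rows and the function names product, restrict_eq and project are invented here, and it is not a claim about how Codd's paper itself handles literals.

```python
# Relations are sets of plain tuples; columns are addressed by 1-based position.
supplier = {
    (1, "Smith", "London"),
    (2, "Jones", "Paris"),
    (3, "Blake", "Paris"),
}

def product(r, s):
    """Cartesian product: concatenate every row of r with every row of s."""
    return {a + b for a in r for b in s}

def restrict_eq(relation, i, j):
    """Keep rows whose i-th and j-th columns are equal (attribute = attribute)."""
    return {row for row in relation if row[i - 1] == row[j - 1]}

def project(relation, positions):
    """Project onto the given 1-based column positions."""
    return {tuple(row[p - 1] for p in positions) for row in relation}

# A singleton relation literal plays the role of the constant "London".
london = {("London",)}

# Column 3 is the supplier's city; column 4 is the literal appended by product.
in_london = project(restrict_eq(product(supplier, london), 3, 4), [1, 2, 3])
```

The result is the single London supplier, at the cost of a product and a projection that a direct city = "London" test would avoid, which is one reading of why the limitation feels awkward.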

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from dandl on October 27, 2019, 12:58 am
Quote from AntC on October 26, 2019, 10:14 pm

I think there's an implementation of the Domain Relational Calculus. But really there are a gazillion query tools/languages. Most of the tools are grid-based (as in MS Access or Rel), and generate SQL or some form of RA expressions behind the scenes. I'm not seeing why these days you'd want a language/algebra/calculus when you can make queries graphically.

If you want to just query existing data, a graphical tool (or even Excel) is better than typing in the SQL query.

But if you want to develop application programs with your own UI and database storage, a visual tool is of no earthly use. That's the niche.

Speaking from experience in creating complex queries in Rel's Tutorial D dialect for real-world purposes, being able to define queries using some visual (or other "generator") tool -- particularly with a means to easily create and examine the results of subexpressions -- is enormously helpful.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org