The Forum for Discussion about The Third Manifesto and Related Matters


Codd 1970 'domain' does not mean Date 2016 'type' [was: burble about Date's IM]

Quote from Brian S on December 5, 2019, 4:07 am

You miss the point.

Not at all. The point is that we need some measure of agreement on terminology before we can usefully discuss whatever point you're making. You said 'common sense' and that was no help at all. I refer to Wikipedia as a potential source of common ground, but if you reject that, what starting point do we use?

The Universe of Discourse is not just a convenient convention: it is what predicates range over in first-order predicate logic.  The Universe of ZFC set theory includes only sets.  Nothing that can be perceived is in the Universe of ZFC set theory, because perception involves the five senses.  The Universe of Discourse for a relational database is not merely the class of all sets.  Nor is it the union of all declared or system-defined types.  No.  The Universe for a relational database also includes objects which may be located in time or in time and space.  How do I know this?  Because relation variables can contain different relation values at different times, and as a consequence, the propositions expressed by instantiating relvar predicates must reference, however indirectly, that which is located in time or in time and space.  Bottom line: if the universe for a relational database only included objects which are independent of time, then the contents of the database could not change over time.

The Universe of Discourse: wiki. It's a very broad term with a range of meanings. We can use yours, but we must bear in mind that's a choice, a convention if you like.

Yes, I would agree that a time-varying relation necessarily asserts facts that are only true with reference to time. That does not automatically require that those facts deal with concrete entities or entities located in space. Businesses tend to be pre-occupied with physical objects, but I could use a relational database to track the evolution of a mathematical model or of a set of scientific observations with no link whatsoever to a physical location.

Side-note: the human brain almost certainly has concepts of time and space deeply embedded in the core algorithms of the neocortex. We don't have to impose those notions on everything we study.

Instead of looking up wiki, use your own brain: being abstract denotes not possibly located in time or in time and space; being concrete denotes possibly located in time or in time and space.  In other words, if it can be located in time or in time and space, then it is not abstract.  Period.  The terms "abstract" and "concrete" have many other meanings, and the wiki mentions some of them, but in the context of relational database theory, "abstract" means precisely, "not possibly located in time or in time and space."  Nothing more.  To say that something is located in time is to say that that something's existence is contingent on that location in time.  A human being exists from the moment sperm and egg combine.  Its existence is therefore contingent on the moment of conception.  It did not exist prior to that moment.  Everything that has a location in time exists contingent on not only the moment it came into existence, but every moment of its lifetime.  And to be clear: a moment is an interval of time that has no internal structure--that is, it cannot be subdivided into smaller intervals.

I hope that clarifies.

It clarifies what you should have presented as your own set of definitions, not anything that is inescapably true. I am perfectly happy to accept terms like UoD, abstract, concrete given specific meanings for the purposes of a specific discussion, just not as absolutes. Then we see how well they support your following arguments.

But you're back on shaky ground with 'moments' and trying to link back to the physical world. Physical objects do not come into existence, nor do they disappear. Relations do not record moments, however that may be defined. Relations in a database record facts that have been asserted by a competent authority as at some point in time. You can assume that the fact was true for some earlier time and will become false at some future time, but that is not recorded. The assertion of a fact bears no necessary relationship to any underlying physical reality. It is the assertion of the fact that is located in time, nothing else.

Andl - A New Database Language - andl.org
Quote from Brian S on December 5, 2019, 4:07 am
Quote from dandl on November 16, 2019, 7:42 am
Quote from Brian S on November 16, 2019, 3:47 am
Quote from dandl on November 3, 2019, 2:48 am

I think what you're saying has a reasonable basis, that the principle is worthy but the naming is at issue.

It's clearly an aspect of philosophy, which means that it must surely have been discussed endlessly since the time of the Greeks (without the need for any conclusion, of course). Do you have a reference?

A reference?  That the physical universe came into being at the beginning of time is what I was taught in grade school back in the '70s.  It's what's called "common knowledge."  But we're not here to contemplate infinity.  What's relevant here is that that which is being discussed is by definition part of the Universe of Discourse, and that the physical universe is not independent of time.  Which means that the physical universe, and everything that can be perceived therein is by definition concrete.

[This post seems out of context -- I'm not sure what preceded it.]

The curious thing about common knowledge is that it too is almost impossible to pin down. If we belong to a community there may be knowledge commonly held by that community, and much of it may be dead wrong. What keeps a bicycle upright?

If we belong to different communities we may each quote common knowledge and find we disagree at a basic level. Is it polite to slurp your soup?

No, I wasn't taught that in grade school (whatever period that may correspond to). I think most people would think the universe started at a point in time (or was created), and would find the idea of a 'beginning of time' hard to accept. I learned this particular concept as one accompanying a particular cosmogonic view (Big Bang), so I do know of it, but I view it as conjecture, not fact.

I accept the Universe of Discourse as a convenient convention: the complete range of objects, events, attributes, relations, ideas, etc, that are expressed, assumed, or implied in a discussion. I accept the idea of the Physical Universe, and the idea of Time. I do not accept the particular relationship you construct between them, or the logical conclusion you draw. It just ain't that simple!

I was easily able to find philosophical discussion on the distinction between abstract and concrete, e.g. wiki.

There are parallels with what you say, but also differences. Is a concrete example itself concrete or abstract? Is the Theory of Relativity concrete or abstract? It is certainly fixed in time, if not space. Is a subatomic particle concrete? There is lots more like this here.

So I accept your proposal as a convenient way of dealing with certain kinds of data analysis, but not more.

 

You miss the point.  The Universe of Discourse is not just a convenient convention: it is what predicates range over in first-order predicate logic.

Brian, you're back on your hobby-horse. You've said this many times; it has been disputed many times, so I'll be brief.

Within Relational Theory we adopt multi-sorted logic. So (if you insist on appealing to 'U of D') there are multiple, disjoint Universes of Discourse; and applying FOPL and set theory needs some care, because not many texts consider the ramifications of multi-sorted.

The Universe of ZFC set theory includes only sets.  Nothing that can be perceived is in the Universe of ZFC set theory, because perception involves the five senses.  The Universe of Discourse for a relational database is not merely the class of all sets.  Nor is it the union of all declared or system-defined types.  No.  The Universe for a relational database also includes objects which may be located in time or in time and space.

This is just bumbling confusion: the database does not "include objects". It represents or models objects. (And some would say the database records observations, which might or might not be observations of objects with some external existence, but the database records nothing more than the observations. And some would say the database doesn't even record observations: it records what authorised users tell it; they might be lying; they might be imagining.)

We simply don't need to hypothesise a world (concrete or abstract, contingent or eternally true) to operate a database and the logic behind it. If we give a characteristic predicate for a relvar, that might help authorised users tell us stuff (which will help the enterprise carry out its business functions, presumably). We simply don't need all your metaphysics: at best it's a distraction; at worst it leads people like you (and Fabian and David McG and the argumentative persona on StackOverflow) into deep, dark alleys.

Quote from AntC on December 6, 2019, 10:27 pm

We simply don't need to hypothesise a world (concrete or abstract, contingent or eternally true) to operate a database and the logic behind it. If we give a characteristic predicate for a relvar, that might help authorised users tell us stuff (which will help the enterprise carry out its business functions, presumably). We simply don't need all your metaphysics: at best it's a distraction; at worst it leads people like you (and Fabian and David McG and the argumentative persona on StackOverflow) into deep, dark alleys.

What he said.

But I do agree there is a lingering problem with time. Everyone agrees on something to do with 'time-varying' but not exactly what that means, and this question is at the core in thinking about updates. It's not helped at all by the way TTM treats relvars, relvar assignment or transactions. I'm sure there is a better way, based on a better underlying paradigm.

Andl - A New Database Language - andl.org
Quote from dandl on December 6, 2019, 11:07 pm
Quote from AntC on December 6, 2019, 10:27 pm

We simply don't need to hypothesise a world (concrete or abstract, contingent or eternally true) to operate a database and the logic behind it. If we give a characteristic predicate for a relvar, that might help authorised users tell us stuff (which will help the enterprise carry out its business functions, presumably). We simply don't need all your metaphysics: at best it's a distraction; at worst it leads people like you (and Fabian and David McG and the argumentative persona on StackOverflow) into deep, dark alleys.

What he said.

But I do agree there is a lingering problem with time. Everyone agrees on something to do with 'time-varying' but not exactly what that means, and this question is at the core in thinking about updates. It's not helped at all by the way TTM treats relvars, relvar assignment or transactions. I'm sure there is a better way, based on a better underlying paradigm.

"Time-varying" is no more or less a problem in TTM than it is in programming in general. The better paradigm (for an undefined value of "better") is, as usual, functional programming (with the usual caveats that accompany a suggestion of functional programming.)

That said, see Time and Relational Theory by Date, Darwen and Lorentzos.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on December 6, 2019, 11:30 pm
Quote from dandl on December 6, 2019, 11:07 pm
Quote from AntC on December 6, 2019, 10:27 pm

We simply don't need to hypothesise a world (concrete or abstract, contingent or eternally true) to operate a database and the logic behind it. If we give a characteristic predicate for a relvar, that might help authorised users tell us stuff (which will help the enterprise carry out its business functions, presumably). We simply don't need all your metaphysics: at best it's a distraction; at worst it leads people like you (and Fabian and David McG and the argumentative persona on StackOverflow) into deep, dark alleys.

What he said.

But I do agree there is a lingering problem with time. Everyone agrees on something to do with 'time-varying' but not exactly what that means, and this question is at the core in thinking about updates. It's not helped at all by the way TTM treats relvars, relvar assignment or transactions. I'm sure there is a better way, based on a better underlying paradigm.

"Time-varying" is no more or less a problem in TTM than it is in programming in general. The better paradigm (for an undefined value of "better") is, as usual, functional programming (with the usual caveats that accompany a suggestion of functional programming.)

That said, see Time and Relational Theory by Date, Darwen and Lorentzos.

Couldn't have put it better myself. [But the paywall on that book seems prohibitive.]

I may post something.

 

Andl - A New Database Language - andl.org
Quote from Dave Voorhis on December 6, 2019, 11:30 pm
Quote from dandl on December 6, 2019, 11:07 pm
Quote from AntC on December 6, 2019, 10:27 pm

We simply don't need to hypothesise a world (concrete or abstract, contingent or eternally true) to operate a database and the logic behind it. If we give a characteristic predicate for a relvar, that might help authorised users tell us stuff (which will help the enterprise carry out its business functions, presumably). We simply don't need all your metaphysics: at best it's a distraction; at worst it leads people like you (and Fabian and David McG and the argumentative persona on StackOverflow) into deep, dark alleys.

What he said.

But I do agree there is a lingering problem with time. Everyone agrees on something to do with 'time-varying' but not exactly what that means, and this question is at the core in thinking about updates. It's not helped at all by the way TTM treats relvars, relvar assignment or transactions. I'm sure there is a better way, based on a better underlying paradigm.

"Time-varying" is no more or less a problem in TTM than it is in programming in general. The better paradigm (for an undefined value of "better") is, as usual, functional programming (with the usual caveats that accompany a suggestion of functional programming.)

That said, see Time and Relational Theory by Date, Darwen and Lorentzos.

The issue with "time-varying vs. relvars", that is, eschewing variables being available as a fundamental language construct vs. embracing them, is not really what is addressed by that book, is it ?

The book embraces variables, period, and addresses the practical need for "keeping historical data" by "redefining" the scope of what "world" means : the universe in which propositions are evaluated true/false is one that ***includes*** an awareness of all the past situations that the "world" has been in.  Whereas "non-historical data[bases]" have a scope of "world" that does not include any such awareness and where propositions referring to a past situation of the world simply cannot be assessed.  [Of course there is no absolute black and white when it comes to that kind of "awareness".  Only 256 shades of that colour in between.]
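The difference in scope of "world" described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the supplier/city data, the day numbers and the helper `city_on` are invented here, and the half-open `(since, until)` pair is a simplification, not the interval-typed design the book develops.

```python
# Non-historical relvar: it holds only the current relation value.
# After the assignment, nothing in the database says S1 was ever in London.
current = {("S1", "London")}
current = {("S1", "Paris")}

# History-keeping relvar (hypothetical layout): each tuple carries the
# interval [since, until) during which the fact held; until=None means
# "still holds". Day numbers are made up for the example.
history = {
    ("S1", "London", 1, 5),
    ("S1", "Paris", 5, None),
}


def city_on(history, sno, day):
    """Return the city recorded for supplier `sno` on `day`, or None if unknown."""
    for s, city, since, until in history:
        if s == sno and since <= day and (until is None or day < until):
            return city
    return None


past_city = city_on(history, "S1", 3)      # a question about a past situation
present_city = city_on(history, "S1", 9)   # a question about the present
```

Against `current`, the question "where was S1 on day 3?" cannot be assessed at all; against `history` it can, because the past situations are part of what the relvar records.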

Quote from Erwin on December 7, 2019, 6:04 pm
Quote from Dave Voorhis on December 6, 2019, 11:30 pm
Quote from dandl on December 6, 2019, 11:07 pm
Quote from AntC on December 6, 2019, 10:27 pm

We simply don't need to hypothesise a world (concrete or abstract, contingent or eternally true) to operate a database and the logic behind it. If we give a characteristic predicate for a relvar, that might help authorised users tell us stuff (which will help the enterprise carry out its business functions, presumably). We simply don't need all your metaphysics: at best it's a distraction; at worst it leads people like you (and Fabian and David McG and the argumentative persona on StackOverflow) into deep, dark alleys.

What he said.

But I do agree there is a lingering problem with time. Everyone agrees on something to do with 'time-varying' but not exactly what that means, and this question is at the core in thinking about updates. It's not helped at all by the way TTM treats relvars, relvar assignment or transactions. I'm sure there is a better way, based on a better underlying paradigm.

"Time-varying" is no more or less a problem in TTM than it is in programming in general. The better paradigm (for an undefined value of "better") is, as usual, functional programming (with the usual caveats that accompany a suggestion of functional programming.)

That said, see Time and Relational Theory by Date, Darwen and Lorentzos.

The issue with "time-varying vs. relvars", that is, eschewing variables being available as a fundamental language construct vs. embracing them, is not really what is addressed by that book, is it ?

No. Had I not been writing in haste, I would have added that TaRT focuses on chronological or chronologically-significant data, not the technical, conceptual, theoretical or philosophical issues vis-a-vis variable updating.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Erwin on October 22, 2019, 11:29 am

By now I'm pretty convinced one of the things Codd had in mind when he wrote about "identifying attributes (/attribute values) by domain name" was exactly for the purpose of "not comparing weights to lengths" and the like.  Which in the most prevailing languages of the day was a very easy thing to do, given that those languages only had "types" 'numeric' and 'text', so to speak (and were primarily focused on physical representation, as you already stated).

Recently, I bumped into Codd 1979 again (the "capture more meaning" paper) and found there :

"Primary domains are important for the support of transactions such as “remove supplier 3 from the database,” in which we wish to remove 3 wherever it occurs as a supplier serial number, but not in any of its other uses."

In light of the discussion above, I believe there can hardly be more convincing evidence that when Codd spoke of "data types", he did not have the thing in mind that modern users of modern programming languages have in mind when they hear/see the term, but rather more of the COBOList approach that has only "numbers" and "text" (both in various physical formats, but NEVER with any connotation of 'this particular number is a supplier Id, and this other particular number is a calendar date').  He even literally prescribed that particular type system in one of his papers.
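For readers who want the "remove supplier 3" example made concrete, here is a minimal Python sketch. It is not anything Codd wrote: the relation names, attribute names, domain names and the helper `remove_value` are all invented for illustration. Each attribute is tagged with a domain name, and removal matches on domain as well as on value.

```python
# Hypothetical schema: for each relation, attribute name -> domain name.
SCHEMA = {
    "suppliers": {"sno": "SUPPLIER_NO", "city": "CITY"},
    "shipments": {"sno": "SUPPLIER_NO", "pno": "PART_NO", "qty": "QUANTITY"},
}

# Hypothetical data: each relation is a list of tuples, written as dicts.
DATA = {
    "suppliers": [{"sno": 3, "city": "Paris"}, {"sno": 4, "city": "London"}],
    "shipments": [
        {"sno": 3, "pno": 7, "qty": 100},
        {"sno": 4, "pno": 3, "qty": 3},  # 3 as a part number and as a quantity
    ],
}


def remove_value(schema, data, domain, value):
    """Return a new database without the tuples that hold `value`
    in any attribute declared on `domain`."""
    result = {}
    for rel_name, tuples in data.items():
        # Only attributes drawn from the named domain are inspected.
        attrs = [a for a, d in schema[rel_name].items() if d == domain]
        result[rel_name] = [
            t for t in tuples if all(t[a] != value for a in attrs)
        ]
    return result


after = remove_value(SCHEMA, DATA, "SUPPLIER_NO", 3)
```

The tuple for supplier 4 survives even though it contains 3 twice, because those occurrences are on the `PART_NO` and `QUANTITY` domains. With only 'numeric' and 'text' to go on, the removal could not tell those uses apart.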

Quote from Erwin on March 14, 2020, 2:53 pm
Quote from Erwin on October 22, 2019, 11:29 am

By now I'm pretty convinced one of the things Codd had in mind when he wrote about "identifying attributes (/attribute values) by domain name" was exactly for the purpose of "not comparing weights to lengths" and the like.  Which in the most prevailing languages of the day was a very easy thing to do, given that those languages only had "types" 'numeric' and 'text', so to speak (and were primarily focused on physical representation, as you already stated).

Recently, I bumped into Codd 1979 again (the "capture more meaning" paper) and found there :

"Primary domains are important for the support of transactions such as “remove supplier 3 from the database,” in which we wish to remove 3 wherever it occurs as a supplier serial number, but not in any of its other uses."

Interesting.

In light of the discussion above, I believe there can hardly be more convincing evidence that when Codd spoke of "data types", ...

"data type/s" doesn't appear in Codd 1979 AFAICT. There's "types of null" (oh dear), "entity type" and related "property type" and associated "association type" (oh dear, oh dear). "data types" appears once in Codd 1970, also "types of data". Those contexts seem to me to be talking about the variety of entities that are recorded in the database.

So Codd hasn't "spoke[n] of "data types"," and your quotes are adding to the confusion.

he did not have the thing in mind that modern users of modern programming languages have in mind when they hear/see the term, but rather more of the COBOList approach that has only "numbers" and "text" (both in various physical formats, but NEVER with any connotation of 'this particular number is a supplier Id, and this other particular number is a calendar date').

I see no evidence Codd was even thinking of something that concrete. He has examples with numbers and text, and dates. I don't see that to be saying that's all any database needs. In the 1970 and 1979 papers at least I can't see anything to support your reading. (1979 has a couple of mentions of 'syntactic type' of a value hmmm. In sections of the paper that are usually ignored, and for good reason.) I'd say he just didn't care. And why should he? Datatype is orthogonal to model.

He even literally prescribed that particular type system in one of his papers.

OK which paper is that? Did he use the words "type system"? I'm surprised he would know what is a 'type system' in the modern sense (although that sense was emerging in the 1960's/'70's from academic programming languages). [See Addit:]

My point in starting this thread was to reject the attribution coming from Chris Date that Codd's 'domain' we should take to mean modern 'data type'. I don't think it's that Codd merely didn't know the term in that sense and would have used it if he did. I think his 'domain' is something different, and that Chris Date is plain wrong. Indeed Chris Date seems to be persistently wrong in a great number of his readings of other authors, and in bending well-established terminology to weird senses.

I'm not necessarily saying Codd's 'domain' is a concept we should be embracing in Relational Theory today. I do say that equating his 'domain' with modern 'datatype' is weakening the understanding of what Codd was saying (in the 1969/1970/1972 papers).

Addit: ah 1972 Relational Completeness section 2.2 Introductory Definitions has

For data base purposes, we are concerned with data consisting of integers and character strings (other types of primitive elements may be included in this definition if desired, with only minor changes in some of the definitions below).

That doesn't read like "prescribing that particular type system". And the "other types ... may be included" says he doesn't envisage "only "numbers" and "text" ". So Codd is just getting the issue out of the way so he can adumbrate the algebra and calculus. Most of the paper uses 'domain'.

My point in starting this thread was to reject the attribution coming from Chris Date that Codd's 'domain' we should take to mean modern 'data type'. I don't think it's that Codd merely didn't know the term in that sense and would have used it if he did. I think his 'domain' is something different, and that Chris Date is plain wrong. Indeed Chris Date seems to be persistently wrong in a great number of his readings of other authors, and in bending well-established terminology to weird senses.

AFAICT Codd (early papers) used 'domain' precisely in this sense: https://en.wikipedia.org/wiki/Data_domain: "the values which a data element may contain". The same word is frequently used in precisely the same sense today, by data people (DP) in talking about data. Codd used this valuable idea to distinguish those attributes that could join from those that could not.

In Codd's time, code people (CP) had a very limited idea of programming types, just barely above the physical representation. Codd was thinking as a DP not a CP, so definitely not contemplating a programming language type. That has changed, but SQL is stuck in a time warp. Most of the work in an ORM is about converting between the DP view and the CP view.

IMO the single most striking feature of TTM is the unification of data domain with programming language type system. It's the database you get when the CP takes over. But it causes immense difficulty for the language, incorporating these type-generated tuple and relation types. It places the solution beyond reach of virtually all common programming languages, which is bad. There are other issues in TTM, such as the treatment of enumerated types, which are a big deal to DP. The data domain is obvious (Red, Amber, Green), but the programming type is problematic. Is it possible that the DP view and the CP view should not be unified?

So my question for TFM (The Fourth Manifesto) is whether it is feasible to put some distance between the data domain and the language type, so both the DP and the CP get what they need, and if there are benefits in doing so.
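One way to picture "some distance between the data domain and the language type" is the rough Python sketch below. It is only an assumption about what such a separation might look like, not a proposal from TTM or from any existing product: the class `Domain`, the two example domains and the function `joinable` are all hypothetical. The language type of every value is plain `str`; the domain is a separate, named set of permitted values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """A data domain: a name plus the values a data element may contain."""
    name: str
    values: frozenset

    def check(self, value):
        """Return `value` unchanged if it belongs to the domain, else raise."""
        if value not in self.values:
            raise ValueError(f"{value!r} is not in domain {self.name}")
        return value


# Two domains sharing one language type (str) and even some values.
SIGNAL = Domain("SIGNAL", frozenset({"Red", "Amber", "Green"}))
PAINT = Domain("PAINT", frozenset({"Red", "Green", "Blue"}))


def joinable(d1, d2):
    """Attributes may join only if they are drawn from the same named domain."""
    return d1.name == d2.name


light = SIGNAL.check("Amber")            # accepted; still an ordinary str
same_domain = joinable(SIGNAL, SIGNAL)   # True
cross_domain = joinable(SIGNAL, PAINT)   # False, despite the shared "Red"
```

Here the DP gets domain membership and join compatibility, while the CP keeps working with the host language's ordinary strings. Whether that split pays for itself is exactly the open question.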

 

Andl - A New Database Language - andl.org