The Forum for Discussion about The Third Manifesto and Related Matters

Please or Register to create posts and topics.

TTM without tuples

PreviousPage 4 of 9Next
Quote from Vadim on April 14, 2021, 4:54 pm
Quote from Hugh on April 14, 2021, 1:11 pm

A relation's body is a set of tuples.

A relation's heading is a set of attributes.

Did the questioner have some other notion of what an attribute is?

 

As you have mentioned, a relation is not really a set, but a pair of header and body, each considered to be a set. Again, this traditional view has been challenged (or complemented) by the "columnar" database community, which considers a relation as a set of columns. There has to be conforming condition when "gluing" columns to construct relations of higher arity, of course, but the situation is similar to requiring tuples to match in conventional database theory.

Hi Vadim, good to hear from you.

I think all we require is that any concrete implementation of a relation value be isomorphic to TTM's definition. A set of tuples that each contain the same set of Attributes paired with values of the same/corresponding type is isomorphic to a set of columns, each with the same number of rows/elements, with some mechanism (as you say) to make those elements correspond 'across' columns.

I'll take one more step, and suggest that using "a set of" in database foundation is problematic. First of all, tuple- and column- perspectives clearly disagree what the set structure of a relation is.

No, they're two differing representations that are isomorphic. If you like, we can say the representations are 'information equivalent'.

Second, mathematics field gradually shifted from emphasizing object's set structure to categorical view.

No, again in the case of relations, set vs categorical view are isomorphic. And where I come from, Zermelo-Frankel is still perfectly acceptable whereas categorical mathematics is impossibly abstruse. Category Theory is becoming more popular in Programming Language Theory because arrows correspond better to functions/transformations on data, and objects correspond to types and abstract properties of types. But down in the guts of a Programming Language we still have values and data structures.

Technically speaking, only some categories are structured sets. It would be interesting to formalize database relations in terms of objects and arrows, and I wonder if David Spivak's database category is the right one.

TTM's specification still stands (as an abstraction). As opposed to (say) SQL's 'specification' (I use the word loosely), which is still a mess. As opposed to McGoveran's objections, which are incoherent. It might be interesting to formalise as objects and arrows, but if that formalisation is not isomorphic to TTM's, it's wrong.

Quote from dandl on April 15, 2021, 12:29 am
Quote from Hugh on April 14, 2021, 1:11 pm

Can dandl please remind me, what is the problem you are trying to solve?

To my mind, inclusion of "non-first-class" types in a language is a complication, not a simplification.

Btw, I saw somebody questioning whether a relation is a set of tuples or a set of attributes.  It made me realise that I had been guilty of a bit of looseness when I wrote that it is a set of tuples.

A relation's body is a set of tuples.

A relation's heading is a set of attributes.

Did the questioner have some other notion of what an attribute is?

I dread the thought of changing TTM to avoid those definitions.

Also btw, a heading is a relation, an attribute is a tuple, and a tuple is a relation.  True, but that way madness lies (as I believe CJD would have put it).

Sorry for butting in without reading all the correspondence.

Hugh

My proposal is simply to abandon the need for a TUPLE type in TTM and in a D language. That leaves a relation with a heading and a body which is a set of tuples, just as before. You need a new selector definition for RELATION, with scalars as arguments, and you need a change to ARRAY for getting data out of a relation.

This change has absolutely no impact on the expressiveness of the D language. It does simplify the implementation of a D based on modern GP languages. That's all.

 

Thank you, dandl.  So the problem is to do with implementation, given that you are subject to self-imposed constraints that did not affect the authors of, e.g., Sira_prise and Rel.  In that case I have no further comment except to recall that Business System 12 had no defined type for its tuples (which were called "rows" but they did adhere to TTM's definition of tuple).  It's perhaps interesting to note that the promotion of standard SQL's rows to first-class-type status didn't happen until 1998, indicating a regret that they hadn't had that status in the first place.

Hugh

Coauthor of The Third Manifesto and related books.
Quote from AntC on April 15, 2021, 12:26 am
Quote from Hugh on April 14, 2021, 1:11 pm

... Btw, I saw somebody questioning whether a relation is a set of tuples or a set of attributes.  It made me realise that I had been guilty of a bit of looseness when I wrote that it is a set of tuples.

A relation's body is a set of tuples.

A relation's heading is a set of attributes.

Did the questioner have some other notion of what an attribute is?

I dread the thought of changing TTM to avoid those definitions.

Thanks Hugh, I think people said a number of things up-thread to try to emphasise how wrong was dandl's interpretation. There might have been quite a bit of "looseness". I take the definitions in TTM to be an abstraction. Some concrete implementation of a D might not have separately-identifiable components of a relation value (data structure) that exactly correspond.

OTOH your following remarks I'm finding hard to follow:

Also btw, a heading is a relation, an attribute is a tuple, and a tuple is a relation.  True, but that way madness lies (as I believe CJD would have put it).

A heading ("is a relation" so) has a body and a heading? Then that heading must have a heading ... ad infinitum?

An attribute (you mean an <A, T, v> triple?) "is a tuple"? Then what is the heading of that tuple? Note that in the Tutorial D TUPLE{ ... } Selector, each element must include an A and a v, but the syntax doesn't require a T -- you can give one as a type annotation, but that's a ubiquitous feature, not specific to tuples.

A tuple "is a relation"? Then in Tutorial D I can denote a tuple value using the RELATION{ ... } Selector? And that would be different to a singleton relation? I don't follow.

Sorry for butting in without reading all the correspondence.

There has been a huge volume of correspondence, much of it repetitive and argumentative without shedding light. I'm not reading it all either. I'm not feeling any need to apologise.

Yes, Antc, "ad infinitum" is the reason for my citing that remark I have seen before from CJD ("that way madness lies").

I might have made it clearer that I meant "relation" and "tuple" in the mathematical sense, at least, but I decided that was unnecessary because in theory one can always choose a TTM-style heading (ad infinitum?).

I know of at least two textbooks on relational database theory that define a tuple as a function, mapping attribute names to values.  A function is a special case of a relation.  As for heading as relation, well, we implemented that in Business System 12, where COLUMNS(t) returned a table representing the definitions of the columns of table t "column" = attribute, "table" = relation).  It always amused us that COLUMNS(COLUMNS(t)) was a constant.  Its tabular display showed the same column names along the top and down the left-hand side.

Hugh

Coauthor of The Third Manifesto and related books.
Quote from Hugh on April 15, 2021, 10:43 am
Quote from AntC on April 15, 2021, 12:26 am
Quote from Hugh on April 14, 2021, 1:11 pm

... Btw, I saw somebody questioning whether a relation is a set of tuples or a set of attributes.  It made me realise that I had been guilty of a bit of looseness when I wrote that it is a set of tuples.

A relation's body is a set of tuples.

A relation's heading is a set of attributes.

Did the questioner have some other notion of what an attribute is?

I dread the thought of changing TTM to avoid those definitions.

Thanks Hugh, I think people said a number of things up-thread to try to emphasise how wrong was dandl's interpretation. There might have been quite a bit of "looseness". I take the definitions in TTM to be an abstraction. Some concrete implementation of a D might not have separately-identifiable components of a relation value (data structure) that exactly correspond.

OTOH your following remarks I'm finding hard to follow:

Also btw, a heading is a relation, an attribute is a tuple, and a tuple is a relation.  True, but that way madness lies (as I believe CJD would have put it).

A heading ("is a relation" so) has a body and a heading? Then that heading must have a heading ... ad infinitum?

An attribute (you mean an <A, T, v> triple?) "is a tuple"? Then what is the heading of that tuple? Note that in the Tutorial D TUPLE{ ... } Selector, each element must include an A and a v, but the syntax doesn't require a T -- you can give one as a type annotation, but that's a ubiquitous feature, not specific to tuples.

A tuple "is a relation"? Then in Tutorial D I can denote a tuple value using the RELATION{ ... } Selector? And that would be different to a singleton relation? I don't follow.

Sorry for butting in without reading all the correspondence.

There has been a huge volume of correspondence, much of it repetitive and argumentative without shedding light. I'm not reading it all either. I'm not feeling any need to apologise.

Yes, Antc, "ad infinitum" is the reason for my citing that remark I have seen before from CJD ("that way madness lies").

I might have made it clearer that I meant "relation" and "tuple" in the mathematical sense, at least, but I decided that was unnecessary because in theory one can always choose a TTM-style heading (ad infinitum?).

I know of at least two textbooks on relational database theory that define a tuple as a function, mapping attribute names to values.

Yes, but that's not how TTM defines "tuple". A TTM tuple is isomorphic to such a function.

  A function is a special case of a relation.

Yes, but that's the mathematician's sense of 'relation'; not the TTM definition of relation -- or do I mean RELATION H for some H ?

As for heading as relation, well, we implemented that in Business System 12, where COLUMNS(t) returned a table representing ...

Thank you Hugh, but with respect, you wrote "a heading is a relation" (my emphasis). Now you are making a weaker claim that a heading can be represented as a relation -- with which I wouldn't disagree. And that's merely because any structured information can be represented as a relation. That's what it means for information to be structured. We might say that a heading is isomorphic to a relation. And TTM requires that morphism, to store a heading into the DBMS's catalogue.

We don't in Tutorial D write

SP := RELATION {RELATION{TUPLE{A S#, T S#}, TUPLE{A P#, T P#}, TUPLE{A QTY, T INT}}}
                        {TUPLE{  S# 'S123',         P# 'P456',         QTY    50}}

Because then we might reasonably want to give a heading for the inner RELATION{ ... }, ad infinitum. A TTM heading is a different type-of-type vs a tuple or a relation's body.

the definitions of the columns of table t "column" = attribute, "table" = relation).  It always amused us that COLUMNS(COLUMNS(t)) was a constant.  Its tabular display showed the same column names along the top and down the left-hand side.

 

Most of dandl's obfuscation on this thread and related ones has been mis-applying terminology. In particular the thread's title "without tuples" turns out to mean more like 'without a first-class tuple type or tuple-specific syntax'. using the bare verb "is" as you did will only increase obfuscation.

Quote from dandl on April 15, 2021, 12:29 am

My proposal is simply to abandon the need for a TUPLE type in TTM and in a D language. That leaves a relation with a heading and a body which is a set of tuples, just as before. You need a new selector definition for RELATION, with scalars as arguments, and you need a change to ARRAY for getting data out of a relation.

This change has absolutely no impact on the expressiveness of the D language. It does simplify the implementation of a D based on modern GP languages. That's all.

 

No.  Because TTM allows TVA's and RVA's too.  So "with scalars as arguments" won't do.  That is, unless you also want to subscribe to what I call "Codd's second great blunder" - though historically it might have been the one that was committed first (disallowing RVA's that is, if anyone didn't get it).

Your "new selector definition" is (as far as I can see) ***exactly*** the one SIRA_PRISE had prior to 1.5, and it was changed for a reason.  (It was there for a reason too, but that reason was very mundanely "I have only two hands".)

Quote from Erwin on April 15, 2021, 12:49 pm
Quote from dandl on April 15, 2021, 12:29 am

My proposal is simply to abandon the need for a TUPLE type in TTM and in a D language. That leaves a relation with a heading and a body which is a set of tuples, just as before. You need a new selector definition for RELATION, with scalars as arguments, and you need a change to ARRAY for getting data out of a relation.

This change has absolutely no impact on the expressiveness of the D language. It does simplify the implementation of a D based on modern GP languages. That's all.

No.  Because TTM allows TVA's and RVA's too.  So "with scalars as arguments" won't do.  That is, unless you also want to subscribe to what I call "Codd's second great blunder" - though historically it might have been the one that was committed first (disallowing RVA's that is, if anyone didn't get it).

Well, not quite. The TTM type system includes relation types on the same footing as scalars, and RVAs turn out to be really useful. SQL, of course, does not. But there is no such benefit with TVAs, which are excluded in my proposal.

Bu I don't get the other bit. All I'm saying is that the selector goes from scalar types to relation types with no tuple types in between. What are you saying?

Your "new selector definition" is (as far as I can see) ***exactly*** the one SIRA_PRISE had prior to 1.5, and it was changed for a reason.  (It was there for a reason too, but that reason was very mundanely "I have only two hands".)

It was changed because TTM said so. I say: better not.

Andl - A New Database Language - andl.org
Quote from dandl on April 15, 2021, 2:13 pm

Your "new selector definition" is (as far as I can see) ***exactly*** the one SIRA_PRISE had prior to 1.5, and it was changed for a reason.  (It was there for a reason too, but that reason was very mundanely "I have only two hands".)

It was changed because TTM said so. I say: better not.

It was changed because of how much easier and succinct it gets to write down all the rules that a valid relation selector must obey (note that if the language doesn't have tuple types, then you can't say in the language spec things like "the BODY portion of the selector must consist exclusively of valid tuple-valued expressions" because such expressions simply cannot exist).  Though I can easily see how lots of people would simply gloss over that and handwave that that is not a problem.

Quote from Vadim on April 14, 2021, 4:54 pm

As you have mentioned, a relation is not really a set, but a pair of header and body, each considered to be a set. Again, this traditional view has been challenged (or complemented) by the "columnar" database community, which considers a relation as a set of columns. There has to be conforming condition when "gluing" columns to construct relations of higher arity, of course, but the situation is similar to requiring tuples to match in conventional database theory.

I'll take one more step, and suggest that using "a set of" in database foundation is problematic. First of all, tuple- and column- perspectives clearly disagree what the set structure of a relation is. Second, mathematics field gradually shifted from emphasizing object's set structure to categorical view. Technically speaking, only some categories are structured sets. It would be interesting to formalize database relations in terms of objects and arrows, and I wonder if David Spivak's database category is the right one.

I'm wondering what the "guaranteed access rule" would look like in such a "columnar perspective".  You cannot in any way refer to "that unique tuple which has some particular given [combination of] key attribute value[s] because both "combinations of key values" and tuples themselves are inherently undefinable and can only be introduced if you first reconstitute the relation [or the relational structure] from the "columns".

My proposal is simply to abandon the need for a TUPLE type in TTM and in a D language. That leaves a relation with a heading and a body which is a set of tuples, just as before. You need a new selector definition for RELATION, with scalars as arguments, and you need a change to ARRAY for getting data out of a relation.

This change has absolutely no impact on the expressiveness of the D language. It does simplify the implementation of a D based on modern GP languages. That's all.

 

Thank you, dandl.  So the problem is to do with implementation, given that you are subject to self-imposed constraints that did not affect the authors of, e.g., Sira_prise and Rel.  In that case I have no further comment except to recall that Business System 12 had no defined type for its tuples (which were called "rows" but they did adhere to TTM's definition of tuple).  It's perhaps interesting to note that the promotion of standard SQL's rows to first-class-type status didn't happen until 1998, indicating a regret that they hadn't had that status in the first place.

Just so. I don't actually know of any language that has TTM-like tuples exposed as a type in the language (other than Tutorial D). Presumably, as in your case, there was simply no problem that it solved.

 

Andl - A New Database Language - andl.org

I agree with David Bennet - - dandl - that tuples as a 'mathematical sort', or category of data type, are unnecessary and should be omitted.

(My apologies for revisiting this topic such a long time after it was first raised. Domestic and local obligations have diverted me from responding to the TTM Forum for several months).

When I was developing the RAQUEL notation, I couldn't find a use for tuples as a separate data sort, either in principle or in practice. So I left tuples out, and it's never been a problem.

I can no more think of relations and tuples as separate, orthogonal and independent sorts of data than I can of arrays and array elements as separate, orthogonal and independent sorts of data.

Arrays and array elements are 2 sides of the same coin; you need both to fully understand what an array is. Likewise relations and tuples are 2 sides of the same coin; in fact we also need a 3rd 'side', namely attributes, to fully understand what a relation is. Both tuples and attributes are innate parts of a relation. Neither of them appear on their own outside a relation.

Attributes and tuples provide 2 perspectives by which relations can be considered. (An array only has one perspective, which uses a set of indexes - one per dimension - to reference array elements). A relation's attribute perspective uses names to reference its attributes; its tuple perspective uses key values to reference its tuples. As the 2 perspectives are orthogonal to each other, they can be combined to reference a sub relational value.

I regard it as advantageous to omit tuples. Fred Brooks points out that "Because ease of use is the purpose, this ratio of function to conceptual complexity is the ultimate test of system design" (quote - "The Mythical Man-Month", chapter 4, section 'Achieving Conceptual Integrity'). It's not maximising functionality alone or simplicity alone that matters, but maximising the ratio of functionality to conceptual complexity.

By omitting tuples, I have only 2 sorts of values in RAQUEL, scalars and relvars. If I'd included tuples, they would have formed a 3rd sort, thereby reducing the ratio of function to conceptual complexity, since there is no gain in functionality from including the tuple sort. Furthermore :

  • Is not type coercion required when inserting a tuple into a relation (albeit perhaps implicitly) ?
  • For symmetry, would I not also have to add attributes as a 4th sort, thereby further reducing the ratio, and adding another type coercion ?

The only practical problem is how to expression literal relation values. Attributes and tuples are inherent in relations, so their values need to be expressed innately within the relational value in order to conform with the relation's structure (just as array elements need to be expressed within an array's value).

For consistency and symmetry, the formal relational model should permit literal relational values to be expressed via either perspective, tuples or attributes. In TTM terms, the relational selector operator should handle both alternatives.

Alas that hadn't struck me before I read this topic in the TTM Forum. RAQUEL currently only permits literal relation values to be expressed via tuples. (I think it corresponds to Erwin's contribution #22 on 13 April 2021).

Looking back, I wonder why I only expressed relation values via tuple values. Probably because that's what everybody else did. Why do we have this psychological preferance ? Is it because literal attribute values are trickier to express than literal tuple values ? Literal attribute values are bags of values - if the attribute is a key, then the bag is constrained to contain no duplicate values. Also attribute values must be 'glued together' appropriately to form a relation value. (See Vadim's contribution #28 on 14 April 2021).

PreviousPage 4 of 9Next