The Forum for Discussion about The Third Manifesto and Related Matters


TTM without tuples

Page 5 of 9
Quote from David Livingstone on June 30, 2021, 12:00 pm

I agree with David Bennet (dandl) that tuples as a 'mathematical sort', or category of data type, are unnecessary and should be omitted.

(My apologies for revisiting this topic such a long time after it was first raised. Domestic and local obligations have diverted me from responding to the TTM Forum for several months).

Some topics can only improve over time. And yes, sometimes IRL gets in the way.

When I was developing the RAQUEL notation, I couldn't find a use for tuples as a separate data sort, either in principle or in practice. So I left tuples out, and it's never been a problem.

I can no more think of relations and tuples as separate, orthogonal and independent sorts of data than I can of arrays and array elements as separate, orthogonal and independent sorts of data.

Most collections are constructed out of single elements of some type, and the focus is on operations on the element. If there are operations on the collection it's in addition.

Strings are different. They are immutable (in most languages), and come with a rich set of operations. The language may provide a character type, but some do not.

IMO relations should be more like strings. Tuples are an optional extra.

Arrays and array elements are 2 sides of the same coin; you need both to fully understand what an array is. Likewise relations and tuples are 2 sides of the same coin; in fact we also need a 3rd 'side', namely attributes, to fully understand what a relation is. Both tuples and attributes are innate parts of a relation. Neither of them appears on its own outside a relation.

Attributes and tuples provide 2 perspectives by which relations can be considered. (An array only has one perspective, which uses a set of indexes - one per dimension - to reference array elements). A relation's attribute perspective uses names to reference its attributes; its tuple perspective uses key values to reference its tuples. As the 2 perspectives are orthogonal to each other, they can be combined to reference a sub relational value.

I regard it as advantageous to omit tuples. Fred Brooks points out that "Because ease of use is the purpose, this ratio of function to conceptual complexity is the ultimate test of system design" (quote - "The Mythical Man-Month", chapter 4, section 'Achieving Conceptual Integrity'). It's not maximising functionality alone or simplicity alone that matters, but maximising the ratio of functionality to conceptual complexity.

You're playing into my safer theme. Ease of use does not allow that a program compiles but will not run because of a lack of programmer insight or precaution. Brooks was a clever man.

By omitting tuples, I have only 2 sorts of values in RAQUEL, scalars and relvars. If I'd included tuples, they would have formed a 3rd sort, thereby reducing the ratio of function to conceptual complexity, since there is no gain in functionality from including the tuple sort. Furthermore :

  • Is not type coercion required when inserting a tuple into a relation (albeit perhaps implicitly) ?
  • For symmetry, would I not also have to add attributes as a 4th sort, thereby further reducing the ratio, and adding another type coercion ?

The only practical problem is how to express literal relation values. Attributes and tuples are inherent in relations, so their values need to be expressed innately within the relational value in order to conform with the relation's structure (just as array elements need to be expressed within an array's value).

I agree. It's seductive, the idea of literal construction via tuple literals, but it's misleading.

For consistency and symmetry, the formal relational model should permit literal relational values to be expressed via either perspective, tuples or attributes. In TTM terms, the relational selector operator should handle both alternatives.

Good point. Andl has a relation literal like this: rel{X:int,Y:int}((1,4),(2,5),(3,6))

But it could be: rel{X:int,Y:int}{X:(1,2,3), Y:(4,5,6)}
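To make the two literal forms concrete, here is a rough Python sketch (not Andl, and not any real D). It assumes a relation body is modelled as a set of tuples, each tuple being a set of (attribute name, value) pairs; the helper names `rel_from_tuples` and `rel_from_attributes` are hypothetical. Under those assumptions the tuple-wise and attribute-wise spellings denote the same relation value.

```python
# Hypothetical helpers; neither is Andl syntax or part of any real D.

def rel_from_tuples(heading, rows):
    """Build a relation body from positional rows such as (1, 4)."""
    return frozenset(frozenset(zip(heading, row)) for row in rows)

def rel_from_attributes(columns):
    """Build a relation body from per-attribute value lists of equal length."""
    heading = list(columns)
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("attribute value lists must have the same length")
    # zip(*...) pairs up the i-th value of every attribute: this is the
    # 'gluing together' that the attribute-wise form relies on.
    return frozenset(frozenset(zip(heading, row)) for row in zip(*columns.values()))

by_tuples = rel_from_tuples(("X", "Y"), [(1, 4), (2, 5), (3, 6)])
by_attributes = rel_from_attributes({"X": (1, 2, 3), "Y": (4, 5, 6)})
same_value = by_tuples == by_attributes
```

Note that the attribute-wise form only works because the value lists are ordered and aligned by position.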

Alas that hadn't struck me before I read this topic in the TTM Forum. RAQUEL currently only permits literal relation values to be expressed via tuples. (I think it corresponds to Erwin's contribution #22 on 13 April 2021).

Looking back, I wonder why I only expressed relation values via tuple values. Probably because that's what everybody else did. Why do we have this psychological preference? Is it because literal attribute values are trickier to express than literal tuple values? Literal attribute values are bags of values - if the attribute is a key, then the bag is constrained to contain no duplicate values. Also attribute values must be 'glued together' appropriately to form a relation value. (See Vadim's contribution #28 on 14 April 2021).

Definitely food for thought.

Andl - A New Database Language - andl.org
Quote from David Livingstone on June 30, 2021, 12:00 pm

(My apologies for revisiting this topic such a long time after it was first raised. Domestic and local obligations have diverted me from responding to the TTM Forum for several months).

Welcome back David. I'd be hoping that after cogitating on this thread, you might help explain what's at issue. Because I still don't get it. It seems to be making a distinction without a difference. A couple of reminders from the TTM doco:

  • [from RM Pre 6] "Note: When we say “the name of [a certain tuple type] shall be, precisely, TUPLE H,” we do not mean to prescribe specific syntax. The Manifesto does not prescribe syntax."
  • [from RM Pre 4] "Physical representations for values of type T shall be specified by means of some kind of storage structure definition language and shall not be visible in D."

What dogged the previous discussion was a) the idea that avoiding a keyword TUPLE was thereby avoiding the concept or type of a tuple in a D; b) that storing the body of a relation in some form such that there was no contiguous patch of memory/disk representing a tuple was thereby avoiding there being something in the semantics of type tuple.

I agree with David Bennet (dandl) that tuples as a 'mathematical sort', or category of data type, are unnecessary and should be omitted.

So if you have a relation body of cardinality two, what's the category/data type that you have two of? How does that body differ from one with the same heading but cardinality three? How does a relvar whose body has cardinality two and degree four differ from a relation value's body with degree three, such that i) you can't assign the latter to the relvar; ii) you can't compare them for equality; iii) by "can't" I mean you typically get a static type error.

If you mean you don't want your D to have vars of type tuple, or support stand-alone expressions representing a tuple not inside a relation value -- well ok, but you haven't avoided "a 'mathematical sort' or category of data type".

 

When I was developing the RAQUEL notation, I couldn't find a use for tuples as a separate data sort, either in principle or in practice. So I left tuples out, and it's never been a problem.

I can no more think of relations and tuples as separate, orthogonal and independent sorts of data than I can of arrays and array elements as separate, orthogonal and independent sorts of data.

Nonsense. I can think of Chars, Booleans, Floats, Ints without having to think of arrays of those. The cell's type is orthogonal to the array structure. I can think of Ints, Chars, user-defined enumerated types (like Clubs, Diamonds, Hearts, Spades) without having to think of those indexing arrays. An array's structure (type) is then composed of (typically multiple) index types and some cell type. (Perhaps you're not aware that since the 1980s, array indexes can be richer-typed than merely Ints? Perhaps even before then; I think Pascal supported various types for indexes.)

There are programming languages (Algol68, BCPL) that support expressing a two-dimensional array as a vector of vectors. Parallel to a set of tuples. I plain disagree with the following:

Arrays and array elements are 2 sides of the same coin; you need both to fully understand what an array is. Likewise relations and tuples are 2 sides of the same coin; in fact we also need a 3rd 'side', namely attributes, to fully understand what a relation is. Both tuples and attributes are innate parts of a relation. Neither of them appears on its own outside a relation.

Attributes and tuples provide 2 perspectives by which relations can be considered. (An array only has one perspective, which uses a set of indexes - one per dimension - to reference array elements). A relation's attribute perspective uses names to reference its attributes; its tuple perspective uses key values to reference its tuples. As the 2 perspectives are orthogonal to each other, they can be combined to reference a sub relational value.

I regard it as advantageous to omit tuples. Fred Brooks points out that "Because ease of use is the purpose, this ratio of function to conceptual complexity is the ultimate test of system design" (quote - "The Mythical Man-Month", chapter 4, section 'Achieving Conceptual Integrity'). It's not maximising functionality alone or simplicity alone that matters, but maximising the ratio of functionality to conceptual complexity.

You've just talked about the two perspectives. (And let me be careful here to avoid a spatial/column-and-rows perspective.) Then conceptually accessing a relation value by-attribute (projection) is quite different to accessing by attribute-content (restriction). If you can't talk about the what-they-are that have QTY = 50 but have various S#'s, you're going to get in a tangle.
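A small Python sketch may help separate the two kinds of access. It assumes the same set-of-tuples model as elsewhere in this thread; the sample data and the helper names `project` and `restrict` are made up for illustration and are not taken from any D.

```python
# Hypothetical sample data in the style of suppliers-and-parts.
SP = frozenset({
    frozenset({("S#", "S1"), ("P#", "P1"), ("QTY", 50)}),
    frozenset({("S#", "S2"), ("P#", "P1"), ("QTY", 50)}),
    frozenset({("S#", "S2"), ("P#", "P2"), ("QTY", 100)}),
})

def project(rel, names):
    """Access by attribute: keep the named attributes; duplicates collapse in the set."""
    return frozenset(frozenset((n, v) for n, v in t if n in names) for t in rel)

def restrict(rel, predicate):
    """Access by attribute content: keep the tuples for which the predicate holds."""
    return frozenset(t for t in rel if predicate(dict(t)))

qty_only = project(SP, {"QTY"})                     # two distinct QTY values survive
qty_50 = restrict(SP, lambda t: t["QTY"] == 50)     # the 'what-they-are' with QTY = 50
suppliers_with_50 = project(qty_50, {"S#"})
```

The predicate passed to `restrict` is evaluated once per tuple, so the sketch cannot even be written without some notion of "the thing that has QTY = 50 and some S#".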

By omitting tuples, I have only 2 sorts of values in RAQUEL, scalars and relvars.

How do you express a relation literal? Do you have a comma-list style? What do you call the doo-hickeys between the outer-level commas? What is it that distinguishes the relation literal for DEE vs DUM?
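For the DEE vs DUM question, a minimal Python sketch under the same assumed set-of-tuples model: both relations have an empty heading, and the only thing that tells them apart is whether the body contains the one tuple that has no attributes.

```python
# Assumed model: a tuple is a set of (name, value) pairs, a body is a set of tuples.
EMPTY_TUPLE = frozenset()          # the tuple with no attributes
DEE = frozenset({EMPTY_TUPLE})     # empty heading, exactly one tuple
DUM = frozenset()                  # empty heading, no tuples

cardinalities = (len(DEE), len(DUM))
```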

If I'd included tuples, they would have formed a 3rd sort, thereby reducing the ratio of function to conceptual complexity, since there is no gain in functionality from including the tuple sort. Furthermore :

  • Is not type coercion required when inserting a tuple into a relation (albeit perhaps implicitly) ?

Of course not. There's (static) type checking, that's not coercion. Or ... perhaps you mean that the attribute order in a to-insert tuple might be different to in the relvar. Neither tuples nor relvars have ordering of attributes. Which is the point. (BTW you don't insert anything into a relation. You're not taking enough TTM kool-aid.) Your DML's syntax might be that what you insert into a relvar is a relation value -- then you must wrap a tuple in some sort of RELATION{ } constructor/selector. That's not 'coercion'.
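As a hedged illustration of the difference between checking and coercing, here is a Python sketch. The `Relvar` class and the `tup` helper are hypothetical, and the check happens at run time here, whereas a D would normally do it statically. The tuple is wrapped in a relation value, the heading is checked, nothing is converted, and attribute order plays no part.

```python
class Relvar:
    """Hypothetical relvar: a fixed heading plus a current relation value."""

    def __init__(self, heading):
        self.heading = dict(heading)      # attribute name -> Python type
        self.body = frozenset()

    def _check(self, t):
        # Type checking only: a mismatching tuple is rejected, never converted.
        if {n: type(v) for n, v in t} != self.heading:
            raise TypeError("tuple heading does not match relvar heading")

    def insert(self, relation_value):
        """INSERT modelled as assigning the union of the old and new relation values."""
        for t in relation_value:
            self._check(t)
        self.body = self.body | relation_value

def tup(**attrs):
    """A tuple as an unordered set of (name, value) pairs."""
    return frozenset(attrs.items())

r = Relvar({"X": int, "Y": int})
r.insert(frozenset({tup(X=1, Y=4)}))
r.insert(frozenset({tup(Y=4, X=1)}))   # same tuple, attributes written in another order

try:
    r.insert(frozenset({tup(X=1, Y="four")}))
    rejected = False
except TypeError:
    rejected = True
```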

  • For symmetry, would I not also have to add attributes as a 4th sort, thereby further reducing the ratio, and adding another type coercion ?

Yes, attributes are a distinct sort vs relations vs tuples vs arrays vs enumerated types vs product types vs ... No, this doesn't involve type coercion. Composing a type out of component types is not 'coercion'. You don't seem to understand the term.

The only practical problem is how to express literal relation values. Attributes and tuples are inherent in relations,

Errm. You just used the word "tuples". Where has Fred Brooks gone? If you need to talk about 'tuples' as being inherent in relations, you haven't saved any conceptual ratio. You've just made the conversation more complex by taking away the very word for what we need to talk about.

so their values need to be expressed innately within the relational value in order to conform with the relation's structure (just as array elements need to be expressed within an array's value).

For consistency and symmetry, the formal relational model should permit literal relational values to be expressed via either perspective, tuples or attributes. In TTM terms, the relational selector operator should handle both alternatives.

So far you've given us a lot of words (a lot of it waffle, frankly) with no concrete syntactic constructs. I've no idea from what you've said so far what it would look like by "either perspective". How could I represent QTY 50 for S# S1 vs QTY 100 for S# S2 without expressing some mechanism to denote the correspondence?
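The correspondence problem can be shown with a short Python sketch. It assumes attribute values really are unordered bags, modelled with `collections.Counter`; the attribute name `S` stands in for S# because `#` is not legal in a Python keyword argument, and the helper names are hypothetical. Two different relations end up with identical per-attribute bags, so the bags alone cannot say which QTY goes with which supplier.

```python
from collections import Counter

def tup(**attrs):
    """A tuple as an unordered set of (name, value) pairs."""
    return frozenset(attrs.items())

def attribute_bags(rel):
    """Per-attribute bag of values, ignoring which tuple each value came from."""
    bags = {}
    for t in rel:
        for name, value in t:
            bags.setdefault(name, Counter())[value] += 1
    return bags

a = frozenset({tup(S="S1", QTY=50), tup(S="S2", QTY=100)})
b = frozenset({tup(S="S1", QTY=100), tup(S="S2", QTY=50)})

relations_differ = a != b
bags_identical = attribute_bags(a) == attribute_bags(b)
```

Any attribute-wise literal therefore needs something extra, such as aligned positions, to carry the pairing.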

Alas that hadn't struck me before I read this topic in the TTM Forum. RAQUEL currently only permits literal relation values to be expressed via tuples. (I think it corresponds to Erwin's contribution #22 on 13 April 2021).

Looking back, I wonder why I only expressed relation values via tuple values. Probably because that's what everybody else did. Why do we have this psychological preference? Is it because literal attribute values are trickier to express than literal tuple values? Literal attribute values are bags of values - if the attribute is a key, then the bag is constrained to contain no duplicate values. Also attribute values must be 'glued together' appropriately to form a relation value. (See Vadim's contribution #28 on 14 April 2021).

I don't understand what you might mean by 'literal attribute values'; so far, yes, that seems a lot more mysterious.

My psychological preference for tuples follows naturally from predicates in natural language: Supplier S1 supplies Part P5 in quantities of 50.

What's the natural language expression for 'literal attribute values'? (Whatever you mean by that.) ?? Quantities 50, 100, 73 belong with ??

Quote from dandl on July 1, 2021, 5:23 am

I agree. It's seductive, the idea of literal construction via tuple literals, but it's misleading.

What continues to perplex me is why you think a syntax for literals that merely avoids a token TUPLE, or avoids attribute name-value pairs (as below) is not "via tuples".

Andl has a relation literal like this: rel{X:int,Y:int}((1,4),(2,5),(3,6))

So you're expressing tuples with a different syntax in contrast to Tutorial D. Since TTM "does not prescribe syntax", why/how do you think you're avoiding tuples? If I want to talk about the value denoted by (2, 5), or its type, what do I call that? If that rel ... expression represents a relation value of cardinality three, that's three what?

I can see nothing about positional notation that's an improvement on tuples. It's just too easy to mess up the positioning. And since in this example your attributes are of same type, it's impossible for the compiler to help you out through type-checking.
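The positional risk is easy to reproduce in a Python sketch (hypothetical helpers, same assumed set-of-tuples model). With both attributes of type int, a transposed row goes through unnoticed and silently yields a different relation, whereas name-value pairs written in any order still denote the intended one.

```python
def rel_positional(heading, rows):
    """Positional literal: the i-th value of each row belongs to the i-th attribute."""
    return frozenset(frozenset(zip(heading, row)) for row in rows)

def rel_named(rows):
    """Named literal: every value is written next to its attribute name."""
    return frozenset(frozenset(row.items()) for row in rows)

intended = rel_positional(("X", "Y"), [(1, 4), (2, 5), (3, 6)])
# Second row transposed by mistake; both values are ints, so no type error arises.
slipped = rel_positional(("X", "Y"), [(1, 4), (5, 2), (3, 6)])
named = rel_named([{"X": 1, "Y": 4}, {"Y": 5, "X": 2}, {"X": 3, "Y": 6}])
```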

But it could be: rel{X:int,Y:int}{X:(1,2,3), Y:(4,5,6)}

That looks even more at risk of messing up.

Anyhoo, I'm not seeing there's much need for expressing relation literals in a D. Most of the relational operators take relations and/or attribute names as operands, and return relation values as result. Why/how is there much of a need to express tuples? Nevertheless, thinking in terms of tuples is intrinsic to understanding those operations. What are you saving by banning me from talking about tuples?

Quote from AntC on July 1, 2021, 7:19 am
Quote from dandl on July 1, 2021, 5:23 am

I agree. It's seductive, the idea of literal construction via tuple literals, but it's misleading.

What continues to perplex me is why you think a syntax for literals that merely avoids a token TUPLE, or avoids attribute name-value pairs (as below) is not "via tuples".

Yes.

Misunderstanding syntax vs semantics -- particularly that excising the former can leave the latter wholly intact -- is very common, and seems to be a problem in this thread.

I'm not clear if the intent is to remove the TUPLE keyword from Tutorial D (sure, you could do that... I guess... but the semantics don't change) or any/all D languages; remove tuples from the TTM relational model (huh?!); or something else. In any case, what does it gain?

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

Let me make it perfectly clear. My proposal is to remove 'tuple' types from TTM (as a language feature of D -- see RM Pre 6).

That means no variables, operators, attributes, components or parameters of that type, and no nesting or unnesting.

The only other change required is RM Pre 10 b1. IMO the simple solution is to use the same wording as for scalars: "Every argument expression in every invocation of S shall be a literal." Too easy.

We're done. That's all there is. We lost nothing, we gained simplicity and purity. And better compliance with RM Pro 7.

Andl - A New Database Language - andl.org

So if you have a relation body of cardinality two, what's the category/data type that you have two of? How does that body differ from one with the same heading but cardinality three? How does a relvar whose body has cardinality two and degree four differ from a relation value's body with degree three, such that i) you can't assign the latter to the relvar; ii) you can't compare them for equality; iii) by "can't" I mean you typically get a static type error.

Tuple must still exist conceptually, but not as a required type per RM Pre 6.

By omitting tuples, I have only 2 sorts of values in RAQUEL, scalars and relvars.

How do you express a relation literal? Do you have a comma-list style? What do you call the doo-hickeys between the outer-level commas? What is it that distinguishes the relation literal for DEE vs DUM?

A literal may take whatever form the implementor decides.

The rest of this response is verbal jousting, which conceals the essential simplicity of my proposal.

Andl - A New Database Language - andl.org
Quote from AntC on July 1, 2021, 7:19 am
Quote from dandl on July 1, 2021, 5:23 am

I agree. It's seductive, the idea of literal construction via tuple literals, but it's misleading.

What continues to perplex me is why you think a syntax for literals that merely avoids a token TUPLE, or avoids attribute name-value pairs (as below) is not "via tuples".

Andl has a relation literal like this: rel{X:int,Y:int}((1,4),(2,5),(3,6))

So you're expressing tuples with a different syntax in contrast to Tutorial D. Since TTM "does not prescribe syntax", why/how do you think you're avoiding tuples? If I want to talk about the value denoted by (2, 5), or its type, what do I call that? If that rel ... expression represents a relation value of cardinality three, that's three what?

I can see nothing about positional notation that's an improvement on tuples. It's just too easy to mess up the positioning. And since in this example your attributes are of same type, it's impossible for the compiler to help you out through type-checking.

But it could be: rel{X:int,Y:int}{X:(1,2,3), Y:(4,5,6)}

That looks even more at risk of messing up.

I agree. Those were just in response.

Anyhoo, I'm not seeing there's much need for expressing relation literals in a D. Most of the relational operators take relations and/or attribute names as operands, and return relation values as result. Why/how is there much of a need to express tuples? Nevertheless, thinking in terms of tuples is intrinsic to understanding those operations. What are you saving by banning me from talking about tuples?

I agree. In practice there is no reason except for orthogonality: the scalar types require "2. Every value of type T shall be produced by some invocation of S.".

My preference is that the wording for relation and scalar types should be more similar than they are now.

Andl - A New Database Language - andl.org
Quote from dandl on July 1, 2021, 10:18 am

Let me make it perfectly clear. My proposal is to remove 'tuple' types from TTM (as a language feature of D -- see RM Pre 6).

That means no variables, operators, attributes, components or parameters of that type, and no nesting or unnesting.

The only other change required is RM Pre 10 b1. IMO the simple solution is to use the same wording as for scalars: "Every argument expression in every invocation of S shall be a literal." Too easy.

We're done. That's all there is. We lost nothing, we gained simplicity and purity. And better compliance with RM Pro 7.

It seems you lose some generality (if tuples are inside relations, why aren't they outside?) -- and practical use of a nice facility. Standalone tuples are useful.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

You'll have to provide an example if you want to make that claim. I know of none.

Tuple values have always been second class in TTM. They are defined in passing in Pre 9 and as a note to Pre 18, but they read to me almost as an afterthought, something added just to fill in some blanks. As far as I can tell, whatever you can do with a tuple you can do equally well with a relation of cardinality 1. But I may have missed something.

 

Andl - A New Database Language - andl.org
Quote from dandl on July 1, 2021, 1:44 pm

You'll have to provide an example if you want to make that claim. I know of none.

Tuple values have always been second class in TTM. They are defined in passing in Pre 9 and as a note to Pre 18, but they read to me almost as an afterthought, something added just to fill in some blanks. As far as I can tell, whatever you can do with a tuple you can do equally well with a relation of cardinality 1. But I may have missed something.

A relation of cardinality 1 is a wrapper around a tuple, by definition. So why not have just a tuple?
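The wrapper relationship can be sketched in Python as follows, again assuming a relation body is a set of tuples. The helper names are hypothetical; `tuple_from` is only in the spirit of Tutorial D's TUPLE FROM, not a rendering of it.

```python
def relation_of(t):
    """Wrap one tuple in a relation of cardinality 1."""
    return frozenset({t})

def tuple_from(rel):
    """Unwrap the single tuple of a cardinality-1 relation; anything else is an error."""
    if len(rel) != 1:
        raise ValueError("relation must have cardinality exactly 1")
    (only,) = rel
    return only

t = frozenset({("x", 1), ("y", 2), ("z", 3)})
wrapped = relation_of(t)
unwrapped = tuple_from(wrapped)
```

The cardinality check in `tuple_from` is the extra obligation a relation-only design pushes onto every caller that wants exactly one row back.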

It's sometimes convenient to be able to do this:

OPERATOR thwokSpiffler(xfkj Merfle, pgl Zorg) RETURNS TUPLE {x INT, y INT, z INT} ...

Is it necessary?

No, you could define a type with a possrep having three elements, but tuples have an ad-hoc convenience that defining a new type often doesn't.

Don't you sometimes find tuples convenient in C# and Python?

They're not necessary there either, but they're handy.
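For comparison, a Python version of the same convenience might look like the sketch below. The function body and parameter types are invented stand-ins, since the operator above elides them, and Python tuples are positional rather than attribute-named, so this is only a loose analogue of a TTM tuple.

```python
def thwok_spiffler(a: int, b: int) -> tuple[int, int, int]:
    """Return three related results without declaring a new type for them."""
    return a + b, a - b, a * b

result = thwok_spiffler(5, 3)
x, y, z = result          # ad-hoc unpacking at the call site
```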

Indeed, they make less sense in C# and Python than in a D, given a D has relations composed of tuples (and a heading). That makes tuples unavoidably semantically present, which means it's hard to argue a case that they should only appear wrapped in relations when they're obviously handy outside of relations too.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org