## The Forum for Discussion about The Third Manifesto and Related Matters

Forum breadcrumbs - You are here:

# Generalising Codd's operators; and a quiz

Quote from AntC on July 7, 2019, 11:37 am
Quote from AntC on July 7, 2019, 10:35 am
Quote from Dave Voorhis on July 7, 2019, 7:46 am
Quote from AntC on July 6, 2019, 11:36 pm
Quote from dandl on July 6, 2019, 5:10 pm
Quote from AntC on July 6, 2019, 6:52 am
Quote from dandl on July 6, 2019, 5:00 am
Quote from AntC on July 5, 2019, 11:30 am

...

Generalising Codd

Codd's original set of operators for the Relational Algebra are rather annoying: there's tiresome restrictions on the headings of operands; which makes them all partial; and means they lack nice algebraic properties.

In particular, `UNION, INTERSECT, MINUS` require their operands to have the same headings ('union-compatible') whereas `TIMES` requires its operands to have disjoint headings. As Appendix A points out, we can observe that `TIMES` is a special case (disjoint headings) of `JOIN`, and `INTERSECT` is also a special case (same headings); so we can dispense with both and generalise as `JOIN`, or `<AND>` as Appendix A uses.

...

Yes I've copied `InnerIntersection` from Tropashko (he calls it InnerJoin). His papers claim it's not associative, not distributive over `<OR>`, and in general has disappointing properties. I suspect he has his definitions wrong.

• ...

The common attributes approach works for `UNION` as `InnerUnion`. The surprise (to me) is that gives a dual to `JOIN` aka `<AND>`. And we can combine with the properties for `DUM` (as Hugh investigated), as the join-to-make-it-empty element to get

`x PROJECT_ON y =`df` x InnerUnion HEADING_OF(y);` // i.e. project x on the attributes in common with y

`x SAME_HEADING_AS y =`df` HEADING_OF(x) = HEADING_OF(y);`

`HEADING_OF(x) =`df` x JOIN DUM;`

Then we can represent headings as relation values and start doing set-algebra on headings.

I wonder if this has a real world application.

The obvious application is to implement RM VSS 6 generic relational operators. Currently in Tutorial D/ Rel there's no way to access something that denotes the heading of a relation or to do the "set-algebra on headings" I talked about. The only way to get a projection is by explicitly naming the wanted attributes as constants (or naming the unwanted attributes with `ALL BUT`).

We can't even in Rel define my `HEADING_OF( )` above, even though it uses only Tutorial D operators/values, because it's polymorphic.

Even without opening the Pandora's box of user-defined relational operators, if there were just a few of these Tropashko operators available in Rel, we could really start cooking. We'd need

• `HEADING_OF( )`;
• `InnerUnion`, giving my `PROJECT_ON` above;
• `PROJECT_AWAY` aka generic `REMOVE`: project away from the left operand the attributes in common with the right operand.

I'll give away `HEADING_OF(e)` because the longhand `(e) JOIN DUM` isn't much longer. I'll even give away `InnerUnion` providing I get both of

• `rx PROJECT_ON e`      //  project `rx` on attributes in common with `e` -- which might be anything from the whole heading to nothing in common
• `rx PROJECT_AWAY e`  // project `rx` on attributes not common with `e` -- which ditto
• in which `rx, e` are <relation exp>s, whose headings/types must be join compatible
• `e` need not be evaluated, indeed better if not: we only want to know its heading
• I can then express `r1 InnerUnion r2 = (r1 PROJECT_ON r2) UNION (r2 PROJECT_ON r1)`

Not really.

Quote from AntC on July 7, 2019, 11:37 am

OK good. And I like being able to use `ALL BUT`. I asked for a little more: can the `ATTRIBUTES_OF( )` return attributes that don't appear in the l.h. relation expression? If not, how to 'trim down' `ATTRIBUTES_OF(  )` appropriately? That's needed for example with the `IMAGE_IN( )` expression:

• `IMAGE_IN(r, t) = (r JOIN RELATION{ t }) {ALL BUT ATTRIBUTES_OF( t )}`  // `t` might have attributes not in `r`

BTW `ATTRIBUTES_OF( )` isn't documented in the Tut D approved variations, so how would I have known about it?

Re that last question re 'trim down', I can't go

• `IMAGE_IN(r, t) = (r JOIN RELATION{ t }) {ALL BUT ATTRIBUTES_OF( RELATION{ t } {ALL BUT ATTRIBUTES_OF( r )})}`

because `r` might have attributes not in `t`.

Sorry, I'm not really following what you mean by 'trim down'. ATTRIBUTES_OF(r) returns precisely the attribute list of r, so any trimming down has to be done in terms of some (relation or tuple) expression r.

As for how you would know about it... Take a look in the Rel manual.

Ah, yes, there isn't one. Indeed. I really need to do something about that.

At the time I announced the Rel release that included ATTRIBUTES_OF(...), I did have in mind writing a short article about it -- along with some other features that are otherwise undocumented except in the CHANGES.TXT file that accompanies each Rel release -- but the discussion subsequent to the announcement here quickly diverted into one about the seagull that appears on https://reldb.org1. Thus distracted, I forgot.

Anyway, I told you about it, so there's that.

--
1 A diversion that I heartily approve, by the way; nature/wildlife pictures and associated discussion are always good.

Quote from Dave Voorhis on July 7, 2019, 4:50 pm

As for how you would know about it... Take a look in the Rel manual.

Ah, yes, there isn't one. Indeed. I really need to do something about that.

At the time I announced the Rel release that included ATTRIBUTES_OF(...), I did have in mind writing a short article about it -- along with some other features that are otherwise undocumented except in the CHANGES.TXT file that accompanies each Rel release -- but the discussion subsequent to the announcement here quickly diverted into one about the seagull that appears on https://reldb.org1. Thus distracted, I forgot.

Yes, I remember that and yes, we've had this discussion before. Of course I would like a full manual triggered as context-sensitive help form within Rel, but I'd settle for a single page cheat sheet. How long can that take?

Andl - A New Database Language - andl.org
Quote from AntC on July 7, 2019, 11:00 am
• {{ r }}
• r1 {{ r1 and r2 }} union r2 {{ r1 and r2 }}
• r1 {{ r1 minus r2 }}

If I might be allowed unparliamentary language, there's a possibly-useful idea here, and a great deal of daftness. Why go using reserved words that are already relational operators to apply in exactly the opposite sense? (`union` of headings is the heading of `JOIN` of relations.) Furthermore there's already reserved words in Tutorial D to indicate the relative complement of a  heading: `ALL BUT`. The possibly-useful bit is to overload the trailing `{ }` syntax for projection.

I'll take that as a diplomatic suggestion for an alternative syntax. But (as per TTM) I am not proposing syntax, I am proposing that there be syntax (such as {{ }}) for specifying an attribute list by applying familiar operators (such as and or union) in a consistent way to previously defined attribute-bearing entities, to be resolved at compile time. You get to pick the specific syntax and operators to suit your taste.

• `r1 {ON r2}`           // project `r1` on attributes in common with `r2`; note this is not currently valid syntax, so no clashes
• `r1 {REMOVE r2}` // project `r1` on attributes not common with `r2` (I'd love to use `r1 {ALL BUT r2}`, but that would be syntactically ambiguous)
• in which `r2` is a relational expression, possibly itself using suffixed `{ }` for projection.

The first two are severely lacking in generality. The third is reasonable: write any relational/tuple expression you like, enclose it in {{ }} and it is resolved at compile time to a relational/tuple type (by means the compiler already has). Andl does this (but by different syntax).

Andl - A New Database Language - andl.org
Quote from Dave Voorhis on July 7, 2019, 4:50 pm
Quote from AntC on July 7, 2019, 11:37 am
Quote from AntC on July 7, 2019, 10:35 am
Quote from Dave Voorhis on July 7, 2019, 7:46 am
Quote from AntC on July 6, 2019, 11:36 pm
Quote from dandl on July 6, 2019, 5:10 pm
Quote from AntC on July 6, 2019, 6:52 am
Quote from dandl on July 6, 2019, 5:00 am
Quote from AntC on July 5, 2019, 11:30 am

...

Generalising Codd

Codd's original set of operators for the Relational Algebra are rather annoying: there's tiresome restrictions on the headings of operands; which makes them all partial; and means they lack nice algebraic properties.

In particular, `UNION, INTERSECT, MINUS` require their operands to have the same headings ('union-compatible') whereas `TIMES` requires its operands to have disjoint headings. As Appendix A points out, we can observe that `TIMES` is a special case (disjoint headings) of `JOIN`, and `INTERSECT` is also a special case (same headings); so we can dispense with both and generalise as `JOIN`, or `<AND>` as Appendix A uses.

...

Yes I've copied `InnerIntersection` from Tropashko (he calls it InnerJoin). His papers claim it's not associative, not distributive over `<OR>`, and in general has disappointing properties. I suspect he has his definitions wrong.

• ...

The common attributes approach works for `UNION` as `InnerUnion`. The surprise (to me) is that gives a dual to `JOIN` aka `<AND>`. And we can combine with the properties for `DUM` (as Hugh investigated), as the join-to-make-it-empty element to get

`x PROJECT_ON y =`df` x InnerUnion HEADING_OF(y);` // i.e. project x on the attributes in common with y

`x SAME_HEADING_AS y =`df` HEADING_OF(x) = HEADING_OF(y);`

`HEADING_OF(x) =`df` x JOIN DUM;`

Then we can represent headings as relation values and start doing set-algebra on headings.

I wonder if this has a real world application.

The obvious application is to implement RM VSS 6 generic relational operators. Currently in Tutorial D/ Rel there's no way to access something that denotes the heading of a relation or to do the "set-algebra on headings" I talked about. The only way to get a projection is by explicitly naming the wanted attributes as constants (or naming the unwanted attributes with `ALL BUT`).

We can't even in Rel define my `HEADING_OF( )` above, even though it uses only Tutorial D operators/values, because it's polymorphic.

Even without opening the Pandora's box of user-defined relational operators, if there were just a few of these Tropashko operators available in Rel, we could really start cooking. We'd need

• `HEADING_OF( )`;
• `InnerUnion`, giving my `PROJECT_ON` above;
• `PROJECT_AWAY` aka generic `REMOVE`: project away from the left operand the attributes in common with the right operand.

I'll give away `HEADING_OF(e)` because the longhand `(e) JOIN DUM` isn't much longer. I'll even give away `InnerUnion` providing I get both of

• `rx PROJECT_ON e`      //  project `rx` on attributes in common with `e` -- which might be anything from the whole heading to nothing in common
• `rx PROJECT_AWAY e`  // project `rx` on attributes not common with `e` -- which ditto
• in which `rx, e` are <relation exp>s, whose headings/types must be join compatible
• `e` need not be evaluated, indeed better if not: we only want to know its heading
• I can then express `r1 InnerUnion r2 = (r1 PROJECT_ON r2) UNION (r2 PROJECT_ON r1)`

Quote from AntC on July 7, 2019, 11:37 am

OK good. And I like being able to use `ALL BUT`. I asked for a little more: can the `ATTRIBUTES_OF( )` return attributes that don't appear in the l.h. relation expression? If not, how to 'trim down' `ATTRIBUTES_OF(  )` appropriately? That's needed for example with the `IMAGE_IN( )` expression:

• `IMAGE_IN(r, t) = (r JOIN RELATION{ t }) {ALL BUT ATTRIBUTES_OF( t )}`  // `t` might have attributes not in `r`

BTW `ATTRIBUTES_OF( )` isn't documented in the Tut D approved variations, so how would I have known about it?

Re that last question re 'trim down', I can't go

• `IMAGE_IN(r, t) = (r JOIN RELATION{ t }) {ALL BUT ATTRIBUTES_OF( RELATION{ t } {ALL BUT ATTRIBUTES_OF( r )})}`

because `r` might have attributes not in `t`.

Sorry, I'm not really following what you mean by 'trim down'.

I mean for example

`(S JOIN (P RENAME {PCITY := CITY})) {ATTRIBUTES_OF(SP)};`

• (The `RENAME` of `CITY` is merely to avoid a 'clash' of same-named attributes between `S, P`)
• `ATTRIBUTES_OF(SP)` includes attribute `QTY`, which doesn't appear in the `JOIN` result.
• Compare I can't AFAIK write `(S JOIN (P RENAME {PCITY := CITY}))``{S#, P#, QTY};`
• But I can write `(S JOIN (P RENAME {PCITY := CITY})) MATCHING SP;`

That `MATCHING` is what I want for generality: if the r.h. operand includes attributes that don't appear in the l.h. operand, just ignore/treat the project-away as a no-op.

ATTRIBUTES_OF(r) returns precisely the attribute list of r, so any trimming down has to be done in terms of some (relation or tuple) expression r.

My point being that in the absence of what I'm asking for, there's no way to 'trim away'/'trim off' attributes except by explicit attribute name constants. Then you can't write generic expressions.

... -- but the discussion subsequent to the announcement here quickly diverted into one about the seagull that appears on https://reldb.org1.

--
1 A diversion that I heartily approve, by the way; nature/wildlife pictures and associated discussion are always good.

No disputing that. I was accompanied by piwakawaka (NZ Fantails) for most of my tramp yesterday: they love trampers coming through the forest stirring up the insects for them to catch; and they're so cute flitting around you spreading their fantails to abruptly turn and snap up on the wing. (Similar to UK wagtails, but much more aerobatic.)

Quote from dandl on July 6, 2019, 5:10 pm
Quote from AntC on July 6, 2019, 6:52 am
Quote from dandl on July 6, 2019, 5:00 am
Quote from AntC on July 5, 2019, 11:30 am

...

Generalising Codd

Codd's original set of operators for the Relational Algebra are rather annoying: there's tiresome restrictions on the headings of operands; which makes them all partial; and means they lack nice algebraic properties.

In particular, `UNION, INTERSECT, MINUS` require their operands to have the same headings ('union-compatible') whereas `TIMES` requires its operands to have disjoint headings. As Appendix A points out, we can observe that `TIMES` is a special case (disjoint headings) of `JOIN`, and `INTERSECT` is also a special case (same headings); so we can dispense with both and generalise as `JOIN`, or `<AND>` as Appendix A uses.

...

Yes I've copied `InnerIntersection` from Tropashko (he calls it InnerJoin). His papers claim it's not associative, not distributive over `<OR>`, and in general has disappointing properties. I suspect he has his definitions wrong.

It's an interesting question, but I'm surprised it causes such difficulty. I've always been of the view that, despite Codd's restrictive definitions, all the set-like operations (UNION, INTERSECT, MINUS, etc) generalise simply:

• project each argument onto the set of common attributes
• apply Codd-like operation.

`MINUS` (with union-compatible operands) is not commutative nor idempotent nor associative. It has neither left nor right identity. No amount of generalising will give it nice algebraic properties; and projecting each argument on to the set of common attributes won't improve the situation. So I'm happy to generalise to `NOT MATCHING` (whose result has heading same as its left operand, and which has right identity `DUM`) and leave it at that.

The common attributes approach works for `UNION` as `InnerUnion`. The surprise (to me) is that gives a dual to `JOIN` aka `<AND>`. And we can combine with the properties for `DUM` (as Hugh investigated), as the join-to-make-it-empty element to get

`x PROJECT_ON y =`df` x InnerUnion HEADING_OF(y);` // i.e. project x on the attributes in common with y

`x SAME_HEADING_AS y =`df` HEADING_OF(x) = HEADING_OF(y);`

`HEADING_OF(x) =`df` x JOIN DUM;`

Then we can represent headings as relation values and start doing set-algebra on headings.

That certainly looks associative, and I can see no way to construct a counter-example. Does that not suggest a method for a proof?

For `INTERSECT` project on to common attributes, looks can be deceptive it seems. And Tegiri has given a counter-example. So no wonder there's no proof. More of a concern is that Mace4 can't find a counter-example (perhaps I didn't run it for long enough, but it should only need a 6-element model; small beer by its lights).

Indeed. Informally, intersection before projection can remove tuples that would survive if projection goes first. Seems obvious given the counter-example.

I wonder if this has a real world application.

More formally we could say that associativity of `InnerUnion` is not valid because it would need push-project-through-join (or lift-project-outside-join), which is not valid where the projected-away attributes include those in common between the operands of the join. To express that we need a way to formulate 'projected-away attributes' as smoothly as 'projected-on attributes'.

Now Appendix A does that with its `<REMOVE>`, but smooth it isn't. There's a great deal of weasel words and hand-waving (like the talk of scare-quotes "macro" operator).

Then one real world application is: to get relational expressions more like algebra; so that the programmer can reason about their code, and chuck it at theorem provers to verify their intuitions; and so that optimisers can rearrange the code being confident they are not corrupting its semantics. In fact how Functional Programmers tend to operate anyway.

Quote from AntC on July 8, 2019, 3:53 am
Quote from dandl on July 6, 2019, 5:10 pm

Indeed. Informally, intersection before projection can remove tuples that would survive if projection goes first. Seems obvious given the counter-example.

I wonder if this has a real world application.

More formally we could say that associativity of `InnerUnion` is not valid because it would need push-project-through-join (or lift-project-outside-join), which is not valid where the projected-away attributes include those in common between the operands of the join. To express that we need a way to formulate 'projected-away attributes' as smoothly as 'projected-on attributes'.

You said that very nicely. I wonder why you didn't see it that way before being provided with that counter example. No matter.

Now Appendix A does that with its `<REMOVE>`, but smooth it isn't. There's a great deal of weasel words and hand-waving (like the talk of scare-quotes "macro" operator).

Then one real world application is: to get relational expressions more like algebra; so that the programmer can reason about their code, and chuck it at theorem provers to verify their intuitions; and so that optimisers can rearrange the code being confident they are not corrupting its semantics. In fact how Functional Programmers tend to operate anyway.

A worthy goal. I don't find RA expressions hard to reason about, as long as the syntax is natural and flows nicely. But all I'm trying to do is find some reasonable path to get from point A to point B. It's way harder to refactor code, or choose the best query plan. For that you really need rock solid underlying theory. You need a set of provable rewrite rules. Is that your aim? Do you expect to achieve it?

Andl - A New Database Language - andl.org
Quote from dandl on July 8, 2019, 12:38 am
Quote from Dave Voorhis on July 7, 2019, 4:50 pm

As for how you would know about it... Take a look in the Rel manual.

Ah, yes, there isn't one. Indeed. I really need to do something about that.

At the time I announced the Rel release that included ATTRIBUTES_OF(...), I did have in mind writing a short article about it -- along with some other features that are otherwise undocumented except in the CHANGES.TXT file that accompanies each Rel release -- but the discussion subsequent to the announcement here quickly diverted into one about the seagull that appears on https://reldb.org1. Thus distracted, I forgot.

Yes, I remember that and yes, we've had this discussion before. Of course I would like a full manual triggered as context-sensitive help form within Rel, but I'd settle for a single page cheat sheet. How long can that take?

The last time we had this discussion, I made one in -- literally -- a few minutes. See https://reldb.org/c/wp-content/uploads/2017/12/Rel-and-Tutorial-D-Quickstart.pdf

Quote from AntC on July 8, 2019, 3:39 am
Quote from Dave Voorhis on July 7, 2019, 4:50 pm
Quote from AntC on July 7, 2019, 11:37 am
Quote from AntC on July 7, 2019, 10:35 am
Quote from Dave Voorhis on July 7, 2019, 7:46 am
Quote from AntC on July 6, 2019, 11:36 pm
Quote from dandl on July 6, 2019, 5:10 pm
Quote from AntC on July 6, 2019, 6:52 am
Quote from dandl on July 6, 2019, 5:00 am
Quote from AntC on July 5, 2019, 11:30 am

...

Generalising Codd

Codd's original set of operators for the Relational Algebra are rather annoying: there's tiresome restrictions on the headings of operands; which makes them all partial; and means they lack nice algebraic properties.

In particular, `UNION, INTERSECT, MINUS` require their operands to have the same headings ('union-compatible') whereas `TIMES` requires its operands to have disjoint headings. As Appendix A points out, we can observe that `TIMES` is a special case (disjoint headings) of `JOIN`, and `INTERSECT` is also a special case (same headings); so we can dispense with both and generalise as `JOIN`, or `<AND>` as Appendix A uses.

...

Yes I've copied `InnerIntersection` from Tropashko (he calls it InnerJoin). His papers claim it's not associative, not distributive over `<OR>`, and in general has disappointing properties. I suspect he has his definitions wrong.

• ...

The common attributes approach works for `UNION` as `InnerUnion`. The surprise (to me) is that gives a dual to `JOIN` aka `<AND>`. And we can combine with the properties for `DUM` (as Hugh investigated), as the join-to-make-it-empty element to get

`x PROJECT_ON y =`df` x InnerUnion HEADING_OF(y);` // i.e. project x on the attributes in common with y

`x SAME_HEADING_AS y =`df` HEADING_OF(x) = HEADING_OF(y);`

`HEADING_OF(x) =`df` x JOIN DUM;`

Then we can represent headings as relation values and start doing set-algebra on headings.

I wonder if this has a real world application.

The obvious application is to implement RM VSS 6 generic relational operators. Currently in Tutorial D/ Rel there's no way to access something that denotes the heading of a relation or to do the "set-algebra on headings" I talked about. The only way to get a projection is by explicitly naming the wanted attributes as constants (or naming the unwanted attributes with `ALL BUT`).

We can't even in Rel define my `HEADING_OF( )` above, even though it uses only Tutorial D operators/values, because it's polymorphic.

Even without opening the Pandora's box of user-defined relational operators, if there were just a few of these Tropashko operators available in Rel, we could really start cooking. We'd need

• `HEADING_OF( )`;
• `InnerUnion`, giving my `PROJECT_ON` above;
• `PROJECT_AWAY` aka generic `REMOVE`: project away from the left operand the attributes in common with the right operand.

I'll give away `HEADING_OF(e)` because the longhand `(e) JOIN DUM` isn't much longer. I'll even give away `InnerUnion` providing I get both of

• `rx PROJECT_ON e`      //  project `rx` on attributes in common with `e` -- which might be anything from the whole heading to nothing in common
• `rx PROJECT_AWAY e`  // project `rx` on attributes not common with `e` -- which ditto
• in which `rx, e` are <relation exp>s, whose headings/types must be join compatible
• `e` need not be evaluated, indeed better if not: we only want to know its heading
• I can then express `r1 InnerUnion r2 = (r1 PROJECT_ON r2) UNION (r2 PROJECT_ON r1)`

Quote from AntC on July 7, 2019, 11:37 am

OK good. And I like being able to use `ALL BUT`. I asked for a little more: can the `ATTRIBUTES_OF( )` return attributes that don't appear in the l.h. relation expression? If not, how to 'trim down' `ATTRIBUTES_OF(  )` appropriately? That's needed for example with the `IMAGE_IN( )` expression:

• `IMAGE_IN(r, t) = (r JOIN RELATION{ t }) {ALL BUT ATTRIBUTES_OF( t )}`  // `t` might have attributes not in `r`

BTW `ATTRIBUTES_OF( )` isn't documented in the Tut D approved variations, so how would I have known about it?

Re that last question re 'trim down', I can't go

• `IMAGE_IN(r, t) = (r JOIN RELATION{ t }) {ALL BUT ATTRIBUTES_OF( RELATION{ t } {ALL BUT ATTRIBUTES_OF( r )})}`

because `r` might have attributes not in `t`.

Sorry, I'm not really following what you mean by 'trim down'.

I mean for example

`(S JOIN (P RENAME {PCITY := CITY})) {ATTRIBUTES_OF(SP)};`

• (The `RENAME` of `CITY` is merely to avoid a 'clash' of same-named attributes between `S, P`)
• `ATTRIBUTES_OF(SP)` includes attribute `QTY`, which doesn't appear in the `JOIN` result.
• Compare I can't AFAIK write `(S JOIN (P RENAME {PCITY := CITY}))``{S#, P#, QTY};`
• But I can write `(S JOIN (P RENAME {PCITY := CITY})) MATCHING SP;`

That `MATCHING` is what I want for generality: if the r.h. operand includes attributes that don't appear in the l.h. operand, just ignore/treat the project-away as a no-op.

ATTRIBUTES_OF(r) returns precisely the attribute list of r, so any trimming down has to be done in terms of some (relation or tuple) expression r.

My point being that in the absence of what I'm asking for, there's no way to 'trim away'/'trim off' attributes except by explicit attribute name constants. Then you can't write generic expressions.

Ah, got it!

How about ATTRIBUTES_OF(r, p), which returns the list of attributes in r found in (relation or tuple) expression p?

Quote from Dave Voorhis on July 8, 2019, 8:29 am
Quote from AntC on July 8, 2019, 3:39 am
Quote from Dave Voorhis on July 7, 2019, 4:50 pm
Quote from AntC on July 7, 2019, 11:37 am
Quote from AntC on July 7, 2019, 10:35 am
Quote from Dave Voorhis on July 7, 2019, 7:46 am
Quote from AntC on July 6, 2019, 11:36 pm
Quote from dandl on July 6, 2019, 5:10 pm
Quote from AntC on July 6, 2019, 6:52 am
Quote from dandl on July 6, 2019, 5:00 am
Quote from AntC on July 5, 2019, 11:30 am

...

Generalising Codd

Codd's original set of operators for the Relational Algebra are rather annoying: there's tiresome restrictions on the headings of operands; which makes them all partial; and means they lack nice algebraic properties.

In particular, `UNION, INTERSECT, MINUS` require their operands to have the same headings ('union-compatible') whereas `TIMES` requires its operands to have disjoint headings. As Appendix A points out, we can observe that `TIMES` is a special case (disjoint headings) of `JOIN`, and `INTERSECT` is also a special case (same headings); so we can dispense with both and generalise as `JOIN`, or `<AND>` as Appendix A uses.

...

Yes I've copied `InnerIntersection` from Tropashko (he calls it InnerJoin). His papers claim it's not associative, not distributive over `<OR>`, and in general has disappointing properties. I suspect he has his definitions wrong.

• ...

The common attributes approach works for `UNION` as `InnerUnion`. The surprise (to me) is that gives a dual to `JOIN` aka `<AND>`. And we can combine with the properties for `DUM` (as Hugh investigated), as the join-to-make-it-empty element to get

`x PROJECT_ON y =`df` x InnerUnion HEADING_OF(y);` // i.e. project x on the attributes in common with y

`x SAME_HEADING_AS y =`df` HEADING_OF(x) = HEADING_OF(y);`

`HEADING_OF(x) =`df` x JOIN DUM;`

Then we can represent headings as relation values and start doing set-algebra on headings.

I wonder if this has a real world application.

The obvious application is to implement RM VSS 6 generic relational operators. Currently in Tutorial D/ Rel there's no way to access something that denotes the heading of a relation or to do the "set-algebra on headings" I talked about. The only way to get a projection is by explicitly naming the wanted attributes as constants (or naming the unwanted attributes with `ALL BUT`).

We can't even in Rel define my `HEADING_OF( )` above, even though it uses only Tutorial D operators/values, because it's polymorphic.

Even without opening the Pandora's box of user-defined relational operators, if there were just a few of these Tropashko operators available in Rel, we could really start cooking. We'd need

• `HEADING_OF( )`;
• `InnerUnion`, giving my `PROJECT_ON` above;
• `PROJECT_AWAY` aka generic `REMOVE`: project away from the left operand the attributes in common with the right operand.

I'll give away `HEADING_OF(e)` because the longhand `(e) JOIN DUM` isn't much longer. I'll even give away `InnerUnion` providing I get both of

• `rx PROJECT_ON e`      //  project `rx` on attributes in common with `e` -- which might be anything from the whole heading to nothing in common
• `rx PROJECT_AWAY e`  // project `rx` on attributes not common with `e` -- which ditto
• in which `rx, e` are <relation exp>s, whose headings/types must be join compatible
• `e` need not be evaluated, indeed better if not: we only want to know its heading
• I can then express `r1 InnerUnion r2 = (r1 PROJECT_ON r2) UNION (r2 PROJECT_ON r1)`

Quote from AntC on July 7, 2019, 11:37 am

OK good. And I like being able to use `ALL BUT`. I asked for a little more: can the `ATTRIBUTES_OF( )` return attributes that don't appear in the l.h. relation expression? If not, how to 'trim down' `ATTRIBUTES_OF(  )` appropriately? That's needed for example with the `IMAGE_IN( )` expression:

• `IMAGE_IN(r, t) = (r JOIN RELATION{ t }) {ALL BUT ATTRIBUTES_OF( t )}`  // `t` might have attributes not in `r`

BTW `ATTRIBUTES_OF( )` isn't documented in the Tut D approved variations, so how would I have known about it?

Re that last question re 'trim down', I can't go

• `IMAGE_IN(r, t) = (r JOIN RELATION{ t }) {ALL BUT ATTRIBUTES_OF( RELATION{ t } {ALL BUT ATTRIBUTES_OF( r )})}`

because `r` might have attributes not in `t`.

Sorry, I'm not really following what you mean by 'trim down'.

I mean for example

`(S JOIN (P RENAME {PCITY := CITY})) {ATTRIBUTES_OF(SP)};`

• (The `RENAME` of `CITY` is merely to avoid a 'clash' of same-named attributes between `S, P`)
• `ATTRIBUTES_OF(SP)` includes attribute `QTY`, which doesn't appear in the `JOIN` result.
• Compare I can't AFAIK write `(S JOIN (P RENAME {PCITY := CITY}))``{S#, P#, QTY};`
• But I can write `(S JOIN (P RENAME {PCITY := CITY})) MATCHING SP;`

That `MATCHING` is what I want for generality: if the r.h. operand includes attributes that don't appear in the l.h. operand, just ignore/treat the project-away as a no-op.

ATTRIBUTES_OF(r) returns precisely the attribute list of r, so any trimming down has to be done in terms of some (relation or tuple) expression r.

My point being that in the absence of what I'm asking for, there's no way to 'trim away'/'trim off' attributes except by explicit attribute name constants. Then you can't write generic expressions.

Ah, got it!

How about ATTRIBUTES_OF(r, p), which returns the list of attributes in r found in (relation or tuple) expression p?

Then I'd be writing one of

• `(S JOIN (P RENAME {PCITY := CITY})) {ATTRIBUTES_OF(SP, (S JOIN (P RENAME {PCITY := CITY})))};`
• `(S JOIN (P RENAME {PCITY := CITY})) {ATTRIBUTES_OF((S JOIN (P RENAME {PCITY := CITY})), SP)};`

Fits my definition of "excessive circumlocution". Whereas if you're using `ATTRIBUTES_OF( )` within `{[ALL BUT] ...}` for a project/remove, you're necessarily limited to the attributes in the outer/l.h. operand. I guess there could be a shorthand along the lines of `TUPLE{*}`:

• `(S JOIN (P RENAME {PCITY := CITY})) {ATTRIBUTES_OF(SP, *)};`

Even that seems excessive: why would I put anything other than `*`? If `p`'s attributes are a subset of `r`'s, I could just go

• `ATTRIBUTES_OF( r{ ATTRIBUTES_OF( p ) } )`

Quote from AntC on July 8, 2019, 9:18 am
Quote from Dave Voorhis on July 8, 2019, 8:29 am

How about ATTRIBUTES_OF(r, p), which returns the list of attributes in r found in (relation or tuple) expression p?

Then I'd be writing one of

• `(S JOIN (P RENAME {PCITY := CITY})) {ATTRIBUTES_OF(SP, (S JOIN (P RENAME {PCITY := CITY})))};`
• `(S JOIN (P RENAME {PCITY := CITY})) {ATTRIBUTES_OF((S JOIN (P RENAME {PCITY := CITY})), SP)};`

Fits my definition of "excessive circumlocution". Whereas if you're using `ATTRIBUTES_OF( )` within `{[ALL BUT] ...}` for a project/remove, you're necessarily limited to the attributes in the outer/l.h. operand. I guess there could be a shorthand along the lines of `TUPLE{*}`:

• `(S JOIN (P RENAME {PCITY := CITY})) {ATTRIBUTES_OF(SP, *)};`

Even that seems excessive: why would I put anything other than `*`? If `p`'s attributes are a subset of `r`'s, I could just go

• `ATTRIBUTES_OF( r{ ATTRIBUTES_OF( p ) } )`

Could you use WITH to eliminate duplicated subexpressions?

I was thinking of ATTRIBUTES_OF(rp) being available in addition to ATTRIBUTES_OF(r).