# ANNOUNCE: a relational algebra with only one primitive operator (tentative) (please check/critique)

Quote from AntC on May 13, 2021, 11:46 pmThis definition for

`NotMatching`

is defective:

`r1 NatJoin s = s & s NatJoin (r1 NotMatching r2) = (r1 NotMatching r2) ≡ (r1 NatJoin r2) InnerUnion s = r1`

.- In the case where
`r1, r2`

are the same value, the rhs of the`≡`

becomes`(r1 NatJoin r1) InnerUnion s = r1`

, reduces to`r1 InnerUnion s = r1`

(Idempotence of`NatJoin`

).- The lhs of
`≡`

wants`NotMatching`

to return the minimal value that satisfies rhs. That'll be lattice bottom. For every possible`r1`

.- So my chief purpose for using
`NotMatching`

was to obtain`emptify(r) = r NotMatching r`

as a way to talk about headings: all`r2`

s with`emptify(r2) = emptify(r)`

have the same heading. But with that definition for`emptify( )`

, all relations of whatever heading return the same (bottom) value.- It ain't going to work. And that's why I wasn't making progress with defining
`DUM`

:`DEE NotMatching DEE`

still returns bottom.

This definition for `NotMatching`

is defective:

`r1 NatJoin s = s & s NatJoin (r1 NotMatching r2) = (r1 NotMatching r2) ≡ (r1 NatJoin r2) InnerUnion s = r1`

.- In the case where
`r1, r2`

are the same value, the rhs of the`≡`

becomes`(r1 NatJoin r1) InnerUnion s = r1`

, reduces to`r1 InnerUnion s = r1`

(Idempotence of`NatJoin`

). - The lhs of
`≡`

wants`NotMatching`

to return the minimal value that satisfies rhs. That'll be lattice bottom. For every possible`r1`

. - So my chief purpose for using
`NotMatching`

was to obtain`emptify(r) = r NotMatching r`

as a way to talk about headings: all`r2`

s with`emptify(r2) = emptify(r)`

have the same heading. But with that definition for`emptify( )`

, all relations of whatever heading return the same (bottom) value. - It ain't going to work. And that's why I wasn't making progress with defining
`DUM`

:`DEE NotMatching DEE`

still returns bottom.

Quote from AntC on May 14, 2021, 12:36 amQuote from Vadim on May 13, 2021, 4:27 pmQuote from AntC on May 13, 2021, 11:53 amQuote from Vadim on May 12, 2021, 5:28 pmConsider a database with all relations having the same set of attributes. Then, it is a Boolean Algebra.

Since all the relation values with the same heading form a Boolean Algebra (i.e. a distributive, complemented sub-lattice within the bigger structure), we can take any element

`r1`

within it: the sub-sub-lattice from that element (as sub-sub-lattice top) 'down' to the empty relation with that heading is also a Boolean Algebra. That empty relation value is given by`r1 NotMatching r1`

.Sub-Sub-Lattice complement of

`r2`

coincides with`r1 NotMatching r2`

. IOW if`r2 NatJoin s = (r1 NotMatching r1) & r2 InnerUnion s = r1`

then`s = r1 NotMatching r2`

. I've added that as an axiom. Also added the distributivity within the sub-sub-lattice from`r1`

. That still doesn't fix enough, without the 'ugly' axiom.The simplest possible counter example is the database consisting of only two relations: DEE and DUM. Yes, this example is far fetched from database practice, but counterexamples can be trivial/silly.

The lattice structure containing only

`DEE, DUM`

is a perfectly fine structure of relation values. And it is a (degenerate) Boolean Algebra. And my proposed definitions work very smoothly. (`DUM`

coincides with lattice bottom.)The other [rather artificial] example is the database with all the empty relations.

No that's not a valid lattice structure to model relations: we must always have

`DEE`

, no matter how many empty relations there are. We can always say the sub-lattice from`DUM`

down to lattice bottom must be a Boolean algebra (as the powerset of all possible attribute-type pairs, formed into headings). That is, if we can define`DUM`

.(As an aside: I could see that a structure containing only

`DEE`

and empty relations would be problematic to model. Furthermore a structure containing only singleton and empty relations would be problematic -- that is, all attribute types containing only a single possible value. So if I could model a structure in which for each non-empty heading there's at least two possible relation values, I'd be happy enough. I'd even (reluctantly) give away the possibility of a model with only`DEE, DUM`

if I could get the rest to work.)Therefore, we are in the realm of Boolean Algebra. There, reduction to a single operation is well known (NAND, NOR), which is not consistent with your presentation.

Aside from the trivial

`DEE, DUM`

structure, no required structures are Boolean Algebras alone. So thank you for the proof that`NAND, NOR`

alone are not adequate.In general, a reduction of the operation set in Universal Algebra to some basic set is understood as representing each derived operation via

composition of basic operatons.Please give a citation for that claim. I think you just made it up. Appendix A (tries to) proceed in that way. But actually fails: for example it does not reduce

`RENAME`

to other basic operations -- the other operations still need`RENAME`

or some further mechanism (not specified), as Philip Kelly pointed out many moons ago -- he proposed a mechanism 'equi-relations'.In other words, you are not allowed to use the First Order Logic sentenses and specify a constraint on the derived operation this way; you have to exhibit an equality with the derived operation alone on the left hand side.

No. You are just making it up. I am not trying to produce an 'executable' algebra. I'm aiming for a 'solvable' algebra -- in the sense a theorem prover (or human) can infer if two queries are equivalent (will produce the same result over all possible input values, providing initial assumptions hold -- the algebra must be powerful enough to express those assumptions).

I'm not obsessing about a 'party game' of minimising the count of primitives. If we have many primitives, we still have to provide equations to define how they interact, so we can manipulate query equations to prove equivalences. This is just schoolroom solving of 'simultaneous equations'. In conventional lattice treatments, the interaction of

`meet, join`

is defined via the absorption laws. I merely equivalently defined`InnerUnion`

(lattice`join`

) in terms of`NatJoin`

(lattice`meet`

), to demonstrate the technique before using it to 'fix'`NotMatching`

-- or so I hoped. From my definition of`InnerUnion`

you can mechanically derive the absorption laws, and all the properties of`InnerUnion`

.For example I can give a quadratic equation as

`ax`

^{2}`+ bx + c = 0`

, no "derived operation alone on the left hand side" there (or on either side), and from that derive the solution for`x`

via equational reasoning.

Quote from Vadim on May 13, 2021, 4:27 pmQuote from AntC on May 13, 2021, 11:53 amQuote from Vadim on May 12, 2021, 5:28 pmConsider a database with all relations having the same set of attributes. Then, it is a Boolean Algebra.

Since all the relation values with the same heading form a Boolean Algebra (i.e. a distributive, complemented sub-lattice within the bigger structure), we can take any element

`r1`

within it: the sub-sub-lattice from that element (as sub-sub-lattice top) 'down' to the empty relation with that heading is also a Boolean Algebra. That empty relation value is given by`r1 NotMatching r1`

.Sub-Sub-Lattice complement of

`r2`

coincides with`r1 NotMatching r2`

. IOW if`r2 NatJoin s = (r1 NotMatching r1) & r2 InnerUnion s = r1`

then`s = r1 NotMatching r2`

. I've added that as an axiom. Also added the distributivity within the sub-sub-lattice from`r1`

. That still doesn't fix enough, without the 'ugly' axiom.The simplest possible counter example is the database consisting of only two relations: DEE and DUM. Yes, this example is far fetched from database practice, but counterexamples can be trivial/silly.

The lattice structure containing only `DEE, DUM`

is a perfectly fine structure of relation values. And it is a (degenerate) Boolean Algebra. And my proposed definitions work very smoothly. (`DUM`

coincides with lattice bottom.)

The other [rather artificial] example is the database with all the empty relations.

No that's not a valid lattice structure to model relations: we must always have `DEE`

, no matter how many empty relations there are. We can always say the sub-lattice from `DUM`

down to lattice bottom must be a Boolean algebra (as the powerset of all possible attribute-type pairs, formed into headings). That is, if we can define `DUM`

.

(As an aside: I could see that a structure containing only `DEE`

and empty relations would be problematic to model. Furthermore a structure containing only singleton and empty relations would be problematic -- that is, all attribute types containing only a single possible value. So if I could model a structure in which for each non-empty heading there's at least two possible relation values, I'd be happy enough. I'd even (reluctantly) give away the possibility of a model with only `DEE, DUM`

if I could get the rest to work.)

Therefore, we are in the realm of Boolean Algebra. There, reduction to a single operation is well known (NAND, NOR), which is not consistent with your presentation.

Aside from the trivial `DEE, DUM`

structure, no required structures are Boolean Algebras alone. So thank you for the proof that `NAND, NOR`

alone are not adequate.

In general, a reduction of the operation set in Universal Algebra to some basic set is understood as representing each derived operation via

composition of basic operatons.

Please give a citation for that claim. I think you just made it up. Appendix A (tries to) proceed in that way. But actually fails: for example it does not reduce `RENAME`

to other basic operations -- the other operations still need `RENAME`

or some further mechanism (not specified), as Philip Kelly pointed out many moons ago -- he proposed a mechanism 'equi-relations'.

In other words, you are not allowed to use the First Order Logic sentenses and specify a constraint on the derived operation this way; you have to exhibit an equality with the derived operation alone on the left hand side.

No. You are just making it up. I am not trying to produce an 'executable' algebra. I'm aiming for a 'solvable' algebra -- in the sense a theorem prover (or human) can infer if two queries are equivalent (will produce the same result over all possible input values, providing initial assumptions hold -- the algebra must be powerful enough to express those assumptions).

I'm not obsessing about a 'party game' of minimising the count of primitives. If we have many primitives, we still have to provide equations to define how they interact, so we can manipulate query equations to prove equivalences. This is just schoolroom solving of 'simultaneous equations'. In conventional lattice treatments, the interaction of `meet, join`

is defined via the absorption laws. I merely equivalently defined `InnerUnion`

(lattice `join`

) in terms of `NatJoin`

(lattice `meet`

), to demonstrate the technique before using it to 'fix' `NotMatching`

-- or so I hoped. From my definition of `InnerUnion`

you can mechanically derive the absorption laws, and all the properties of `InnerUnion`

.

For example I can give a quadratic equation as `ax`

^{2}` + bx + c = 0`

, no "derived operation alone on the left hand side" there (or on either side), and from that derive the solution for `x`

via equational reasoning.

Quote from AntC on May 14, 2021, 3:42 amQuote from dandl on May 12, 2021, 1:41 pm...No, I really don't follow. I don't know whether it's the lattices or the toolset but you lost me here. As I see it REMOVE is just AntiJoin.

No AntiJoin (by which I take it you mean AntiSemiJoin) returns a relation with heading same as its left operand. It doesn't remove (nor add) any attributes.

Easy if you start with negation.

I'm not aiming for 'easy'; I'm aiming for expressively complete. Neither am I aiming for a 'party game' of reducing to a single operator (see my recent post). But if there's more than one operator, we need to define how they interact -- and in an equationally complete way. And if you get to a sufficiently robust set of definitions, you find you could anyway do it with only one operator -- that is, providing your domain has a single 'sort'. In contrast,

Ahas at least two 'sort's: relations and attribute-comma-lists. Now you need operators to get from a relation to the a-c-l for its heading and back.

Quote from dandl on May 13, 2021, 8:41 am## Only one primitive operator

`InnerUnion`

has some very nice properties for equational reasoning: it is commutative, idempotent, associative -- indeed as is`NatJoin`

. Unfortunately,`InnerUnion`

is not distributive over`NatJoin`

, norvice versa.- But (ANNOUNCE) if we could define
`InnerUnion`

in terms of`NatJoin`

(embedding the definition in the machinery of FOPL with equality)thenwe can reason around equations using both operators.- Furthermore (further ANNOUNCE) if we could define
`PROJECT/REMOVE`

,`NotMatching`

in terms of`NatJoin`

(embedding etc) then we can reason all the way round equations.At this juncture, I'd better point out that this is not intended as an 'executable' algebra any more than

Ais. And it would be horribly inefficient to go about`emptify`

ing relations just to get their heading -- or even to evaluate the relational expression argument to`emptify( )`

! It's an algebra purely for reasoning equationally about the equivalence of queries.No problem down to here, but I see trouble ahead. I was expecting the one operator to be a form of negation (like NAND).

We don't need our initial set of operators to include some form of negation. Consider a definition for dyadic operator arithmetic minus (

`-`

), or more relevantly set difference (`\`

):

`x - y =df z s.t. y + z = x`

.`x \ y =df z s.t. (y ∩ x) ∪ z = x & y ∩ z = ∅`

.- Note the definition for set difference needs some tricky conditions because of the possibility
`y`

contains elements not in`x`

; and we must forbid`z`

from containing elements already in`y`

.- Perhaps you're puzzled by the format I've presented definitions in my o.p.? rather than that
`z s.t. ...`

form, I could replace`z`

by its definiendum (substitution of equals) so go

`y + (x - y) = x`

`(y ∩ x) ∪ (x \ y) = x & y ∩ (x \ y) = ∅`

And then I could split that last conjunction into two separate axioms.This fixes

`-`

and`\`

uniquely for all`x, y`

. There's no need to go via monadic minus; indeed we could define`negate x =df 0 - x ≡ z s.t. x + z = 0`

. Because`0`

is identity for`+`

-- just as`DEE`

is identity for`NatJoin`

.Appendix A claims

`NAND`

could replace only`<AND>, <OR>, <NOT>`

; it doesn't go near project/`REMOVE, COMPOSE`

. So that's a non-starter. (Also it would introduce domain-dependence, which gives me a queasy feeling.)It says:

"In fact, we will show in the next section that we do not really

need RENAME either; thus, we could in fact reduce our algebra

still further to just the two operators REMOVE and either

NOR or NAND (plus TCLOSE)."I don't know how to break this to you, but parts of Appendix A are

wrong-- or at least their claims are greatly overstated. This doesn't take away from Appendix A being a great teaching resource to lead learners away from SQL's baroque set of operators and operations to the chill mountain stream of a minimalist algebra.wrt

`REMOVE`

, Appendix A is hand-wavey about how to get from a projection inTutorial Dto an equivalent`ALL BUT`

to`REMOVE`

. It is also hand-wavey about how to get from`COMPOSE`

to the needed`REMOVE`

: how does the algebra calculate the attributes in common between two operands being`COMPOSE`

d? (IOW what is a "macro" operator in this sense? That goes back to your extended discussion about macros on another thread, and at least needs some sort of program reflection to runafterthe compiler has inferred the heading of the expressions that are operands to`COMPOSE`

.)Project/Remove is simply InnerUnion with an empty relation representing the new heading.

Project is that.

`REMOVE`

is not. I did provide an equivalent definition of project as a`REMOVE`

of a`REMOVE`

, but that's not sufficient to 'fix' either without some extra machinery to talk about headings-as-sets. Also, as it turns out, I haven't yet fixed how to get an empty relation. (`(r NAND r)`

would deliver that, but then how do I integrate`NAND`

into the lattice operations? Or perhaps there's another approach which isn't at all lattice-based? You're welcome to try. Making the overall set of operations 'solvable' enough to express side-conditions and infer equivalence of queries raises a very high bar. Appendix A doesn't reach it.)...

My attempts at such definitions have foundered in the past because they relied on

`DUM`

being defined; and I couldn't obtain a definition for`DUM`

that fixed it uniquely in the structure of relations. (Reminder: we can use in the definitions only relation values unanalysed, and operators on them. There's no ability to manipulate 'headings' or 'bodies' (as sets), because that would introduce additional 'sorts' into the algebra.)The definition for

`DEE`

hadn't suffered that difficulty, because we can fix`DEE`

as the identity element for`NatJoin`

. And it turns out (claim not yet demonstrated) we can define the other operators without mentioning`DUM`

; then define`DUM`

from them:

`DEE =df s s.t. ∀ r. s NatJoin r = r`

.`DUM =df DEE NotMatching DEE`

.`DUM =df ∀ r. (r NotMatching r) REMOVE r`

equivalently`∀ r. (r REMOVE r) NotMatching r`

.- (Yes, that's three equivalent definitions for
`DUM`

-- proven equivalent by equational reasoning.)Seems to me it shouldn't be too hard to Dee and Dum, but only if you can negate.

`DEE`

is easy/doesn't need negate.`DUM = DEE NotMatching DEE`

turns out to have problems/is not unique enough, as I discovered. Lattice algebras tend to presume the structure is distributive, complemented unless you give a strong instruction otherwise -- which is my 'ugly' axiom.I have no idea what this means, or why I should care. My aim was to point out that I don't see any solution with a single primitive unless it's some kind of negation, like NAND. Whether or not you get a 'nice' DUM or not this way, I think you won't get there any other way.

`...`

No, I really don't follow. I don't know whether it's the lattices or the toolset but you lost me here. As I see it REMOVE is just AntiJoin. Easy if you start with negation.

AntiJoin could be defined as the negation of

`MATCHING`

; but`r1 MATCHING r2 = r1 NatJoin r2 {attributes-from-r1}`

, so you still need projectionAntiJoin is cognate with Not Matching: same heading, result is the complement of SemiJoin (aka Matching). I prefer to avoid the distinctive/idiosyncratic TD names for operators, they come with too much baggage.

I prefer to avoid the SQL names, because they come with too much baggage -- especially around

`Null`

and duplicate rows. If forced, I prefer 'AntiSemiJoin' to 'AntiJoin' (because the 'Anti' in 'AntiJoin' could be anti- some of the other behaviour of Join). TheTTM/Tutorial Dnames I find just perfect. I'm using`NatJoin`

rather than bare`JOIN`

, only because we're in latticeland, and`NatJoin`

corresponds to lattice`meet`

.But I have to say I'm not optimistic about your quest. In highly informal terms, you can have an operator that removes attributes and an operator that removes rows and apply them twice to get the opposite, but a single operator to do all that seems far-fetched. More power if you can get there.

Off-topicre "an operator that removes attributes and an operator that removes rows and apply them twice to get the opposite", one of the early (1994 IIRC, although I discovered it much later) attempts at update-through-view by Date and McGoveran noted that equivalence, and tried to take advantage of it by 'reverse engineering' a view definition back to its based-on relvars. The attempt failed abjectly, and ended up making stipulations in which that equivalence didn't hold. I.e. update-through-`INTERSECTION`

behaved differently vs update-through-`MINUS`

-of-`MINUS`

.I concede it's difficult to get the stipulations to work out. But it isn't difficult to check whether what you've stipulated actually does works out, when you've started by insisting it must work out.

I nearly abandoned

TTMat that point, on grounds it seemed to be run by idiots.End of off-topic.

Quote from dandl on May 12, 2021, 1:41 pm...

No, I really don't follow. I don't know whether it's the lattices or the toolset but you lost me here. As I see it REMOVE is just AntiJoin.

No AntiJoin (by which I take it you mean AntiSemiJoin) returns a relation with heading same as its left operand. It doesn't remove (nor add) any attributes.

Easy if you start with negation.

I'm not aiming for 'easy'; I'm aiming for expressively complete. Neither am I aiming for a 'party game' of reducing to a single operator (see my recent post). But if there's more than one operator, we need to define how they interact -- and in an equationally complete way. And if you get to a sufficiently robust set of definitions, you find you could anyway do it with only one operator -- that is, providing your domain has a single 'sort'. In contrast, **A** has at least two 'sort's: relations and attribute-comma-lists. Now you need operators to get from a relation to the a-c-l for its heading and back.

Quote from dandl on May 13, 2021, 8:41 am## Only one primitive operator

`InnerUnion`

has some very nice properties for equational reasoning: it is commutative, idempotent, associative -- indeed as is`NatJoin`

. Unfortunately,`InnerUnion`

is not distributive over`NatJoin`

, norvice versa.- But (ANNOUNCE) if we could define
`InnerUnion`

in terms of`NatJoin`

(embedding the definition in the machinery of FOPL with equality)thenwe can reason around equations using both operators.- Furthermore (further ANNOUNCE) if we could define
`PROJECT/REMOVE`

,`NotMatching`

in terms of`NatJoin`

(embedding etc) then we can reason all the way round equations.At this juncture, I'd better point out that this is not intended as an 'executable' algebra any more than

Ais. And it would be horribly inefficient to go about`emptify`

ing relations just to get their heading -- or even to evaluate the relational expression argument to`emptify( )`

! It's an algebra purely for reasoning equationally about the equivalence of queries.No problem down to here, but I see trouble ahead. I was expecting the one operator to be a form of negation (like NAND).

We don't need our initial set of operators to include some form of negation. Consider a definition for dyadic operator arithmetic minus (`-`

), or more relevantly set difference (`\`

):

`x - y =df z s.t. y + z = x`

.`x \ y =df z s.t. (y ∩ x) ∪ z = x & y ∩ z = ∅`

.- Note the definition for set difference needs some tricky conditions because of the possibility
`y`

contains elements not in`x`

; and we must forbid`z`

from containing elements already in`y`

. - Perhaps you're puzzled by the format I've presented definitions in my o.p.? rather than that
`z s.t. ...`

form, I could replace`z`

by its definiendum (substitution of equals) so go

`y + (x - y) = x`

`(y ∩ x) ∪ (x \ y) = x & y ∩ (x \ y) = ∅`

And then I could split that last conjunction into two separate axioms.

This fixes `-`

and `\`

uniquely for all `x, y`

. There's no need to go via monadic minus; indeed we could define `negate x =df 0 - x ≡ z s.t. x + z = 0`

. Because `0`

is identity for `+`

-- just as `DEE`

is identity for `NatJoin`

.

Appendix A claims

`NAND`

could replace only`<AND>, <OR>, <NOT>`

; it doesn't go near project/`REMOVE, COMPOSE`

. So that's a non-starter. (Also it would introduce domain-dependence, which gives me a queasy feeling.)It says:

"In fact, we will show in the next section that we do not really

need RENAME either; thus, we could in fact reduce our algebra

still further to just the two operators REMOVE and either

NOR or NAND (plus TCLOSE)."

I don't know how to break this to you, but parts of Appendix A are **wrong** -- or at least their claims are greatly overstated. This doesn't take away from Appendix A being a great teaching resource to lead learners away from SQL's baroque set of operators and operations to the chill mountain stream of a minimalist algebra.

wrt `REMOVE`

, Appendix A is hand-wavey about how to get from a projection in **Tutorial D** to an equivalent `ALL BUT`

to `REMOVE`

. It is also hand-wavey about how to get from `COMPOSE`

to the needed `REMOVE`

: how does the algebra calculate the attributes in common between two operands being `COMPOSE`

d? (IOW what is a "macro" operator in this sense? That goes back to your extended discussion about macros on another thread, and at least needs some sort of program reflection to run *after* the compiler has inferred the heading of the expressions that are operands to `COMPOSE`

.)

Project/Remove is simply InnerUnion with an empty relation representing the new heading.

Project is that. `REMOVE`

is not. I did provide an equivalent definition of project as a `REMOVE`

of a `REMOVE`

, but that's not sufficient to 'fix' either without some extra machinery to talk about headings-as-sets. Also, as it turns out, I haven't yet fixed how to get an empty relation. ( `(r NAND r)`

would deliver that, but then how do I integrate `NAND`

into the lattice operations? Or perhaps there's another approach which isn't at all lattice-based? You're welcome to try. Making the overall set of operations 'solvable' enough to express side-conditions and infer equivalence of queries raises a very high bar. Appendix A doesn't reach it.)

...

My attempts at such definitions have foundered in the past because they relied on

`DUM`

being defined; and I couldn't obtain a definition for`DUM`

that fixed it uniquely in the structure of relations. (Reminder: we can use in the definitions only relation values unanalysed, and operators on them. There's no ability to manipulate 'headings' or 'bodies' (as sets), because that would introduce additional 'sorts' into the algebra.)The definition for

`DEE`

hadn't suffered that difficulty, because we can fix`DEE`

as the identity element for`NatJoin`

. And it turns out (claim not yet demonstrated) we can define the other operators without mentioning`DUM`

; then define`DUM`

from them:

`DEE =df s s.t. ∀ r. s NatJoin r = r`

.`DUM =df DEE NotMatching DEE`

.`DUM =df ∀ r. (r NotMatching r) REMOVE r`

equivalently`∀ r. (r REMOVE r) NotMatching r`

.- (Yes, that's three equivalent definitions for
`DUM`

-- proven equivalent by equational reasoning.)Seems to me it shouldn't be too hard to Dee and Dum, but only if you can negate.

`DEE`

is easy/doesn't need negate.`DUM = DEE NotMatching DEE`

turns out to have problems/is not unique enough, as I discovered. Lattice algebras tend to presume the structure is distributive, complemented unless you give a strong instruction otherwise -- which is my 'ugly' axiom.I have no idea what this means, or why I should care. My aim was to point out that I don't see any solution with a single primitive unless it's some kind of negation, like NAND. Whether or not you get a 'nice' DUM or not this way, I think you won't get there any other way.

`...`

No, I really don't follow. I don't know whether it's the lattices or the toolset but you lost me here. As I see it REMOVE is just AntiJoin. Easy if you start with negation.

AntiJoin could be defined as the negation of

`MATCHING`

; but`r1 MATCHING r2 = r1 NatJoin r2 {attributes-from-r1}`

, so you still need projectionAntiJoin is cognate with Not Matching: same heading, result is the complement of SemiJoin (aka Matching). I prefer to avoid the distinctive/idiosyncratic TD names for operators, they come with too much baggage.

I prefer to avoid the SQL names, because they come with too much baggage -- especially around `Null`

and duplicate rows. If forced, I prefer 'AntiSemiJoin' to 'AntiJoin' (because the 'Anti' in 'AntiJoin' could be anti- some of the other behaviour of Join). The *TTM*/**Tutorial D** names I find just perfect. I'm using `NatJoin`

rather than bare `JOIN`

, only because we're in latticeland, and `NatJoin`

corresponds to lattice `meet`

.

But I have to say I'm not optimistic about your quest. In highly informal terms, you can have an operator that removes attributes and an operator that removes rows and apply them twice to get the opposite, but a single operator to do all that seems far-fetched. More power if you can get there.

*Off-topic* re "an operator that removes attributes and an operator that removes rows and apply them twice to get the opposite", one of the early (1994 IIRC, although I discovered it much later) attempts at update-through-view by Date and McGoveran noted that equivalence, and tried to take advantage of it by 'reverse engineering' a view definition back to its based-on relvars. The attempt failed abjectly, and ended up making stipulations in which that equivalence didn't hold. I.e. update-through-`INTERSECTION`

behaved differently vs update-through-`MINUS`

-of-`MINUS`

.

I concede it's difficult to get the stipulations to work out. But it isn't difficult to check whether what you've stipulated actually does works out, when you've started by insisting it must work out.

I nearly abandoned *TTM* at that point, on grounds it seemed to be run by idiots. *End of off-topic*.

Quote from dandl on May 14, 2021, 4:14 amQuote from Vadim on May 13, 2021, 4:27 pmQuote from AntC on May 13, 2021, 11:53 amQuote from Vadim on May 12, 2021, 5:28 pmConsider a database with all relations having the same set of attributes. Then, it is a Boolean Algebra.

Since all the relation values with the same heading form a Boolean Algebra (i.e. a distributive, complemented sub-lattice within the bigger structure), we can take any element

`r1`

within it: the sub-sub-lattice from that element (as sub-sub-lattice top) 'down' to the empty relation with that heading is also a Boolean Algebra. That empty relation value is given by`r1 NotMatching r1`

.Sub-Sub-Lattice complement of

`r2`

coincides with`r1 NotMatching r2`

. IOW if`r2 NatJoin s = (r1 NotMatching r1) & r2 InnerUnion s = r1`

then`s = r1 NotMatching r2`

. I've added that as an axiom. Also added the distributivity within the sub-sub-lattice from`r1`

. That still doesn't fix enough, without the 'ugly' axiom.The simplest possible counter example is the database consisting of only two relations: DEE and DUM. Yes, this example is far fetched from database practice, but counterexamples can be trivial/silly.

The other [rather artificial] example is the database with all the empty relations.

Therefore, we are in the realm of Boolean Algebra. There, reduction to a single operation is well known (NAND, NOR), which is not consistent with your presentation.

I think I would be right in saying that a single operator is enough, but it still needs an argument domain, the set of boolean values.

In the same way there might be one, two or more operators in a minimal RA, but they are not responsible for the creation of relational values such as Dee and Dum, which are already part of the domain.

In general, a reduction of the operation set in Universal Algebra to some basic set is understood as representing each derived operation via

composition of basic operatons. In other words, you are not allowed to use the First Order Logic sentenses and specify a constraint on the derived operation this way; you have to exhibit an equality with the derived operation alone on the left hand side.

Quote from Vadim on May 13, 2021, 4:27 pmQuote from AntC on May 13, 2021, 11:53 amQuote from Vadim on May 12, 2021, 5:28 pm`r1`

within it: the sub-sub-lattice from that element (as sub-sub-lattice top) 'down' to the empty relation with that heading is also a Boolean Algebra. That empty relation value is given by`r1 NotMatching r1`

.`r2`

coincides with`r1 NotMatching r2`

. IOW if`r2 NatJoin s = (r1 NotMatching r1) & r2 InnerUnion s = r1`

then`s = r1 NotMatching r2`

. I've added that as an axiom. Also added the distributivity within the sub-sub-lattice from`r1`

. That still doesn't fix enough, without the 'ugly' axiom.The other [rather artificial] example is the database with all the empty relations.

I think I would be right in saying that a single operator is enough, but it still needs an argument domain, the set of boolean values.

In the same way there might be one, two or more operators in a minimal RA, but they are not responsible for the creation of relational values such as Dee and Dum, which are already part of the domain.

In general, a reduction of the operation set in Universal Algebra to some basic set is understood as representing each derived operation via

composition of basic operatons. In other words, you are not allowed to use the First Order Logic sentenses and specify a constraint on the derived operation this way; you have to exhibit an equality with the derived operation alone on the left hand side.

Quote from dandl on May 14, 2021, 7:13 amI don't know how to break this to you, but parts of Appendix A are

wrong-- or at least their claims are greatly overstated. This doesn't take away from Appendix A being a great teaching resource to lead learners away from SQL's baroque set of operators and operations to the chill mountain stream of a minimalist algebra.OK. The way I read App-A is:

- chatty discussion with various asides around the idea of a 'core' RA
- formal definitions for 5 operators: NOT, REMOVE, RENAME, AND, OR (but not TCLOSE)
- a foray into the idea of relcons as s source of new values, but no formal definitions.
I find the formal definitions to be correct, the rest to be thoughts experiments/work in progress. I have completed the work to my satisfaction with formal definitions based on relcons, plus

whileto replace TCLOSE. I have not found any other formal treatment of the entire extended RA, so this is what I work from.wrt

`REMOVE`

, Appendix A is hand-wavey about how to get from a projection inTutorial Dto an equivalent`ALL BUT`

to`REMOVE`

. It is also hand-wavey about how to get from`COMPOSE`

to the needed`REMOVE`

: how does the algebra calculate the attributes in common between two operands being`COMPOSE`

d? (IOW what is a "macro" operator in this sense? That goes back to your extended discussion about macros on another thread, and at least needs some sort of program reflection to runafterthe compiler has inferred the heading of the expressions that are operands to`COMPOSE`

.)It's mostly hand-wavey, but the formal definitions stand on their own.

As to the remainder of your work, you are following a path I don't understand to a destination I cannot see. Best if I just lurk a while.

wrong-- or at least their claims are greatly overstated. This doesn't take away from Appendix A being a great teaching resource to lead learners away from SQL's baroque set of operators and operations to the chill mountain stream of a minimalist algebra.

OK. The way I read App-A is:

- chatty discussion with various asides around the idea of a 'core' RA
- formal definitions for 5 operators: NOT, REMOVE, RENAME, AND, OR (but not TCLOSE)
- a foray into the idea of relcons as s source of new values, but no formal definitions.

I find the formal definitions to be correct, the rest to be thoughts experiments/work in progress. I have completed the work to my satisfaction with formal definitions based on relcons, plus *while* to replace TCLOSE. I have not found any other formal treatment of the entire extended RA, so this is what I work from.

`REMOVE`

, Appendix A is hand-wavey about how to get from a projection inTutorial Dto an equivalent`ALL BUT`

to`REMOVE`

. It is also hand-wavey about how to get from`COMPOSE`

to the needed`REMOVE`

: how does the algebra calculate the attributes in common between two operands being`COMPOSE`

d? (IOW what is a "macro" operator in this sense? That goes back to your extended discussion about macros on another thread, and at least needs some sort of program reflection to runafterthe compiler has inferred the heading of the expressions that are operands to`COMPOSE`

.)

It's mostly hand-wavey, but the formal definitions stand on their own.

As to the remainder of your work, you are following a path I don't understand to a destination I cannot see. Best if I just lurk a while.

Quote from Erwin on May 14, 2021, 4:42 pmQuote from AntC on May 14, 2021, 3:42 am`REMOVE`

, Appendix A is hand-wavey about how to get from a projection inTutorial Dto an equivalent`ALL BUT`

to`REMOVE`

. It is also hand-wavey about how to get from`COMPOSE`

to the needed`REMOVE`

: how does the algebra calculate the attributes in common between two operands being`COMPOSE`

d? (IOW what is a "macro" operator in this sense? That goes back to your extended discussion about macros on another thread, and at least needs some sort of program reflection to runafterthe compiler has inferred the heading of the expressions that are operands to`COMPOSE`

.)The compiler knows about the headings of the operands, and perhaps holds that information in the form of a relation with heading {attributename:attributenamedomain, typename:typenamedomain} and applies, for example, semijoin or semiminus, on the two relation values at hand, to reliably and deterministically compute the results you say they are "hand-wavey" about.

(IOW a "macro" operator in this sense is some operator on some data types that happens to be used by the compiler for doing the job it is supposed to do. There is no implication that within the set of all operators used by either the compiler or any of the application programs, "being a macro operator" (thus giving rise to assessable predicates like "X is a 'macro' operator and Y is too") is a characteristic that creates equivalence classes within that set. Creates >1 distinct equivalence classes, that is.)

That all seems so blatantly obvious and trivial that I'm wondering what the hell the problem is.

Quote from AntC on May 14, 2021, 3:42 am

`REMOVE`

, Appendix A is hand-wavey about how to get from a projection inTutorial Dto an equivalent`ALL BUT`

to`REMOVE`

. It is also hand-wavey about how to get from`COMPOSE`

to the needed`REMOVE`

: how does the algebra calculate the attributes in common between two operands being`COMPOSE`

d? (IOW what is a "macro" operator in this sense? That goes back to your extended discussion about macros on another thread, and at least needs some sort of program reflection to runafterthe compiler has inferred the heading of the expressions that are operands to`COMPOSE`

.)

The compiler knows about the headings of the operands, and perhaps holds that information in the form of a relation with heading {attributename:attributenamedomain, typename:typenamedomain} and applies, for example, semijoin or semiminus, on the two relation values at hand, to reliably and deterministically compute the results you say they are "hand-wavey" about.

(IOW a "macro" operator in this sense is some operator on some data types that happens to be used by the compiler for doing the job it is supposed to do. There is no implication that within the set of all operators used by either the compiler or any of the application programs, "being a macro operator" (thus giving rise to assessable predicates like "X is a 'macro' operator and Y is too") is a characteristic that creates equivalence classes within that set. Creates >1 distinct equivalence classes, that is.)

That all seems so blatantly obvious and trivial that I'm wondering what the hell the problem is.

Quote from Vadim on May 14, 2021, 4:49 pmQuote from AntC on May 14, 2021, 12:36 amQuote from Vadim on May 13, 2021, 4:27 pmThe other [rather artificial] example is the database with all the empty relations.

No that's not a valid lattice structure to model relations: we must always have

`DEE`

, no matter how many empty relations there are. We can always say the sub-lattice from`DUM`

down to lattice bottom must be a Boolean algebra (as the powerset of all possible attribute-type pairs, formed into headings). That is, if we can define`DUM`

.In the database of all the empty relations DUM=DEE, otherwise you'll break some RL axioms.

Yes, it is counterintuitive, because the TTM stipulates DEE containing one tuple, and there is no way to satify this condition with empty relations. (I vaguely remember correspondence with Ervin who pointed out holes in this set based definition.)

In general, a reduction of the operation set in Universal Algebra to some basic set is understood as representing each derived operation via

composition of basic operatons.Please give a citation for that claim. I think you just made it up. Appendix A (tries to) proceed in that way. But actually fails: for example it does not reduce

`RENAME`

to other basic operations -- the other operations still need`RENAME`

or some further mechanism (not specified), as Philip Kelly pointed out many moons ago -- he proposed a mechanism 'equi-relations'.Page 11 here: https://books.google.com/books?id=siNlAn8Bm30C&pg=PA1&source=gbs_toc_r&cad=3#v=onepage&q&f=false

In other words, you are not allowed to use the First Order Logic sentenses and specify a constraint on the derived operation this way; you have to exhibit an equality with the derived operation alone on the left hand side.

No. You are just making it up. I am not trying to produce an 'executable' algebra. I'm aiming for a 'solvable' algebra -- in the sense a theorem prover (or human) can infer if two queries are equivalent (will produce the same result over all possible input values, providing initial assumptions hold -- the algebra must be powerful enough to express those assumptions).

I'm not obsessing about a 'party game' of minimising the count of primitives. If we have many primitives, we still have to provide equations to define how they interact, so we can manipulate query equations to prove equivalences. This is just schoolroom solving of 'simultaneous equations'. In conventional lattice treatments, the interaction of

`meet, join`

is defined via the absorption laws. I merely equivalently defined`InnerUnion`

(lattice`join`

) in terms of`NatJoin`

(lattice`meet`

), to demonstrate the technique before using it to 'fix'`NotMatching`

-- or so I hoped. From my definition of`InnerUnion`

you can mechanically derive the absorption laws, and all the properties of`InnerUnion`

.For example I can give a quadratic equation as

`ax`

^{2}`+ bx + c = 0`

, no "derived operation alone on the left hand side" there (or on either side), and from that derive the solution for`x`

via equational reasoning.Then, it is terminology confusion, and you need to clarify what do you mean by "primitive operator". Analogies will get you only so far.

Quote from AntC on May 14, 2021, 12:36 amQuote from Vadim on May 13, 2021, 4:27 pmThe other [rather artificial] example is the database with all the empty relations.

`DEE`

, no matter how many empty relations there are. We can always say the sub-lattice from`DUM`

down to lattice bottom must be a Boolean algebra (as the powerset of all possible attribute-type pairs, formed into headings). That is, if we can define`DUM`

.

In the database of all the empty relations DUM=DEE, otherwise you'll break some RL axioms.

Yes, it is counterintuitive, because the TTM stipulates DEE containing one tuple, and there is no way to satify this condition with empty relations. (I vaguely remember correspondence with Ervin who pointed out holes in this set based definition.)

composition of basic operatons.`RENAME`

to other basic operations -- the other operations still need`RENAME`

or some further mechanism (not specified), as Philip Kelly pointed out many moons ago -- he proposed a mechanism 'equi-relations'.

Page 11 here: https://books.google.com/books?id=siNlAn8Bm30C&pg=PA1&source=gbs_toc_r&cad=3#v=onepage&q&f=false

`meet, join`

is defined via the absorption laws. I merely equivalently defined`InnerUnion`

(lattice`join`

) in terms of`NatJoin`

(lattice`meet`

), to demonstrate the technique before using it to 'fix'`NotMatching`

-- or so I hoped. From my definition of`InnerUnion`

you can mechanically derive the absorption laws, and all the properties of`InnerUnion`

.`ax`

^{2}`+ bx + c = 0`

, no "derived operation alone on the left hand side" there (or on either side), and from that derive the solution for`x`

via equational reasoning.

Then, it is terminology confusion, and you need to clarify what do you mean by "primitive operator". Analogies will get you only so far.

Quote from Erwin on May 14, 2021, 5:36 pmQuote from Vadim on May 14, 2021, 4:49 pmIn the database of all the empty relations DUM=DEE, otherwise you'll break some RL axioms.

Yes, it is counterintuitive, because the TTM stipulates DEE containing one tuple, and there is no way to satify this condition with empty relations. (I vaguely remember correspondence with Ervin who pointed out holes in this set based definition.)

I do remember having had some correspondence, but I no longer remember any of the detail. That said,

DUM === {} , DEE === {{}}

DUM is the empty set of empty sets, DEE is the set that has the empty set [empty set of tuples, that is] for its member.

Requiring the empty set to be equal to a nonempty set is the most perfect example of mathematical Harakiri I can imagine.

Quote from Vadim on May 14, 2021, 4:49 pmIn the database of all the empty relations DUM=DEE, otherwise you'll break some RL axioms.

I do remember having had some correspondence, but I no longer remember any of the detail. That said,

DUM === {} , DEE === {{}}

DUM is the empty set of empty sets, DEE is the set that has the empty set [empty set of tuples, that is] for its member.

Requiring the empty set to be equal to a nonempty set is the most perfect example of mathematical Harakiri I can imagine.

Quote from Vadim on May 14, 2021, 6:38 pmQuote from Erwin on May 14, 2021, 5:36 pmDUM === {} , DEE === {{}}

DUM is the empty set of empty sets, DEE is the set that has the empty set [empty set of tuples, that is] for its member.

Requiring the empty set to be equal to a nonempty set is the most perfect example of mathematical Harakiri I can imagine.

A relation is a pair, not a set. Therefore, you must have meant

[latex] DUM \leftrightarrow \langle \{\}, \{\}\rangle [/latex]

[latex] DEE \leftrightarrow \langle \{\}, \{ \langle\rangle \}\rangle [/latex]

Then, even if you use Kuratovski pair for binary tuple, you still have to define what is a zilliary tuple. (BTW, Kuratovski pair has virtually no utility in mathematics). Also, in contemporary mathematics you never compare the two sets

verbatim. Most often they are considered to be the "same" up to some equivalence. This idea is manifested in Anthony's presentation, where DEE and DUM are considered as couple of objects "fixed" inside an algebraic structure. Anthony and myself, however, disagree how successful those axiomatic definitions are in "fixing" the DUM, and even whether "fixing" an object is really a big deal.Regarding your critique of Anthony's motivation, you shall not dismiss curiosity factor. I'd suggest that the database theory is still far from being cast in stone. Witness novel query languages appearing:

LARA (AKA LaraDB) which has incorporated some ideas from Relational Lattices.

Graphical Conjunctive Queries. This paper is remarkable because the diagrammatic expression language is well established in quantum computation arena. (See the Coecke & Kissinger textbook for elementary, category-free exposition)

Quote from Erwin on May 14, 2021, 5:36 pmDUM === {} , DEE === {{}}

A relation is a pair, not a set. Therefore, you must have meant

DUM \leftrightarrow \langle \{\}, \{\}\rangle

DEE \leftrightarrow \langle \{\}, \{ \langle\rangle \}\rangle

Then, even if you use Kuratovski pair for binary tuple, you still have to define what is a zilliary tuple. (BTW, Kuratovski pair has virtually no utility in mathematics). Also, in contemporary mathematics you never compare the two sets *verbatim*. Most often they are considered to be the "same" up to some equivalence. This idea is manifested in Anthony's presentation, where DEE and DUM are considered as couple of objects "fixed" inside an algebraic structure. Anthony and myself, however, disagree how successful those axiomatic definitions are in "fixing" the DUM, and even whether "fixing" an object is really a big deal.

Regarding your critique of Anthony's motivation, you shall not dismiss curiosity factor. I'd suggest that the database theory is still far from being cast in stone. Witness novel query languages appearing:

LARA (AKA LaraDB) which has incorporated some ideas from Relational Lattices.

Graphical Conjunctive Queries. This paper is remarkable because the diagrammatic expression language is well established in quantum computation arena. (See the Coecke & Kissinger textbook for elementary, category-free exposition)

Quote from AntC on May 14, 2021, 10:30 pmQuote from Erwin on May 14, 2021, 4:42 pmQuote from AntC on May 14, 2021, 3:42 am`REMOVE`

, Appendix A is hand-wavey about how to get from a projection inTutorial Dto an equivalent`ALL BUT`

to`REMOVE`

. It is also hand-wavey about how to get from`COMPOSE`

to the needed`REMOVE`

: how does the algebra calculate the attributes in common between two operands being`COMPOSE`

d? (IOW what is a "macro" operator in this sense? That goes back to your extended discussion about macros on another thread, and at least needs some sort of program reflection to runafterthe compiler has inferred the heading of the expressions that are operands to`COMPOSE`

.)The compiler knows about the headings of the operands, and perhaps holds that information in the form of a relation with heading {attributename:attributenamedomain, typename:typenamedomain} and applies, for example, semijoin or semiminus, on the two relation values at hand, to reliably and deterministically compute the results you say they are "hand-wavey" about.

(IOW a "macro" operator in this sense is some operator on some data types that happens to be used by the compiler for doing the job it is supposed to do. There is no implication that within the set of all operators used by either the compiler or any of the application programs, "being a macro operator" (thus giving rise to assessable predicates like "X is a 'macro' operator and Y is too") is a characteristic that creates equivalence classes within that set. Creates >1 distinct equivalence classes, that is.)

That all seems so blatantly obvious and trivial that I'm wondering what the hell the problem is.

The problem is you're speaking as an implementor, not as an algebraist. The compiler's process you describe (which is indeed obvious) uses mechanisms for which there's no support in the algebra. Note the operands to

`COMPOSE`

might be arbitrary relational expressions, so an algebra can't 'read off' the attribute names from a relvar declaration.The algebra does not include an operation like

`attribute-names-from( )`

; neither does it specify the result of a`REMOVE`

where the attribute-comma-list includes names not in the heading of`REMOVE`

's relation operand. Neither is there an algebraic operation to get the intersection of two`attribute-names-from( )`

, as needed for`COMPOSE`

's`REMOVE`

.

Quote from Erwin on May 14, 2021, 4:42 pmQuote from AntC on May 14, 2021, 3:42 am`REMOVE`

, Appendix A is hand-wavey about how to get from a projection inTutorial Dto an equivalent`ALL BUT`

to`REMOVE`

. It is also hand-wavey about how to get from`COMPOSE`

to the needed`REMOVE`

: how does the algebra calculate the attributes in common between two operands being`COMPOSE`

d? (IOW what is a "macro" operator in this sense? That goes back to your extended discussion about macros on another thread, and at least needs some sort of program reflection to runafterthe compiler has inferred the heading of the expressions that are operands to`COMPOSE`

.)That all seems so blatantly obvious and trivial that I'm wondering what the hell the problem is.

The problem is you're speaking as an implementor, not as an algebraist. The compiler's process you describe (which is indeed obvious) uses mechanisms for which there's no support in the algebra. Note the operands to `COMPOSE`

might be arbitrary relational expressions, so an algebra can't 'read off' the attribute names from a relvar declaration.

The algebra does not include an operation like `attribute-names-from( )`

; neither does it specify the result of a `REMOVE`

where the attribute-comma-list includes names not in the heading of `REMOVE`

's relation operand. Neither is there an algebraic operation to get the intersection of two `attribute-names-from( )`

, as needed for `COMPOSE`

's `REMOVE`

.