Questions about New Relational Algebra A

#21 · February 27, 2020, 12:18 am

Quote from Erwin on February 26, 2020, 8:40 pm

Greg,

" that there be a corresponding tuple ts in B "

I think this exposes your thinking error in trying to understand <OR>. It's not about "finding" "corresponding" tuples in the other argument. That's what JOIN (aka <AND>) is about.

The condition for <OR> says "imagine any imagineable tuple (of the heading of the result). If its "A part" is in Ra (regardless of what the "B part" is) ***OR*** if its "B part" is in Rb (regardless of what the "A part" is) then that "any imagineable tuple" should be in the result".

The two mentions of "regardless" make it a ***wrong*** strategy to go try and find "corresponding" tuples. That kind of search, because of the "regardless", should be done in those "universal" relations, and those relations being the universal relations, finding those tuples there will just be trivially true (so upfront you already know there is just no point in "go try and find" - you know you will) !

At least you already got close where you said "The creation of the sets of Urx could be prohibitively expensive.". That was the point because the only known strategies that effectively produce the result as defined by the definition, all include "the creation of the sets of Urx".

Thanks, Erwin

But, the error I was making was different than you thought. Because of my obtuseness I made erroneous statements to which you responded. My apology.

#22 · February 27, 2020, 7:55 am

Quote from dandl on February 26, 2020, 11:40 pm

Just for total clarity and to put things into a concrete setting, would someone care to enumerate the tuples that comprise
S {S#,SNAME} ◄AND► P {P#,PNAME}

S {S#,SNAME} ◄OR► P {P#,PNAME}
Or at least a representative sample thereof?

The first would seem to be simply the cross-product (5 * 6 = 30 tuples). But the second? Is is 5 * 6 * 5 + 6 * 5 * 5 = 300 or 5 * 5 * 6 * 5 = 750 tuples, or something else again?

It has been mentioned twice here that A <OR> B === A JOIN UB UNION B JOIN UA.

In there, UB is the universal relation of the type of B, that is, the one that has all possible tuples of the heading of B, that is, the relation whose body is exactly the same set of values as the set of values that is the tuple type with the same heading as B. Ditto for A.

Applied to your suppliers-and-parts query :

A === S {S#,SNAME}

UA === REL{S# S#, SNAME CHAR}{ an insane number GAZA of tuples here}

B === P {P#,PNAME}

UB === REL{P# P#, PNAME CHAR}{ an insane number GAZB of tuples here}

GAZA === cardinality-of-S# * cardinality-of-CHAR

GAZB === cardinality-of-P# * cardinality-of-CHAR

So the tuple count in your <OR> is cardinality-of-S * GAZB + cardinality-of-P * GAZA.

So no, you stand little chance that somebody is going to want to enumerate that.

Author of SIRA_PRISE

#23 · February 27, 2020, 11:43 am

Quote from Hugh on February 27, 2020, 11:43 am

Quote from Greg Klisch on February 26, 2020, 6:53 pm

RE DUM and DEE

I did mistakenly call them relvars; but only because DTATRM defined them in terms of PROJECTions over relvars. But later in DTATRM they are referred to as relcons.

I obtained the idea of relcons TABLE_DEE and TABLE_DUM as having Boolean values from this quote from RM PRESCRIPTION 10: RELATIONS {emphasis is mine}:

As we saw under RM Prescription 9, tuples of degree zero are valid; thus, relations of degree zero are necessarily valid as well. In fact, there are exactly two such: one whose body contains just one tuple—the 0-tuple, of course—and one whose body contains no tuples at all. Following reference [29], we call these two relations, informally, TABLE_DEE and TABLE_DUM, respectively (DEE and DUM for short). What makes them so special is the fact that they can be interpreted as TRUE and FALSE, respectively,…

Perhaps the key word here is “interpreted”?

Further, in Appendix E “Darwen’s approach” to view updating, DEE and DUM are used in an example of SHOP IS OPEN/CLOSED and ALARM IS SET (or not) to represent truth values (or so it appears to me). I’m not going to reproduce that entire example here.

So it was in that role that I mentioned them at all when I asked this question: In what sense do relvar values have Boolean values of TRUE and FALSE? DUM and DEE are the only RA constructs that are mentioned in DTATRM in that context.

As always I am thankful for your patience with my ignorance, AntC, dandl, and Hugh, and for willingness to continue to answer my questions. You have reminded me that the answers are in the DTATRM; but sometimes a rephrasing of information helps me understand.

Thanks for those explanations, understood. In view of what you say, it was prehaps a bit unwise to write "can be interpreted as TRUE and FALSE", but the completion of that sentence is "as we now explain". It's a long-winded explanation that might be a bit demanding. What it's saying is that "interpreted" relies on the fact that DEE contains just one tuple, lacked by DUM. DEE's tuple can perhaps be interpreted as TRUE, in which case DUM's lack can be interpreted as the opposite. But we need to interpret TRUE as the truth of some proposition. Because DEE and DUM have no attributes, whatever predicate they are deemed to represent must have no free variables, in which case that predicate degenerates to being a proposition, and propositions do have truth values. So, if the predicate of a degree zero relation r0 is expressed as "The shop is closed" (with no free variables), then whether r0 is DEE or DUM tells us whether the shop is indeed closed. Is that short-winded enough?

As for the use of relvar names in examples, the reader should be aware that expressions using no update operators cannot alter a value assigned to a referenced variable and then name thus stand for the value only.

Hugh

Quote from Greg Klisch on February 26, 2020, 6:53 pm

RE DUM and DEE

I did mistakenly call them relvars; but only because DTATRM defined them in terms of PROJECTions over relvars. But later in DTATRM they are referred to as relcons.

I obtained the idea of relcons TABLE_DEE and TABLE_DUM as having Boolean values from this quote from RM PRESCRIPTION 10: RELATIONS {emphasis is mine}:

As we saw under RM Prescription 9, tuples of degree zero are valid; thus, relations of degree zero are necessarily valid as well. In fact, there are exactly two such: one whose body contains just one tuple—the 0-tuple, of course—and one whose body contains no tuples at all. Following reference [29], we call these two relations, informally, TABLE_DEE and TABLE_DUM, respectively (DEE and DUM for short). What makes them so special is the fact that they can be interpreted as TRUE and FALSE, respectively,…

Perhaps the key word here is “interpreted”?

Further, in Appendix E “Darwen’s approach” to view updating, DEE and DUM are used in an example of SHOP IS OPEN/CLOSED and ALARM IS SET (or not) to represent truth values (or so it appears to me). I’m not going to reproduce that entire example here.

So it was in that role that I mentioned them at all when I asked this question: In what sense do relvar values have Boolean values of TRUE and FALSE? DUM and DEE are the only RA constructs that are mentioned in DTATRM in that context.

As always I am thankful for your patience with my ignorance, AntC, dandl, and Hugh, and for willingness to continue to answer my questions. You have reminded me that the answers are in the DTATRM; but sometimes a rephrasing of information helps me understand.

Thanks for those explanations, understood. In view of what you say, it was prehaps a bit unwise to write "can be interpreted as TRUE and FALSE", but the completion of that sentence is "as we now explain". It's a long-winded explanation that might be a bit demanding. What it's saying is that "interpreted" relies on the fact that DEE contains just one tuple, lacked by DUM. DEE's tuple can perhaps be interpreted as TRUE, in which case DUM's lack can be interpreted as the opposite. But we need to interpret TRUE as the truth of some proposition. Because DEE and DUM have no attributes, whatever predicate they are deemed to represent must have no free variables, in which case that predicate degenerates to being a proposition, and propositions do have truth values. So, if the predicate of a degree zero relation r0 is expressed as "The shop is closed" (with no free variables), then whether r0 is DEE or DUM tells us whether the shop is indeed closed. Is that short-winded enough?

As for the use of relvar names in examples, the reader should be aware that expressions using no update operators cannot alter a value assigned to a referenced variable and then name thus stand for the value only.

Hugh

Coauthor of The Third Manifesto and related books.

#24 · February 27, 2020, 1:00 pm

So no, you stand little chance that somebody is going to want to enumerate that.

Thank you. That was much clearer, but now I have to ask: what' s the point?

I guess a similar consideration would apply to ◄NOT►, relational values made up of vast quantities of 'virtual' tuples to be weeded out by subsequent operations.

I can do the maths, but I've always liked to check out the intermediate results to see where we're headed. Maybe not this time.

Andl - A New Database Language - andl.org

#25 · February 27, 2020, 2:56 pm

Quote from dandl on February 27, 2020, 1:00 pm

So no, you stand little chance that somebody is going to want to enumerate that.

Thank you. That was much clearer, but now I have to ask: what' s the point?

I guess a similar consideration would apply to ◄NOT►, relational values made up of vast quantities of 'virtual' tuples to be weeded out by subsequent operations.

I can do the maths, but I've always liked to check out the intermediate results to see where we're headed. Maybe not this time.

The point is that the result of "R1 <OR> R2" is the ***only*** relation that truly complies with the CWA and is the extension of the predicate "P(R1) OR P(R2)".

(complies with CWA in the sense that for any absent tuple the corresponding proposition must be false, meaning that if some combo of R1 attribute values makes P(R1) true, then the disjunction is by definition also true, meaning any such R1 tuple must appear "accompanied" by every possible combo of R2 attribute values)

You can easily ask the same question about <NOT>, and that one is even more easily answered : my proofs for the constraint enforcing algorithms in SIRA_PRISE rely on <NOT> (relational complement visavis the universal relation) whenever the operator involved is of the MINUS family.

That such stuff is not intended for actual implementation, does not mean it is not useful for any purpose at all.

Author of SIRA_PRISE

#26 · February 27, 2020, 8:07 pm

Quote from p c on February 27, 2020, 8:07 pm

to: Greg Klisch

As well-intentioned and well-written as Appendix A is, it should be de rigueur to approach it with an understanding of its departures from the 1970 paper. Especially noticeable is the difference between conjoin and Codd’s natural join which creates massive unnecessary use-ability differences.

It’s also important to take posts claiming “I can do the math” with a grain of salt when they are accompanied by the kind of happy-go-lucky posts that an end-users might make, where “facts” trump theory, where a property of data is an unknown concept, where implications apparently can’t be premises of a logical argument and where Codd’s 1972 division which requires union-compatible operands is equated with various TTM expressions that don’t. (Even though the needed mathematical thinking is simple, comments such as “That’s English usage” don’t cut it.)

Note the 1970 warming: “...Most users would not be directly concerned with these operations. Information systems designers and people concerned with data bank control should, however, be thoroughly familiar with them. …”

Note especially section 2.1.3 from 1970 where two relations are defined as joinable without loss of information when they have equal projections on common attributes. Codd distinguishes this as natural join, different from all other joins. In other words he represents data properties with attribute values. (Appendix A dependence on common sets of attributes is just a red herring.)

Although Appendix A has tragic practical consequences because it misunderstands natural join, it is at least consistent in its appeal to de Morgan’s laws. But this is just because de Morgan is prematurely applied. When it is applied in agreement with Codd it takes a different form which is logically valid and the Appendix A form becomes invalid.

In the concise form preferred by some logicians and a tool I prefer natural join has this equivalance

( A=S) = ( -(A&S) = (-A & -S ) ) is valid .

This is not an expression that the average coder will recognize (and apparently some dbms developers too). Operator precedence is the customary logical precedence, & is conjunction, - is negation, and = is mutual implication/bi-implication. Because of poor eyesight I can’t risk more verbose expression forms.

The crucial equivalence is the first parenthesized term which gives Codd’s condition for natural join. It acts as a premise for a succeeding implication which has a premise of a negated conjunction. With suitable rearrangement other forms are possible such as conjunctive premises I just happen to like this form for certain visual advantages.

By contrast, the fAppendix A de Morgan example fails to be valid under Codd’s condition:

(A=S) = (-(A & S) = (-A + -S)) is not valid.

Occasionally I have posted examples of how following Codd’s natural join condition can lead to data designs that permit logically complete databases in the Godel sense, logically valid queries and updates and importantly designs that take much data housekeeping out of the hands of users and into the control of designers. If you swear only by the Appendix A catechism such possibilities are likely to be lost.

I have nothing to say about the phrase “semantically equivalent” in Appendix A.

to: Greg Klisch

As well-intentioned and well-written as Appendix A is, it should be de rigueur to approach it with an understanding of its departures from the 1970 paper. Especially noticeable is the difference between conjoin and Codd’s natural join which creates massive unnecessary use-ability differences.

It’s also important to take posts claiming “I can do the math” with a grain of salt when they are accompanied by the kind of happy-go-lucky posts that an end-users might make, where “facts” trump theory, where a property of data is an unknown concept, where implications apparently can’t be premises of a logical argument and where Codd’s 1972 division which requires union-compatible operands is equated with various TTM expressions that don’t. (Even though the needed mathematical thinking is simple, comments such as “That’s English usage” don’t cut it.)

Note the 1970 warming: “...Most users would not be directly concerned with these operations. Information systems designers and people concerned with data bank control should, however, be thoroughly familiar with them. …”

Note especially section 2.1.3 from 1970 where two relations are defined as joinable without loss of information when they have equal projections on common attributes. Codd distinguishes this as natural join, different from all other joins. In other words he represents data properties with attribute values. (Appendix A dependence on common sets of attributes is just a red herring.)

Although Appendix A has tragic practical consequences because it misunderstands natural join, it is at least consistent in its appeal to de Morgan’s laws. But this is just because de Morgan is prematurely applied. When it is applied in agreement with Codd it takes a different form which is logically valid and the Appendix A form becomes invalid.

In the concise form preferred by some logicians and a tool I prefer natural join has this equivalance

( A=S) = ( -(A&S) = (-A & -S ) ) is valid .

This is not an expression that the average coder will recognize (and apparently some dbms developers too). Operator precedence is the customary logical precedence, & is conjunction, - is negation, and = is mutual implication/bi-implication. Because of poor eyesight I can’t risk more verbose expression forms.

The crucial equivalence is the first parenthesized term which gives Codd’s condition for natural join. It acts as a premise for a succeeding implication which has a premise of a negated conjunction. With suitable rearrangement other forms are possible such as conjunctive premises I just happen to like this form for certain visual advantages.

By contrast, the fAppendix A de Morgan example fails to be valid under Codd’s condition:

(A=S) = (-(A & S) = (-A + -S)) is not valid.

Occasionally I have posted examples of how following Codd’s natural join condition can lead to data designs that permit logically complete databases in the Godel sense, logically valid queries and updates and importantly designs that take much data housekeeping out of the hands of users and into the control of designers. If you swear only by the Appendix A catechism such possibilities are likely to be lost.

I have nothing to say about the phrase “semantically equivalent” in Appendix A.

#27 · February 27, 2020, 10:33 pm

To Hugh and Erwin:

Erwin,

you provided me with the following rewritten condition for the <OR> operator:

Bs = { ts : exists tr1 exists tr2 "(tr1 ∈ Br1 and tr2 ∈ Ur2) or (tr2 ∈ Br2 and tr1 ∈ Ur1) and ts = tr1 union tr2 ) }

With this explanation of Urx : “ In there Urx are the "universal" relations of their respective types (the ones that contain all possible tuples of their types).

At that time you asked me: “Does that actually alter the condition or are those conditions provably equivalent ?” Even though you posed that question as a student exercise, I decided that it is not proper for me to answer that question because Hugh has ownership of the original condition. So, I’m asking it of Hugh.

Hugh,

The definition of the <OR> operator as stated in Appendix A of the DTATRM includes this condition:

Bs = { ts : exists tr1 exists tr2 ( ( tr1 ∈ Br1 or tr2 ∈ Br2 ) and ts = tr1 union tr2 ) }

Question: Is Erwin’s version acceptable to you as a replacement for the version in the DTATRM?

Question: Does Erwin’s Ur1 and Ur2 faithfully represent your meaning of “superset” in this quote from DTATRM regarding the <OR> operator?

The body of s contains every tuple that conforms to the heading of s and is a superset of either some tuple in the body of r1 or some tuple in the body of r2.

#28 · February 28, 2020, 12:06 am

Quote from Erwin on February 27, 2020, 2:56 pm

Quote from dandl on February 27, 2020, 1:00 pm

So no, you stand little chance that somebody is going to want to enumerate that.

Thank you. That was much clearer, but now I have to ask: what' s the point?

I guess a similar consideration would apply to ◄NOT►, relational values made up of vast quantities of 'virtual' tuples to be weeded out by subsequent operations.

I can do the maths, but I've always liked to check out the intermediate results to see where we're headed. Maybe not this time.

The point is that the result of "R1 <OR> R2" is the ***only*** relation that truly complies with the CWA and is the extension of the predicate "P(R1) OR P(R2)".

(complies with CWA in the sense that for any absent tuple the corresponding proposition must be false, meaning that if some combo of R1 attribute values makes P(R1) true, then the disjunction is by definition also true, meaning any such R1 tuple must appear "accompanied" by every possible combo of R2 attribute values)

I get that, but that's not quite my concern. Apart from Boolean, the common attribute types we deal with are all infinite. That means the results of ◄NOT► and generalised ◄OR► will almost always be infinite relations. I was taught to be deeply suspicious of any kind of maths that has an infinity as an intermediate result. The CWA exists precisely for the purpose of avoiding having to deal with infinities. I don't see where and how that is dealt with in App-A.

I can do the maths, I just didn't know I had to. I have read App-A or bits of it many times, but I had not realised that two of the core operators routinely produce infinities and have to be dealt with by recourse to the CWA. An observation to that effect would have been helpful.

Andl - A New Database Language - andl.org

#29 · February 28, 2020, 2:32 am

Quote from p c on February 27, 2020, 8:07 pm

to: Greg Klisch

Paul, you are the only correspondent whose posts appear in 20pt. It's very annoying. Please use 16pt like the rest of us.

Addit: I see you say " Because of poor eyesight ..."; then does that mean you can't read other people's posts? That would explain why your posts are so often disconnected from what they're supposedly responding to. Use the screen zoom: then you can read others' posts as well as your own in the same font size.

... difference between conjoin and Codd’s natural join ...

This thread is talking about Appendix A's <OR> and <NOT>. I don't see how talking about various forms of join is helping anybody. If you want to ride your hobby-horse that everything Codd wrote is beyond reproach, and that everything anybody else writes is defective, please put that on another thread (and preferably on another forum; I believe c.d.t. is still going).

#30 · February 28, 2020, 2:43 am

Quote from AntC on February 28, 2020, 2:43 am

Quote from Erwin on February 27, 2020, 2:56 pm

Quote from dandl on February 27, 2020, 1:00 pm

So no, you stand little chance that somebody is going to want to enumerate that.

Thank you. That was much clearer, but now I have to ask: what' s the point?

I guess a similar consideration would apply to ◄NOT►, relational values made up of vast quantities of 'virtual' tuples to be weeded out by subsequent operations.

I can do the maths, but I've always liked to check out the intermediate results to see where we're headed. Maybe not this time.

The point is that the result of "R1 <OR> R2" is the ***only*** relation that truly complies with the CWA and is the extension of the predicate "P(R1) OR P(R2)".

(complies with CWA in the sense that for any absent tuple the corresponding proposition must be false, meaning that if some combo of R1 attribute values makes P(R1) true, then the disjunction is by definition also true, meaning any such R1 tuple must appear "accompanied" by every possible combo of R2 attribute values)

You can easily ask the same question about <NOT>, and that one is even more easily answered : my proofs for the constraint enforcing algorithms in SIRA_PRISE rely on <NOT> (relational complement visavis the universal relation) whenever the operator involved is of the MINUS family.

That such stuff is not intended for actual implementation, does not mean it is not useful for any purpose at all.

Heh heh well ... I actually implemented <OR>, <NOT>, mostly for the purpose of exploring Hall, Hitchcock, Todd 1975.

I made <NOT>, <OR> (except when its operands had the same heading) return 'algorithmic relations'. I used the type system to distinguish algorithmic from grounded relations. I used type-level functions to determine whether some operator invocation returned an algorithmic or grounded relation. For example:

r <AND> <NOT> s // i.e. r NOT MATCHING S

returns a relation of the same algorithmic/grounded type as r. The rules for inference are laid out in HHT.

To be executable, the overall type of an expression must be grounded; if the compiler inferred algorithmic, it threw a type error.

Quote from Erwin on February 27, 2020, 2:56 pm

Quote from dandl on February 27, 2020, 1:00 pm

So no, you stand little chance that somebody is going to want to enumerate that.

Thank you. That was much clearer, but now I have to ask: what' s the point?

I guess a similar consideration would apply to ◄NOT►, relational values made up of vast quantities of 'virtual' tuples to be weeded out by subsequent operations.

I can do the maths, but I've always liked to check out the intermediate results to see where we're headed. Maybe not this time.

The point is that the result of "R1 <OR> R2" is the ***only*** relation that truly complies with the CWA and is the extension of the predicate "P(R1) OR P(R2)".

(complies with CWA in the sense that for any absent tuple the corresponding proposition must be false, meaning that if some combo of R1 attribute values makes P(R1) true, then the disjunction is by definition also true, meaning any such R1 tuple must appear "accompanied" by every possible combo of R2 attribute values)

You can easily ask the same question about <NOT>, and that one is even more easily answered : my proofs for the constraint enforcing algorithms in SIRA_PRISE rely on <NOT> (relational complement visavis the universal relation) whenever the operator involved is of the MINUS family.

That such stuff is not intended for actual implementation, does not mean it is not useful for any purpose at all.

Heh heh well ... I actually implemented <OR>, <NOT>, mostly for the purpose of exploring Hall, Hitchcock, Todd 1975.

I made <NOT>, <OR> (except when its operands had the same heading) return 'algorithmic relations'. I used the type system to distinguish algorithmic from grounded relations. I used type-level functions to determine whether some operator invocation returned an algorithmic or grounded relation. For example:

r <AND> <NOT> s // i.e. r NOT MATCHING S

returns a relation of the same algorithmic/grounded type as r. The rules for inference are laid out in HHT.

To be executable, the overall type of an expression must be grounded; if the compiler inferred algorithmic, it threw a type error.

TTM Forum

The Forum for Discussion about The Third Manifesto and Related Matters