The Forum for Discussion about The Third Manifesto and Related Matters

Questions about some relational operator definitions and comments about the DIVIDEBY operation

Quote from Erwin on March 26, 2020, 7:35 pm
Quote from dandl on March 26, 2020, 12:31 pm

they would smile politely and write you off as crazy.

Etre tenu pour un idiot par un imbécile est un savour de fin gourmet. (Roughly: "To be taken for an idiot by an imbecile is a delicacy for a fine gourmet.")

And even Blaise Pascal "smiled politely and wrote them off as crazy", the people who proposed an arithmetic with numbers that could be negative, arguing "I cannot imagine how one could take away 4 things from no things at all and end up with something different than still just no things at all".

Somehow, when we're discussing the finer points of translating an English language request into formal logic, a response in the form of a French aphorism seems inapt. Especially when it's misquoted. And adding a French mathematician to the mix scarcely helps, particularly since that was a raging dispute within mathematics, and this is about making sense of English.

So, what precisely do you say your non-technical end-user English customer is asking for when they request "get the suppliers who supply all purple parts", what answer do you say will meet the requirement they have requested using these words, and why?

Andl - A New Database Language - andl.org
Quote from Erwin on March 26, 2020, 11:50 am
Quote from Greg Klisch on March 25, 2020, 10:25 pm
Quote from Erwin on March 25, 2020, 10:09 pm
Quote from Greg Klisch on March 25, 2020, 7:33 pm

The premises are:

a) If a supplier s supplies a part then there will be a tuple {s,p,..} in relation SP.  This premise follows directly from the predicate for SP.

b) If there is not a tuple for {s,p,...} in relation SP then supplier s does not supply that part p.  This follows from the predicate for SP as well as from the CWA.

c) There is a tuple for {s,p,..} if and only if supplier s supplies part p.  This follows from a) and b).

It follows from c) that if s supplies a purple part p then there would be a tuple {s,p,...} in SP for that pair of attributes.  There is not such a pair.

Logic error.

Mathematical convention says that FORALL x : P(x) ===== NOT EXISTS x : NOT P(x)

The consequence is that if `NOT EXISTS <some hypothetical purple part>` is true in and of itself, then surely any further restriction ("such that <some given supplier> does _NOT_ supply that purple part") of that empty set will be empty too, so it will have no members, so the `NOT EXISTS` is indeed true, and for the tautology expressed in the equation above, it means that the FORALL must be true.

Get it out of your head that stating a universal quantification _presumes_ the very existence of at least one `x` in the first place.  It does not.

Not so long ago I was pointed to a paper "the traditional square of opposition".  It explains the views that have been held about this issue throughout the centuries ever since Aristofocles (sorry), and the key phrase to remember from that doc went something like "If you ***want*** to make that presumption of the very existence of at least one `x` [as a precondition for your universal quantification to be true], then the device of logic per se can ***easily*** express this by saying   FORALL x : P(x) ===== NOT EXISTS x : NOT P(x)   ***   & EXISTS x   ***.  It's not a problem of logic per se.".  It's a problem of being precise about expressing what your presumptions are.  (And the derived ensuing problem is one of the lengths of the resulting logical formulae and possibility of the "& EXISTS x" part leading to combinatorial explosion of cases if we find we have to apply laws of distributivity of logical AND and OR.)
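To make the two readings concrete, here is a minimal Python sketch. The sample sets and the `supplies` helper are made up for illustration and are not from the thread; the only thing borrowed from above is the pair of formulas, with and without the extra "& EXISTS x".

```python
# Illustrative sketch only: the data and the helper name are assumptions.
purple_parts = set()      # assumption: no purple parts exist
supplied_by_s5 = set()    # assumption: S5 supplies nothing at all

def supplies(part):
    """P(x): S5 supplies part x."""
    return part in supplied_by_s5

# FORALL x : P(x)
forall = all(supplies(p) for p in purple_parts)

# NOT EXISTS x : NOT P(x)
not_exists_not = not any(not supplies(p) for p in purple_parts)

# The stricter reading: (NOT EXISTS x : NOT P(x)) & EXISTS x
forall_with_existence = not_exists_not and len(purple_parts) > 0

print(forall, not_exists_not, forall_with_existence)  # True True False
```

Python's built-in `all` and `any` follow the same convention over an empty iterable, so the first two values agree and only the version with the explicit existence clause comes out false.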

I'm not sure what I said to make you think that I was presuming that universal quantification required at least one 'x'.  I don't think that.  What I was awkwardly attempting to say was that all of the suppliers in SP (after the RESTRICT operation left no suppliers) actually supplied all of the P(x) which, of course, was empty.

There are no "suppliers in SP".  SP states facts about suppliers supplying some part.

The fact of the matter is just this: even if supplier S5 supplies no parts at all, he must still appear in the answer to the query "get the suppliers who supply all purple parts" in the case where S5 indeed exists and there are indeed no purple parts at all.  That is not for you to decide, that is for you to accept and deal with.  Logic says so, period.  S5 is a supplier that exists (relvar S says so) and there does not exist any purple part such that S5 does not supply that purple part.  So S5 supplies all purple parts.  And the ***only*** way to achieve that result is by performing some operation that looks at all three of {S, P, SP}.  Have to look at SP because obviously that's where you find whether some given supplier supplies some given part.  Have to look at P because obviously that's the place to go to figure out what that set of "all parts" (/"all purple parts") is.  And you HAVE TO look at S for the reasons both Date and me (and who knows who else) seem consistently to fail to explain, though not for not trying.

I'm just saying this because your case with PP and SPP ***is a different question***.  It appears to be the query "get the suppliers who supply ***some purple part and*** indeed all of them".  If what you're after is exposing an "error" in the treatment of some given specific case, then it is no use to come up with a ***different*** case.
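The two queries being contrasted can be sketched in Python over plain sets. The sample values for S, P and SP and both function names are hypothetical; this is one reading of the two questions, not the definition of any DIVIDEBY variant.

```python
# Hypothetical sample data in the style of the suppliers-and-parts database.
S = {"S1", "S2", "S5"}                            # supplier numbers
P = {("P1", "Red"), ("P2", "Green")}              # (part number, colour); no purple parts
SP = {("S1", "P1"), ("S1", "P2"), ("S2", "P1")}   # (supplier, part); S5 supplies nothing

def suppliers_supplying_all(colour):
    """'Get the suppliers who supply all <colour> parts': looks at S, P and SP."""
    wanted = {p for (p, c) in P if c == colour}
    return {s for s in S if all((s, p) in SP for p in wanted)}

def suppliers_supplying_some_and_all(colour):
    """'... who supply some <colour> part and indeed all of them'."""
    wanted = {p for (p, c) in P if c == colour}
    return {s for s in S
            if len(wanted) > 0 and all((s, p) in SP for p in wanted)}

print(sorted(suppliers_supplying_all("Purple")))           # ['S1', 'S2', 'S5']
print(sorted(suppliers_supplying_some_and_all("Purple")))  # []
print(sorted(suppliers_supplying_all("Red")))              # ['S1', 'S2']
```

With no purple parts in P, the first query returns every supplier in S, including S5, while the second returns nothing. As soon as the wanted set is non-empty (the "Red" case) the two agree.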

I'm also a bit uncertain whose English it is that [you think] needs to "get fixed", but there certainly seems to be nothing that needs fixing in "get the suppliers who supply all purple parts".

"They added a third relation to the division specifically to support their choice that all suppliers supply non-existent parts."  No they didn't.  And it was not "their choice".  It was a "choice" made by logicians a long time ago.  Date and Codd weren't even born and computers existed only in crazy dreams of crazy people.  It is explained in that "square of opposition" paper.

And it is not the case that "all suppliers supply non-existent parts".  They simply can't.  It is only the case that "all suppliers supply ***all*** non-existent parts".  That's a ***different*** proposition.

And I suspect you are indeed making one of the "wrong" inferences pointed out by the paper : namely that the non-existence of, say, foobars, makes it impossible for suppliers to "supply all foobars" or "all types of foobar" or whatever.  If something doesn't exist, then how can anyone supply all of those somethings ?  Right ?  Well I'd say go read it here : https://plato.stanford.edu/entries/square/

You say: "Have to look at SP because obviously that's where you find whether some given supplier supplies some given part".  And that is exactly the premise from which the solution follows.   The query is specifically about which supplier supplies some given part. And to me, that means specifically the suppliers in SP. It's not about which supplier is capable of supplying some part.  And according to the CWA:  "The closed-world assumption (CWA), in a formal system of logic used for knowledge representation, is the presumption that a statement that is true is also known to be true. Therefore, conversely, what is not currently known to be true, is false." Date has said that his relational algebra adheres to the CWA. Stating that any supplier other than those in SP supply any parts seems to contradict the CWA.

You also say: "And you HAVE TO look at S for the reasons both Date and me (and who knows who else) seem consistently to fail to explain, though not for not trying."  It's true:  I don't see a reason to include S at all.  It's not required for the division.  Although to be exact the RESTRICT operation is not part of the division: it's a separate operation on the operands that will then participate in the division.  Also, the FORALL term does not apply to all suppliers but only those who supply parts.  I understand that there seems to be a philosophical reason whereby one might choose to say all suppliers supply a non-existent part; after all who is to say which is correct: a) they all supply it; or b) no one does.  But is there a 'mathematical' reason?  I prefer b) because it follows from the predicate for SP and from the premises of the query and it adheres to the CWA.

Quote from Greg Klisch on March 27, 2020, 4:37 am

Stating that any supplier other than those in SP supply any parts seems to contradict the CWA.

 

NO, god dammed.  Nobody, and I repeat NOBODY, is saying that S5 "supplies any parts" if S5 is included in the query result for the question "suppliers who supply all purple parts".  (And yes, the referent in "all suppliers" is indeed just the suppliers, and not "the suppliers who are known to supply any parts".  Had the latter been the intent, then the latter should also have been the wording and anyone who uses the former wording for the latter intent, is him(her)self the one who is making the errors.)

Here is a thinking exercise.  We take the result of the query, and it includes S5.  We EXTEND that result with an RVA whose value is the identities (P# value) of all the parts that are purple (that RVA is thus the empty relation).  We UNGROUP that relation.  What do you think will happen with all those tuples with an empty RVA ?  They won't contribute to the result because there are no P# values to take from the empty RVA value.  So the result of the UNGROUP will be empty.  And what do you think the external predicate of that UNGROUP result is ?  Yes, 'supplier S# supplies purple part P#'.  And there are no tuples so this is saying no one supplies any purple part.  Despite there being suppliers who supply all of them.
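The exercise can be traced in a few lines of Python, modelling tuples as dicts and the relation-valued attribute as a frozenset. The sample result and the attribute name `PP` are assumptions chosen for illustration; this only imitates EXTEND and UNGROUP, it is not Tutorial D.

```python
# Hypothetical data: the query result includes S5, and no purple parts exist.
result = [{"S#": "S1"}, {"S#": "S2"}, {"S#": "S5"}]
purple_part_numbers = frozenset()   # the RVA value: an empty set of P# values

# EXTEND: add a relation-valued attribute PP holding all purple part numbers.
extended = [dict(t, PP=purple_part_numbers) for t in result]

# UNGROUP: one output tuple per (outer tuple, member of its RVA).
ungrouped = [{"S#": t["S#"], "P#": p} for t in extended for p in t["PP"]]

print(len(extended))  # 3
print(ungrouped)      # []
```

Each of the three extended tuples carries an empty `PP`, so none of them contributes a row to the ungrouped result.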

And for those interested in what I'd reply to my non-technical user if he asks me for the suppliers who supply all purple parts : I'd ask him whether he really means "all suppliers" or "suppliers known to supply parts".  (That is also the recommendation given -very explicitly- by "Applied Mathematics for Database Professionals" BTW.)  And perhaps I might even explain the difference and why I'm asking and give them a little education along the way.

Greg Klisch wrote: "... The premises are:

"a) If a supplier s supplies a part p then there will be a tuple {s,p,..} in relation SP.  This premise follows directly from the predicate for SP.

"b) If there is not a tuple for {s,p,...} in relation SP then supplier s does not supply that part p.  This follows from the predicate for SP as well as from the CWA.

"c) There is a tuple for {s,p,..} if and only if supplier s supplies part p.  This follows from a) and b).

"It follows from c) that if s supplies a purple part p then there would be a tuple {s,p,...} in SP for that pair of attributes.  There is not such a pair. …"

The above is a rare argument, at least in this group, against the introduction of imaginary information, bound to be outnumbered by the fantasies of technicians who favour simplistic shell games like the so-called relation divide operators, quoting shallow snippets of something they call "logic" all the way home, contrary to Codd's relational logic, quoting informal predicates constantly without ever writing formal arguments in predicate form aka first order wff's.

More formally, suppose p stands for the set of all purple parts and s for the set of supplied purple parts. So the intersection (p &-s) aka (p And (Not s)) must give the set of unsupplied purple parts and -(p&-s) aka Not(p&-s) must stand for the set of supplied purple parts, or equivalently the argument ((Not s) bi-implies Not (p And Not s)).

The corresponding elementary equivalence, aka logically valid argument is (-s=-(p&-s))=(-p&-s). In other words the conclusion is logically valid only when there Exist NO purple parts AND NO supplied purple parts. Obviously the DTATRM division variants must be taking logically invalid shortcuts to get their results, there can be no other explanation, for example, assuming purple parts Exist when they don't exist. 

Just as obviously the equivalence ((s=(p&-s))=(-p&-s)) is also logically valid so Greg K is correct when he writes "...if and only if…". In other words, it must be a premise of the sample db value that the sets of supplied purple parts and unsupplied purple parts are the same set. If the proponents of the DTATRM divide variants went to the trouble of formal symbolic arguments, you can predict that at some point their arguments will switch horses in mid-stream, introducing imaginary premises and imaginary information.
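For anyone who wants to check the two formulas as written, a brute-force truth table is enough. The sketch below reads p and s as plain propositional variables and "=" as bi-implication, which is an assumption, since the post speaks of sets; it tests only the formulas themselves, not the conclusions drawn from them.

```python
from itertools import product

def iff(a, b):
    """Bi-implication on booleans."""
    return a == b

def first(p, s):
    # (-s = -(p & -s)) = (-p & -s)
    return iff(iff(not s, not (p and not s)), (not p) and (not s))

def second(p, s):
    # (s = (p & -s)) = (-p & -s)
    return iff(iff(s, p and not s), (not p) and (not s))

rows = list(product([False, True], repeat=2))

print(all(first(p, s) for p, s in rows))    # True
print(all(second(p, s) for p, s in rows))   # True

# The inner equivalence -s = -(p & -s) on its own holds in one row only.
print([(p, s) for p, s in rows if iff(not s, not (p and not s))])  # [(False, False)]
```

Both outer formulas hold in all four rows under this reading, and the inner equivalence of the first one holds only when p and s are both false.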

(Apparently George Boole defined disjunction as disjoint disjunction, then others who I presume included de Morgan tried to be more general. Codd's logic was specific to his original theory.)

 

And for those interested in what I'd reply to my non-technical user if he asks me for the suppliers who supply all purple parts : I'd ask him whether he really means "all suppliers" or "suppliers known to supply parts".  (That is also the recommendation given -very explicitly- by "Applied Mathematics for Database Professionals" BTW.)  And perhaps I might even explain the difference and why I'm asking and give them a little education along the way.

And that's the right thing to do. Idiomatic English has ambiguities and this is one of them. Until we get an Orwellian NewSpeak without these ambiguities and persuade everyone to use it, that's a fact of life and IMO we should avoid D&D claims that we have the one solution that is "guaranteed to give the right answer in all cases".

Meanwhile, people get to write long articles on the subject, such as this one: http://www.cs.miami.edu/home/geoff/Courses/CSC648-12S/Content/EnglishToLogic.shtml.

Quote from Erwin on March 27, 2020, 9:18 am
Quote from Greg Klisch on March 27, 2020, 4:37 am

Stating that any supplier other than those in SP supply any parts seems to contradict the CWA.

 

NO, god dammed.  Nobody, and I repeat NOBODY, is saying that S5 "supplies any parts" if S5 is included in the query result for the question "suppliers who supply all purple parts".  (And yes, the referent in "all suppliers" is indeed just the suppliers, and not "the suppliers who are known to supply any parts".  Had the latter been the intent, then the latter should also have been the wording and anyone who uses the former wording for the latter intent, is him(her)self the one who is making the errors.)

Here is a thinking exercise.  We take the result of the query, and it includes S5.  We EXTEND that result with an RVA whose value is the identities (P# value) of all the parts that are purple (that RVA is thus the empty relation).  We UNGROUP that relation.  What do you think will happen with all those tuples with an empty RVA ?  They won't contribute to the result because there are no P# values to take from the empty RVA value.  So the result of the UNGROUP will be empty.  And what do you think the external predicate of that UNGROUP result is ?  Yes, 'supplier S# supplies purple part P#'.  And there are no tuples so this is saying no one supplies any purple part.  Despite there being suppliers who supply all of them.

And for those interested in what I'd reply to my non-technical user if he asks me for the suppliers who supply all purple parts : I'd ask him whether he really means "all suppliers" or "suppliers known to supply parts".  (That is also the recommendation given -very explicitly- by "Applied Mathematics for Database Professionals" BTW.)  And perhaps I might even explain the difference and why I'm asking and give them a little education along the way.

Did you not just demonstrate, with the GROUP UNGROUP exercise, that the result of the Small Divide query produces a contradiction?  The Small Divide is the operation that produces a result that includes S5.  What I proposed does not include S5 and therefore doesn't produce that contradiction.

So, if having S5 in the result doesn't mean that S5 supplies purple parts, then what does it mean?

Quote from Greg Klisch on March 27, 2020, 7:27 pm

So, if having S5 in the result doesn't mean that S5 supplies purple parts, then what does it mean?

As I've already said dozens of times, it means he supplies all of them.  And that is ***NOT*** the same proposition as "at least one" (and also not "some of them").  And inferring the latter from the former is inferring existence from a universality being considered true and ***THAT*** is the ***LOGICAL ERROR*** that all the talk in "square of opposition" is about.

It means the set of purple parts supplied by S5 (the empty set) is the same as the set of purple parts in existence (the empty set).

It seems to me you fall into the same trap that many people do, to believe that logic is about "making sense".  Forget it.  It isn't.  It's about being coherent.  And logicians much smarter than you and me -and all the rest of us here including the likes of P C (especially those)- have discovered long before we were born that the only way to retain coherence when "empty sets can be involved", is to accept the unintuitive fact that every universal quantification over an empty set must be considered true.

What my exercise was trying to show is that ***THERE IS NO CONTRADICTION***.  If the set of purple parts supplied by S5 is indeed the empty set then the ***ONLY*** way for the relation that is the result of the EXTEND in my exercise to be in accordance with the CWA is to ***INCLUDE*** S5 along with the ***empty set*** of purple parts he does supply.  The CWA means the relation is the extension of the predicate.  If the [external] predicate is "the set of purple parts supplied by <S#> is <attr-name-of-RVA-here>" then the ***ONLY*** way for the extension to accurately reflect the real-world situation is to include S5 alongside an empty relation.  And if supplier S7 does not exist, then supplier S7 should not be included in the result because the very phrasing of the question "get all suppliers" means including only those identifiers that identify a supplier confirmed to be in existence, which is what relvar S does.  "Getting all suppliers" presumably means we can safely leave out the non-suppliers, because including a non-supplier such as S27384 obviously fails to satisfy the requirement, given that, according to the CWA applied to S, S27384 ***is not a supplier***.  But if S5 is somewhere somehow confirmed to indeed be a supplier, and indeed he supplies the same set of purple parts as the set of purple parts that is confirmed to be in existence, then S5 should be included.

And this is where I exit.

Quote from Greg Klisch on March 27, 2020, 4:37 am

The query is specifically about which supplier supplies some given part.

No, it isn't.  The query is specifically about which supplier supplies ***ALL OF*** some ***GIVEN SET*** of parts.

When for Pete's sake are you going to see the ***DIFFERENCE*** between those questions ?

I apologize, my sentence is erroneous.  I think the main point on which we are not in agreement is that I believe no suppliers supply the given set of purple parts and you state that all suppliers supply the set of all purple parts.  At least that was what I was attempting to demonstrate.

Earlier you said, "S5 is a supplier that exists (relvar S says so) and there does not exist any purple part such that S5 does not supply that purple part."

As I understand this:  you are saying that because no example can be provided of a purple part that S5 does not supply, that proves that S5 (and all of the other suppliers in S) must supply them all.  OK.

It seems I also have a misunderstanding of the CWA.

In any case, thank you for the attempts to educate me.

A similar conundrum arises with aggregation.

  • For each supplier, what was the total weight of purple parts shipped?
  • What was the total weight of purple parts shipped, by supplier?

IMO the main contribution of DIVIDEPER as against DIVIDEBY is not the cute Small/Great title or the D&D self-righteous tone, but being alert to the inherent ambiguity in the English language query and to the possibility of there being separate answers when the focus is over shipments or over suppliers.

The same ambiguity arises with aggregation: over shipments or over suppliers? SUMMARIZEPER vs SUMMARIZEBY offers a similar choice.
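A small Python sketch of the two aggregations, with made-up shipment data. Mapping the shipment-focused version to SUMMARIZEBY and the supplier-focused version to SUMMARIZEPER is my reading of the two questions, not a statement of how either operator is defined.

```python
# Hypothetical data: S5 is a supplier but has shipped no purple parts.
S = {"S1", "S2", "S5"}
purple_shipments = [          # (supplier, part, weight shipped)
    ("S1", "P3", 12.0),
    ("S1", "P4", 5.0),
    ("S2", "P3", 12.0),
]

# Focus on shipments: group the shipment rows by supplier.
by_shipment = {}
for s, _, w in purple_shipments:
    by_shipment[s] = by_shipment.get(s, 0.0) + w

# Focus on suppliers: one row per supplier in S; an empty sum is 0.0.
per_supplier = {
    s: sum((w for s2, _, w in purple_shipments if s2 == s), 0.0)
    for s in S
}

print(sorted(by_shipment.items()))   # [('S1', 17.0), ('S2', 12.0)]
print(sorted(per_supplier.items()))  # [('S1', 17.0), ('S2', 12.0), ('S5', 0.0)]
```

S5 appears with a total of 0.0 only when the aggregation ranges over suppliers, which is the same split as S5 appearing or not appearing in the divide result.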

At least that's my takeaway.
