The Forum for Discussion about The Third Manifesto and Related Matters

Questions about some relational operator definitions and comments about the DIVIDEBY operation

Quote from dandl on March 28, 2020, 12:22 am

A similar conundrum arises with aggregation.

  • For each supplier, what was the total weight of purple parts shipped?
  • What was the total weight of purple parts shipped, by supplier?

IMO the main contribution of DIVIDEPER as against DIVIDEBY is not the cute Small/Great title or the D&D self-righteous tone, but being alert to the inherent ambiguity in the English-language query and to the possibility of there being separate answers when the focus is over shipments or over suppliers.

The same ambiguity arises with aggregation: over shipments or over suppliers? SUMMARIZEPER vs SUMMARIZEBY offers a similar choice.

At least that's my takeaway.
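
The two readings can be sketched in Python. This is only an illustration under stated assumptions: the supplier numbers, part weights and shipments are made up, quantities are ignored so each shipment counts the part's weight once, and the function names are hypothetical rather than anything defined by Tutorial D.

```python
# Hypothetical sample data, not taken from the suppliers-and-parts database.
suppliers = {"S1", "S2", "S3"}
purple_weight = {"P1": 12.0, "P2": 17.0}                 # weight of each purple part
shipments = {("S1", "P1"), ("S1", "P2"), ("S2", "P1")}   # (supplier, part) pairs


def total_by_supplier(shipments, weight):
    """Aggregate over shipments: a supplier appears only if it has a purple shipment."""
    totals = {}
    for supplier, part in shipments:
        if part in weight:
            totals[supplier] = totals.get(supplier, 0.0) + weight[part]
    return totals


def total_per_supplier(suppliers, shipments, weight):
    """Aggregate over suppliers: every supplier appears, with 0.0 if it shipped nothing purple."""
    return {
        supplier: sum(
            (weight[part] for s, part in shipments if s == supplier and part in weight),
            0.0,
        )
        for supplier in suppliers
    }


by_result = total_by_supplier(shipments, purple_weight)
per_result = total_per_supplier(suppliers, shipments, purple_weight)
# by_result has no entry for S3; per_result reports S3 with a total of 0.0.
```

The only difference between the two results is S3, which has no shipments: the "by" reading omits it and the "per" reading reports a total of zero.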

That may be because FORALL ***is*** aggregation (using logical and with truth values of propositions).

Aggregation using addition is an iterated formula 'number1 + number2 + number3 + ... + numberN + 0'.  One operand for each number in the set, the zero for being the identity value, and if the set has no numbers then all that remains is just the zero.

Aggregation using conjunction is an iterated formula 'supplier S5 supplies purple part 1 AND supplier S5 supplies purple part 2 AND ... AND TRUE'.  Completely analogous.  One proposition for each purple part in the set, and if there are no purple parts in the set then all that remains is just TRUE.
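
A minimal Python sketch of the analogy, assuming only the standard library; the helper name `aggregate` is made up for illustration. The same fold is used for addition and for conjunction, and in both cases the empty collection yields the identity value.

```python
from functools import reduce
import operator


def aggregate(op, identity, values):
    """Fold `values` with the binary operator `op`, starting from its identity value."""
    return reduce(op, values, identity)


# Addition: identity is 0.
total = aggregate(operator.add, 0, [12.0, 17.0])      # 29.0
empty_total = aggregate(operator.add, 0, [])          # 0

# Conjunction: identity is True. One truth value per purple part.
s5_supplies = [True, True]
forall = aggregate(operator.and_, True, s5_supplies)  # True
empty_forall = aggregate(operator.and_, True, [])     # True, no purple parts at all
```

Python's built-ins behave the same way: `sum([])` is `0` and `all([])` is `True`.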

Quote from Erwin on March 28, 2020, 11:51 am
Quote from dandl on March 28, 2020, 12:22 am

A similar conundrum arises with aggregation.

  • For each supplier, what was the total weight of purple parts shipped?
  • What was the total weight of purple parts shipped, by supplier?

IMO the main contribution of DIVIDEPER as against DIVIDEBY is not the cute Small/Great title or the D&D self-righteous tone, but being alert to the inherent ambiguity in the English-language query and to the possibility of there being separate answers when the focus is over shipments or over suppliers.

The same ambiguity arises with aggregation: over shipments or over suppliers? SUMMARIZEPER vs SUMMARIZEBY offers a similar choice.

At least that's my takeaway.

That may be because FORALL ***is*** aggregation (using logical and with truth values of propositions).

That was kind of the point, but you do have to make a choice about what you're aggregating over.

Aggregation using addition is an iterated formula 'number1 + number2 + number3 + ... + numberN + 0'.  One operand for each number in the set, the zero for being the identity value, and if the set has no numbers then all that remains is just the zero.

Aggregation using conjunction is an iterated formula 'supplier S5 supplies purple part 1 AND supplier S5 supplies purple part 2 AND ... AND TRUE'.  Completely analogous.  One proposition for each purple part in the set, and if there are no purple parts in the set then all that remains is just TRUE.

I'm not entirely prepared to accept that, despite TTM RM Pre 6. Iteration strongly implies an ordering, but the result must be the same regardless of ordering. Also, there are many valid examples of aggregation that do not lend themselves readily to iteration. AVG (MEAN) is the commonest example, but there are many others: MEDIAN, MODE, RANGE, etc.

I prefer to think of aggregation as a single operator applied to a set (bag) of values, all at once, to produce a single result. Whether to use iteration or some other approach is a matter of implementation.

With this approach, the question of the result for an empty set is not automatic, but a matter of choice for the individual function.
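
As a rough illustration of this view, Python's standard `statistics` module treats MEDIAN and MEAN as functions of the whole collection: the result does not depend on the order of the values, and the module's authors chose to raise an error for empty input instead of returning a default. The wrapper `avg_or_none` is a hypothetical example of making a different choice.

```python
import statistics

values = [3, 1, 2]

# The result depends only on the bag of values, not on their order.
same_median = statistics.median(values) == statistics.median(sorted(values))  # True

# For empty input the library's choice is to raise an error.
try:
    statistics.mean([])
    mean_of_empty_raises = False
except statistics.StatisticsError:
    mean_of_empty_raises = True


def avg_or_none(bag):
    """Hypothetical alternative choice: an empty bag yields None instead of an error."""
    bag = list(bag)
    return statistics.mean(bag) if bag else None
```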

Andl - A New Database Language - andl.org

Erwin, I agree that I have been trying to solve a different query. There definitely is a logic difference between ‘supplies’ and ‘have supplied’. I reviewed my previous posts in an effort to remember why I changed the wording. I now believe it was because the DTATRM was proposing a division operation as the solution of the ‘supplies’ query, and I believed that was not correct. Fairly early in this thread of the forum I suggested:

         “If so, then would not the answer to the [original] query --quantified or not quantified--'Get suppliers who supply every ... part' simply be (by definition)?:

                     S {S#}."

My point then was simply that ‘All suppliers supply all parts’. No query is required to answer the question. And these statements by Erwin seem to support that:

         “The query is specifically about which supplier supplies ***ALL OF*** some ***GIVEN SET*** of parts. And yes, the referent in "all suppliers" is indeed just the suppliers, and not "the suppliers who are known to supply any parts".”

and

         “Nobody, and I repeat NOBODY, is saying that S5 "supplies any parts" if S5 is included in the query result for the question "suppliers who supply all purple parts". (And yes, the referent in "all suppliers" is indeed just the suppliers, and not "the suppliers who are known to supply any parts". Had the latter been the intent, then the latter should also have been the wording.)”

At that time, I was perfectly willing to accept that ‘all suppliers supply all parts’ even if there were none. But I did not pursue the idea after that statement, because no one else agreed with it and I then got confused about the FORALL quantifier. So I thought I was wrong and needed to find some other resolution to my mental dilemma. I now think that I should have pursued it. My belief is this: the solution to the query “Get suppliers who supply every purple part” is not any formulation of the division operation. The division operation is a solution only for this query: “Get suppliers who have supplied every purple part”.

As support for this idea I repeat your quote regarding the FORALL quantifier:

           "S5 is a supplier that exists (relvar S says so) and there does not exist any purple part such that S5 does not supply that purple part."

As I understand this, you are saying that because no example can be provided of a purple part that S5 does not supply, that proves that S5 (and all of the other suppliers in S) must supply them all. I agree. Suppose we add a tuple to relation P for a part with P# = P7 and color = ‘purple’; then do the division. We would get an empty set as a result which means that no suppliers supply the purple part. Which is the correct answer to “have supplied” but not to who ‘supplies’.
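
The two cases can be sketched in Python. This is a simplified illustration, not the DIVIDEBY definition itself: the shipment data is hypothetical, and the function `suppliers_supplying_all` is a made-up name that checks, for each supplier in S, that there is no part in the given set which that supplier does not ship.

```python
def suppliers_supplying_all(suppliers, shipments, parts):
    """Suppliers for whom no part in `parts` is missing from their shipments."""
    return {s for s in suppliers if all((s, p) in shipments for p in parts)}


# Hypothetical data: S3, S4 and S5 ship nothing at all.
suppliers = {"S1", "S2", "S3", "S4", "S5"}
shipments = {("S1", "P1"), ("S2", "P1"), ("S2", "P2")}

# Case 1: there are no purple parts, so every supplier qualifies, including S5.
no_purple_parts = set()
result_empty = suppliers_supplying_all(suppliers, shipments, no_purple_parts)

# Case 2: P7 is added as a purple part and nobody ships it, so nobody qualifies.
purple_parts = {"P7"}
result_p7 = suppliers_supplying_all(suppliers, shipments, purple_parts)
```

In the first case `all(...)` ranges over an empty set and is true for every supplier; in the second, the single purple part P7 has no shipments, so the result is empty.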

Perhaps it was not I who first conflated ‘supplies’ with ‘have supplied’.

P.S.  Thank you for the referral to "The Traditional Square of Opposition".

Quote from Greg Klisch on March 30, 2020, 5:51 pm

Suppose we add a tuple to relation P for a part with P# = P7 and color = ‘purple’; then do the division. We would get an empty set as a result which means that no suppliers supply the purple part. Which is the correct answer to “have supplied” but not to who ‘supplies’.

Perhaps it was not I who first conflated ‘supplies’ with ‘have supplied’.

"We would get an empty set as a result which means that no suppliers supply the purple part."  Yes, but with the not-so-unimportant nitpicking remark that the query still gives just the answer to "suppliers who supply ***ALL*** purple parts" and that your characterisation of the result is correct only by the "coincidence" that because we only added P7, that "set of all purple parts" is just {P7} and therefore the answer to the query is the same answer as to the one of asking who supplies P7.  Just saying this because it is at all times crucially important to just be precise (and when discussing some "the original problem", stay as close to just that original as one can and nothing else).  One can go a long way with that even in natural language, if only one cares.

"Which is the correct answer to “have supplied” but not to who ‘supplies’."  Now we're back onto possible confusions over the precise meaning of the word "supplies" in some -real-world, business- context that just isn't given for the suppliers-and-parts database.  Perhaps this is why D&D have preferred to move on to circles and ellipses, hoping there would be more agreement amongst readers on what is an ellipse and what isn't than they had observed existed amongst readers on what it means to "supply" and what it doesn't mean.  Perhaps not the easiest of exercises, but with the suppliers-and-parts database you just have to take it for granted that the word "supplies" has ***some*** agreed-upon meaning among the real-world users of that database, and try to imagine that this is also the meaning it has to you even if you don't know what it is ...

"Perhaps it was not I who first ..."  Probably not.  I believe I already stated earlier in the thread how regrettable the choice for "supplies" has turned out to be.

Edit to remove erroneous statements:

Erwin,

I finally understand: "All suppliers supply all parts in the set of parts, that happens to be empty, that satisfy the SELECTion criteria." And I see that we have to include all the suppliers, relation S, in the query in order to determine which of those do or do not supply the set of parts.

Your use of an RVA example, together with a re-reading of chapter 8 of Database Explorations, in which Date and Darwen explain RVAs with the aid of figures and diagrams, helped me visualize that.

Thank you for your persistence in trying to get me to understand that.

Also, thank you to all the other contributors to this thread.
