SUMMARIZE PER, OUTER JOIN and image relations
Quote from dandl on February 15, 2020, 10:34 amSome time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Some time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Quote from Hugh on February 15, 2020, 1:10 pmQuote from dandl on February 15, 2020, 10:34 amSome time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Re strategy 1, I don't really understand "bundle up attributes that might go missing" but in any case I can't see how you think this solves the problem with the likes of AVG.
Re strategy 2, what do you mean by "right answer"?
It seems to me that all cases need the use of r1 UNION r2 in which one of the operands (r2, say) is an ad hoc arrangement generating tuples of the same type as those of r1. Here's my attempt to grapple with the AVG problem:
(EXTEND (SUMMARIZE r1 PER(r2 MATCHING r1) : {x := AVG(y)}) : {rx := rel{tup{x x}}}){ALL BUT x}
UNION
EXTEND (r2 NOT MATCHING r1) : {rx := REL{x RAT}{}}Is that the kind of thing you meant? I tried it out in Rel using my "Beyond the Four Fours" database, with Recalcitrant and Cases for r1 and r2, respectively. This is what I got:
i
INTEGERj
INTEGERk
INTEGERl
INTEGERrx
RELATION
x
RATIONAL1 1 1 1
x
RATIONAL65.1 1 1 1 2
x
RATIONAL72.28571428571429 1 1 1 3
x
RATIONAL78.16666666666667 1 1 1 5
x
RATIONAL66.66666666666667 1 1 1 6
x
RATIONAL90.2 1 1 1 7
x
RATIONAL77.75 1 1 1 8
x
RATIONAL66.85714285714286 1 1 1 9
x
RATIONAL63.666666666666664 1 1 2 2
x
RATIONAL77.0 1 1 3 3
x
RATIONAL91.5 1 1 6 7
x
RATIONAL92.0 1 1 7 7
x
RATIONAL93.4 1 1 8 8
x
RATIONAL95.0 1 2 2 2
x
RATIONAL80.0 1 2 2 7
x
RATIONAL89.0 1 2 2 8
x
RATIONAL93.0 1 2 8 8
x
RATIONAL95.0 1 3 3 3
x
RATIONAL84.0 1 3 3 6
x
RATIONAL85.0 1 3 3 7
x
RATIONAL95.0 1 3 6 7
x
RATIONAL89.0 1 6 6 6
x
RATIONAL91.5 1 6 6 7
x
RATIONAL92.0 1 7 7 7
x
RATIONAL69.7 1 7 7 8
x
RATIONAL70.0 1 8 8 8
x
RATIONAL54.0 2 2 2 2
x
RATIONAL70.76470588235294 2 2 2 3
x
RATIONAL88.66666666666667 2 2 2 4
x
RATIONAL90.0 2 2 2 5
x
RATIONAL81.4 2 2 2 6
x
RATIONAL82.1 2 2 2 7
x
RATIONAL91.5 2 2 2 8
x
RATIONAL95.0 2 2 3 6
x
RATIONAL97.0 2 2 3 7
x
RATIONAL94.0 2 2 5 5
x
RATIONAL73.0 2 2 6 6
x
RATIONAL80.0 2 2 7 7
x
RATIONAL93.25 2 2 8 8
x
RATIONAL86.0 2 3 6 6
x
RATIONAL81.0 2 3 6 8
x
RATIONAL97.0 2 5 5 5
x
RATIONAL82.0 2 5 8 8
x
RATIONAL99.0 2 6 6 6
x
RATIONAL76.77777777777777 2 6 6 7
x
RATIONAL89.0 2 6 6 8
x
RATIONAL62.0 2 6 7 7
x
RATIONAL98.0 2 7 7 7
x
RATIONAL80.83333333333333 2 7 7 8
x
RATIONAL95.0 2 8 8 8
x
RATIONAL71.57142857142857 3 3 3 3
x
RATIONAL80.625 3 3 3 5
x
RATIONAL79.0 3 3 3 6
x
RATIONAL91.42857142857143 3 3 3 7
x
RATIONAL84.14285714285714 3 3 6 6
x
RATIONAL79.83333333333333 3 3 6 7
x
RATIONAL92.0 3 3 7 7
x
RATIONAL87.0 3 5 5 5
x
RATIONAL83.0 3 6 6 6
x
RATIONAL83.25 3 6 6 7
x
RATIONAL93.5 3 6 7 7
x
RATIONAL92.0 3 7 7 7
x
RATIONAL74.5 3 8 8 8
x
RATIONAL56.333333333333336 4 6 6 6
x
RATIONAL84.125 4 6 7 7
x
RATIONAL97.0 4 6 8 8
x
RATIONAL69.0 4 7 7 7
x
RATIONAL83.0 5 5 5 5
x
RATIONAL78.0 5 5 6 6
x
RATIONAL97.0 5 5 7 7
x
RATIONAL94.0 5 6 6 6
x
RATIONAL94.5 5 7 7 7
x
RATIONAL76.66666666666667 6 6 6 6
x
RATIONAL71.28571428571429 6 6 6 7
x
RATIONAL88.5 6 6 6 8
x
RATIONAL59.75 6 6 6 9
x
RATIONAL88.0 6 6 7 7
x
RATIONAL84.8 6 6 8 8
x
RATIONAL69.5 6 6 9 9
x
RATIONAL88.0 6 7 7 7
x
RATIONAL83.63636363636364 6 8 8 8
x
RATIONAL66.6 7 7 7 7
x
RATIONAL54.6 7 7 7 8
x
RATIONAL56.9 7 7 7 9
x
RATIONAL68.0 7 7 8 8
x
RATIONAL44.0 7 8 8 8
x
RATIONAL66.83333333333333 8 8 8 8
x
RATIONAL59.25 8 8 8 9
x
RATIONAL43.0 1 1 1 4
x
RATIONAL1 1 2 3
x
RATIONAL1 1 2 4
x
RATIONAL1 1 2 5
x
RATIONAL1 1 2 6
x
RATIONAL1 1 2 7
x
RATIONAL1 1 2 8
x
RATIONAL1 1 2 9
x
RATIONAL1 1 3 4
x
RATIONAL1 1 3 5
x
RATIONAL1 1 3 6
x
RATIONAL1 1 3 7
x
RATIONAL1 1 3 8
x
RATIONAL1 1 3 9
x
RATIONAL1 1 4 4
x
RATIONAL1 1 4 5
x
RATIONAL1 1 4 6
x
RATIONAL1 1 4 7
x
RATIONAL1 1 4 8
x
RATIONAL1 1 4 9
x
RATIONAL1 1 5 5
x
RATIONAL1 1 5 6
x
RATIONAL1 1 5 7
x
RATIONAL1 1 5 8
x
RATIONAL1 1 5 9
x
RATIONAL1 1 6 6
x
RATIONAL1 1 6 8
x
RATIONAL1 1 6 9
x
RATIONAL1 1 7 8
x
RATIONAL1 1 7 9
x
RATIONAL1 1 8 9
x
RATIONAL1 1 9 9
x
RATIONAL1 2 2 3
x
RATIONAL1 2 2 4
x
RATIONAL1 2 2 5
x
RATIONAL1 2 2 6
x
RATIONAL1 2 2 9
x
RATIONAL1 2 3 3
x
RATIONAL1 2 3 4
x
RATIONAL1 2 3 5
x
RATIONAL1 2 3 6
x
RATIONAL1 2 3 7
x
RATIONAL1 2 3 8
x
RATIONAL1 2 3 9
x
RATIONAL1 2 4 4
x
RATIONAL1 2 4 5
x
RATIONAL1 2 4 6
x
RATIONAL1 2 4 7
x
RATIONAL1 2 4 8
x
RATIONAL1 2 4 9
x
RATIONAL1 2 5 5
x
RATIONAL1 2 5 6
x
RATIONAL1 2 5 7
x
RATIONAL1 2 5 8
x
RATIONAL1 2 5 9
x
RATIONAL1 2 6 6
x
RATIONAL1 2 6 7
x
RATIONAL1 2 6 8
x
RATIONAL1 2 6 9
x
RATIONAL1 2 7 7
x
RATIONAL1 2 7 8
x
RATIONAL1 2 7 9
x
RATIONAL1 2 8 9
x
RATIONAL1 2 9 9
x
RATIONAL1 3 3 4
x
RATIONAL1 3 3 5
x
RATIONAL1 3 3 8
x
RATIONAL1 3 3 9
x
RATIONAL1 3 4 4
x
RATIONAL1 3 4 5
x
RATIONAL1 3 4 6
x
RATIONAL1 3 4 7
x
RATIONAL1 3 4 8
x
RATIONAL1 3 4 9
x
RATIONAL1 3 5 5
x
RATIONAL1 3 5 6
x
RATIONAL1 3 5 7
x
RATIONAL1 3 5 8
x
RATIONAL1 3 5 9
x
RATIONAL1 3 6 6
x
RATIONAL1 3 6 8
x
RATIONAL1 3 6 9
x
RATIONAL1 3 7 7
x
RATIONAL1 3 7 8
x
RATIONAL1 3 7 9
x
RATIONAL1 3 8 8
x
RATIONAL1 3 8 9
x
RATIONAL1 3 9 9
x
RATIONAL1 4 4 4
x
RATIONAL1 4 4 5
x
RATIONAL1 4 4 6
x
RATIONAL1 4 4 7
x
RATIONAL1 4 4 8
x
RATIONAL1 4 4 9
x
RATIONAL1 4 5 5
x
RATIONAL1 4 5 6
x
RATIONAL1 4 5 7
x
RATIONAL1 4 5 8
x
RATIONAL1 4 5 9
x
RATIONAL1 4 6 6
x
RATIONAL1 4 6 7
x
RATIONAL1 4 6 8
x
RATIONAL1 4 6 9
x
RATIONAL1 4 7 7
x
RATIONAL1 4 7 8
x
RATIONAL1 4 7 9
x
RATIONAL1 4 8 8
x
RATIONAL1 4 8 9
x
RATIONAL1 4 9 9
x
RATIONAL1 5 5 5
x
RATIONAL1 5 5 6
x
RATIONAL1 5 5 7
x
RATIONAL1 5 5 8
x
RATIONAL1 5 5 9
x
RATIONAL1 5 6 6
x
RATIONAL1 5 6 7
x
RATIONAL1 5 6 8
x
RATIONAL1 5 6 9
x
RATIONAL1 5 7 7
x
RATIONAL1 5 7 8
x
RATIONAL1 5 7 9
x
RATIONAL1 5 8 8
x
RATIONAL1 5 8 9
x
RATIONAL1 5 9 9
x
RATIONAL1 6 6 8
x
RATIONAL1 6 6 9
x
RATIONAL1 6 7 7
x
RATIONAL1 6 7 8
x
RATIONAL1 6 7 9
x
RATIONAL1 6 8 8
x
RATIONAL1 6 8 9
x
RATIONAL1 6 9 9
x
RATIONAL1 7 7 9
x
RATIONAL1 7 8 8
x
RATIONAL1 7 8 9
x
RATIONAL1 7 9 9
x
RATIONAL1 8 8 9
x
RATIONAL1 8 9 9
x
RATIONAL1 9 9 9
x
RATIONAL2 2 2 9
x
RATIONAL2 2 3 3
x
RATIONAL2 2 3 4
x
RATIONAL2 2 3 5
x
RATIONAL2 2 3 8
x
RATIONAL2 2 3 9
x
RATIONAL2 2 4 4
x
RATIONAL2 2 4 5
x
RATIONAL2 2 4 6
x
RATIONAL2 2 4 7
x
RATIONAL2 2 4 8
x
RATIONAL2 2 4 9
x
RATIONAL2 2 5 6
x
RATIONAL2 2 5 7
x
RATIONAL2 2 5 8
x
RATIONAL2 2 5 9
x
RATIONAL2 2 6 7
x
RATIONAL2 2 6 8
x
RATIONAL2 2 6 9
x
RATIONAL2 2 7 8
x
RATIONAL2 2 7 9
x
RATIONAL2 2 8 9
x
RATIONAL2 2 9 9
x
RATIONAL2 3 3 3
x
RATIONAL2 3 3 4
x
RATIONAL2 3 3 5
x
RATIONAL2 3 3 6
x
RATIONAL2 3 3 7
x
RATIONAL2 3 3 8
x
RATIONAL2 3 3 9
x
RATIONAL2 3 4 4
x
RATIONAL2 3 4 5
x
RATIONAL2 3 4 6
x
RATIONAL2 3 4 7
x
RATIONAL2 3 4 8
x
RATIONAL2 3 4 9
x
RATIONAL2 3 5 5
x
RATIONAL2 3 5 6
x
RATIONAL2 3 5 7
x
RATIONAL2 3 5 8
x
RATIONAL2 3 5 9
x
RATIONAL2 3 6 7
x
RATIONAL2 3 6 9
x
RATIONAL2 3 7 7
x
RATIONAL2 3 7 8
x
RATIONAL2 3 7 9
x
RATIONAL2 3 8 8
x
RATIONAL2 3 8 9
x
RATIONAL2 3 9 9
x
RATIONAL2 4 4 4
x
RATIONAL2 4 4 5
x
RATIONAL2 4 4 6
x
RATIONAL2 4 4 7
x
RATIONAL2 4 4 8
x
RATIONAL2 4 4 9
x
RATIONAL2 4 5 5
x
RATIONAL2 4 5 6
x
RATIONAL2 4 5 7
x
RATIONAL2 4 5 8
x
RATIONAL2 4 5 9
x
RATIONAL2 4 6 6
x
RATIONAL2 4 6 7
x
RATIONAL2 4 6 8
x
RATIONAL2 4 6 9
x
RATIONAL2 4 7 7
x
RATIONAL2 4 7 8
x
RATIONAL2 4 7 9
x
RATIONAL2 4 8 8
x
RATIONAL2 4 8 9
x
RATIONAL2 4 9 9
x
RATIONAL2 5 5 6
x
RATIONAL2 5 5 7
x
RATIONAL2 5 5 8
x
RATIONAL2 5 5 9
x
RATIONAL2 5 6 6
x
RATIONAL2 5 6 7
x
RATIONAL2 5 6 8
x
RATIONAL2 5 6 9
x
RATIONAL2 5 7 7
x
RATIONAL2 5 7 8
x
RATIONAL2 5 7 9
x
RATIONAL2 5 8 9
x
RATIONAL2 5 9 9
x
RATIONAL2 6 6 9
x
RATIONAL2 6 7 8
x
RATIONAL2 6 7 9
x
RATIONAL2 6 8 8
x
RATIONAL2 6 8 9
x
RATIONAL2 6 9 9
x
RATIONAL2 7 7 9
x
RATIONAL2 7 8 8
x
RATIONAL2 7 8 9
x
RATIONAL2 7 9 9
x
RATIONAL2 8 8 9
x
RATIONAL2 8 9 9
x
RATIONAL2 9 9 9
x
RATIONAL3 3 3 4
x
RATIONAL3 3 3 8
x
RATIONAL3 3 3 9
x
RATIONAL3 3 4 4
x
RATIONAL3 3 4 5
x
RATIONAL3 3 4 6
x
RATIONAL3 3 4 7
x
RATIONAL3 3 4 8
x
RATIONAL3 3 4 9
x
RATIONAL3 3 5 5
x
RATIONAL3 3 5 6
x
RATIONAL3 3 5 7
x
RATIONAL3 3 5 8
x
RATIONAL3 3 5 9
x
RATIONAL3 3 6 8
x
RATIONAL3 3 6 9
x
RATIONAL3 3 7 8
x
RATIONAL3 3 7 9
x
RATIONAL3 3 8 8
x
RATIONAL3 3 8 9
x
RATIONAL3 3 9 9
x
RATIONAL3 4 4 4
x
RATIONAL3 4 4 5
x
RATIONAL3 4 4 6
x
RATIONAL3 4 4 7
x
RATIONAL3 4 4 8
x
RATIONAL3 4 4 9
x
RATIONAL3 4 5 5
x
RATIONAL3 4 5 6
x
RATIONAL3 4 5 7
x
RATIONAL3 4 5 8
x
RATIONAL3 4 5 9
x
RATIONAL3 4 6 6
x
RATIONAL3 4 6 7
x
RATIONAL3 4 6 8
x
RATIONAL3 4 6 9
x
RATIONAL3 4 7 7
x
RATIONAL3 4 7 8
x
RATIONAL3 4 7 9
x
RATIONAL3 4 8 8
x
RATIONAL3 4 8 9
x
RATIONAL3 4 9 9
x
RATIONAL3 5 5 6
x
RATIONAL3 5 5 7
x
RATIONAL3 5 5 8
x
RATIONAL3 5 5 9
x
RATIONAL3 5 6 6
x
RATIONAL3 5 6 7
x
RATIONAL3 5 6 8
x
RATIONAL3 5 6 9
x
RATIONAL3 5 7 7
x
RATIONAL3 5 7 8
x
RATIONAL3 5 7 9
x
RATIONAL3 5 8 8
x
RATIONAL3 5 8 9
x
RATIONAL3 5 9 9
x
RATIONAL3 6 6 8
x
RATIONAL3 6 6 9
x
RATIONAL3 6 7 8
x
RATIONAL3 6 7 9
x
RATIONAL3 6 8 8
x
RATIONAL3 6 8 9
x
RATIONAL3 6 9 9
x
RATIONAL3 7 7 8
x
RATIONAL3 7 7 9
x
RATIONAL3 7 8 8
x
RATIONAL3 7 8 9
x
RATIONAL3 7 9 9
x
RATIONAL3 8 8 9
x
RATIONAL3 8 9 9
x
RATIONAL3 9 9 9
x
RATIONAL4 4 4 4
x
RATIONAL4 4 4 5
x
RATIONAL4 4 4 6
x
RATIONAL4 4 4 7
x
RATIONAL4 4 4 8
x
RATIONAL4 4 4 9
x
RATIONAL4 4 5 5
x
RATIONAL4 4 5 6
x
RATIONAL4 4 5 7
x
RATIONAL4 4 5 8
x
RATIONAL4 4 5 9
x
RATIONAL4 4 6 6
x
RATIONAL4 4 6 7
x
RATIONAL4 4 6 8
x
RATIONAL4 4 6 9
x
RATIONAL4 4 7 7
x
RATIONAL4 4 7 8
x
RATIONAL4 4 7 9
x
RATIONAL4 4 8 8
x
RATIONAL4 4 8 9
x
RATIONAL4 4 9 9
x
RATIONAL4 5 5 5
x
RATIONAL4 5 5 6
x
RATIONAL4 5 5 7
x
RATIONAL4 5 5 8
x
RATIONAL4 5 5 9
x
RATIONAL4 5 6 6
x
RATIONAL4 5 6 7
x
RATIONAL4 5 6 8
x
RATIONAL4 5 6 9
x
RATIONAL4 5 7 7
x
RATIONAL4 5 7 8
x
RATIONAL4 5 7 9
x
RATIONAL4 5 8 8
x
RATIONAL4 5 8 9
x
RATIONAL4 5 9 9
x
RATIONAL4 6 6 7
x
RATIONAL4 6 6 8
x
RATIONAL4 6 6 9
x
RATIONAL4 6 7 8
x
RATIONAL4 6 7 9
x
RATIONAL4 6 8 9
x
RATIONAL4 6 9 9
x
RATIONAL4 7 7 8
x
RATIONAL4 7 7 9
x
RATIONAL4 7 8 8
x
RATIONAL4 7 8 9
x
RATIONAL4 7 9 9
x
RATIONAL4 8 8 8
x
RATIONAL4 8 8 9
x
RATIONAL4 8 9 9
x
RATIONAL4 9 9 9
x
RATIONAL5 5 5 6
x
RATIONAL5 5 5 7
x
RATIONAL5 5 5 8
x
RATIONAL5 5 5 9
x
RATIONAL5 5 6 7
x
RATIONAL5 5 6 8
x
RATIONAL5 5 6 9
x
RATIONAL5 5 7 8
x
RATIONAL5 5 7 9
x
RATIONAL5 5 8 8
x
RATIONAL5 5 8 9
x
RATIONAL5 5 9 9
x
RATIONAL5 6 6 7
x
RATIONAL5 6 6 8
x
RATIONAL5 6 6 9
x
RATIONAL5 6 7 7
x
RATIONAL5 6 7 8
x
RATIONAL5 6 7 9
x
RATIONAL5 6 8 8
x
RATIONAL5 6 8 9
x
RATIONAL5 6 9 9
x
RATIONAL5 7 7 8
x
RATIONAL5 7 7 9
x
RATIONAL5 7 8 8
x
RATIONAL5 7 8 9
x
RATIONAL5 7 9 9
x
RATIONAL5 8 8 8
x
RATIONAL5 8 8 9
x
RATIONAL5 8 9 9
x
RATIONAL5 9 9 9
x
RATIONAL6 6 7 8
x
RATIONAL6 6 7 9
x
RATIONAL6 6 8 9
x
RATIONAL6 7 7 8
x
RATIONAL6 7 7 9
x
RATIONAL6 7 8 8
x
RATIONAL6 7 8 9
x
RATIONAL6 7 9 9
x
RATIONAL6 8 8 9
x
RATIONAL6 8 9 9
x
RATIONAL6 9 9 9
x
RATIONAL7 7 8 9
x
RATIONAL7 7 9 9
x
RATIONAL7 8 8 9
x
RATIONAL7 8 9 9
x
RATIONAL7 9 9 9
x
RATIONAL8 8 9 9
x
RATIONAL8 9 9 9
x
RATIONAL9 9 9 9
x
RATIONALHugh
Quote from dandl on February 15, 2020, 10:34 amSome time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Re strategy 1, I don't really understand "bundle up attributes that might go missing" but in any case I can't see how you think this solves the problem with the likes of AVG.
Re strategy 2, what do you mean by "right answer"?
It seems to me that all cases need the use of r1 UNION r2 in which one of the operands (r2, say) is an ad hoc arrangement generating tuples of the same type as those of r1. Here's my attempt to grapple with the AVG problem:
(EXTEND (SUMMARIZE r1 PER(r2 MATCHING r1) : {x := AVG(y)}) : {rx := rel{tup{x x}}}){ALL BUT x}
UNION
EXTEND (r2 NOT MATCHING r1) : {rx := REL{x RAT}{}}
Is that the kind of thing you meant? I tried it out in Rel using my "Beyond the Four Fours" database, with Recalcitrant and Cases for r1 and r2, respectively. This is what I got:
i INTEGER |
j INTEGER |
k INTEGER |
l INTEGER |
rx RELATION
|
||
---|---|---|---|---|---|---|
1 | 1 | 1 | 1 |
|
||
1 | 1 | 1 | 2 |
|
||
1 | 1 | 1 | 3 |
|
||
1 | 1 | 1 | 5 |
|
||
1 | 1 | 1 | 6 |
|
||
1 | 1 | 1 | 7 |
|
||
1 | 1 | 1 | 8 |
|
||
1 | 1 | 1 | 9 |
|
||
1 | 1 | 2 | 2 |
|
||
1 | 1 | 3 | 3 |
|
||
1 | 1 | 6 | 7 |
|
||
1 | 1 | 7 | 7 |
|
||
1 | 1 | 8 | 8 |
|
||
1 | 2 | 2 | 2 |
|
||
1 | 2 | 2 | 7 |
|
||
1 | 2 | 2 | 8 |
|
||
1 | 2 | 8 | 8 |
|
||
1 | 3 | 3 | 3 |
|
||
1 | 3 | 3 | 6 |
|
||
1 | 3 | 3 | 7 |
|
||
1 | 3 | 6 | 7 |
|
||
1 | 6 | 6 | 6 |
|
||
1 | 6 | 6 | 7 |
|
||
1 | 7 | 7 | 7 |
|
||
1 | 7 | 7 | 8 |
|
||
1 | 8 | 8 | 8 |
|
||
2 | 2 | 2 | 2 |
|
||
2 | 2 | 2 | 3 |
|
||
2 | 2 | 2 | 4 |
|
||
2 | 2 | 2 | 5 |
|
||
2 | 2 | 2 | 6 |
|
||
2 | 2 | 2 | 7 |
|
||
2 | 2 | 2 | 8 |
|
||
2 | 2 | 3 | 6 |
|
||
2 | 2 | 3 | 7 |
|
||
2 | 2 | 5 | 5 |
|
||
2 | 2 | 6 | 6 |
|
||
2 | 2 | 7 | 7 |
|
||
2 | 2 | 8 | 8 |
|
||
2 | 3 | 6 | 6 |
|
||
2 | 3 | 6 | 8 |
|
||
2 | 5 | 5 | 5 |
|
||
2 | 5 | 8 | 8 |
|
||
2 | 6 | 6 | 6 |
|
||
2 | 6 | 6 | 7 |
|
||
2 | 6 | 6 | 8 |
|
||
2 | 6 | 7 | 7 |
|
||
2 | 7 | 7 | 7 |
|
||
2 | 7 | 7 | 8 |
|
||
2 | 8 | 8 | 8 |
|
||
3 | 3 | 3 | 3 |
|
||
3 | 3 | 3 | 5 |
|
||
3 | 3 | 3 | 6 |
|
||
3 | 3 | 3 | 7 |
|
||
3 | 3 | 6 | 6 |
|
||
3 | 3 | 6 | 7 |
|
||
3 | 3 | 7 | 7 |
|
||
3 | 5 | 5 | 5 |
|
||
3 | 6 | 6 | 6 |
|
||
3 | 6 | 6 | 7 |
|
||
3 | 6 | 7 | 7 |
|
||
3 | 7 | 7 | 7 |
|
||
3 | 8 | 8 | 8 |
|
||
4 | 6 | 6 | 6 |
|
||
4 | 6 | 7 | 7 |
|
||
4 | 6 | 8 | 8 |
|
||
4 | 7 | 7 | 7 |
|
||
5 | 5 | 5 | 5 |
|
||
5 | 5 | 6 | 6 |
|
||
5 | 5 | 7 | 7 |
|
||
5 | 6 | 6 | 6 |
|
||
5 | 7 | 7 | 7 |
|
||
6 | 6 | 6 | 6 |
|
||
6 | 6 | 6 | 7 |
|
||
6 | 6 | 6 | 8 |
|
||
6 | 6 | 6 | 9 |
|
||
6 | 6 | 7 | 7 |
|
||
6 | 6 | 8 | 8 |
|
||
6 | 6 | 9 | 9 |
|
||
6 | 7 | 7 | 7 |
|
||
6 | 8 | 8 | 8 |
|
||
7 | 7 | 7 | 7 |
|
||
7 | 7 | 7 | 8 |
|
||
7 | 7 | 7 | 9 |
|
||
7 | 7 | 8 | 8 |
|
||
7 | 8 | 8 | 8 |
|
||
8 | 8 | 8 | 8 |
|
||
8 | 8 | 8 | 9 |
|
||
1 | 1 | 1 | 4 |
|
||
1 | 1 | 2 | 3 |
|
||
1 | 1 | 2 | 4 |
|
||
1 | 1 | 2 | 5 |
|
||
1 | 1 | 2 | 6 |
|
||
1 | 1 | 2 | 7 |
|
||
1 | 1 | 2 | 8 |
|
||
1 | 1 | 2 | 9 |
|
||
1 | 1 | 3 | 4 |
|
||
1 | 1 | 3 | 5 |
|
||
1 | 1 | 3 | 6 |
|
||
1 | 1 | 3 | 7 |
|
||
1 | 1 | 3 | 8 |
|
||
1 | 1 | 3 | 9 |
|
||
1 | 1 | 4 | 4 |
|
||
1 | 1 | 4 | 5 |
|
||
1 | 1 | 4 | 6 |
|
||
1 | 1 | 4 | 7 |
|
||
1 | 1 | 4 | 8 |
|
||
1 | 1 | 4 | 9 |
|
||
1 | 1 | 5 | 5 |
|
||
1 | 1 | 5 | 6 |
|
||
1 | 1 | 5 | 7 |
|
||
1 | 1 | 5 | 8 |
|
||
1 | 1 | 5 | 9 |
|
||
1 | 1 | 6 | 6 |
|
||
1 | 1 | 6 | 8 |
|
||
1 | 1 | 6 | 9 |
|
||
1 | 1 | 7 | 8 |
|
||
1 | 1 | 7 | 9 |
|
||
1 | 1 | 8 | 9 |
|
||
1 | 1 | 9 | 9 |
|
||
1 | 2 | 2 | 3 |
|
||
1 | 2 | 2 | 4 |
|
||
1 | 2 | 2 | 5 |
|
||
1 | 2 | 2 | 6 |
|
||
1 | 2 | 2 | 9 |
|
||
1 | 2 | 3 | 3 |
|
||
1 | 2 | 3 | 4 |
|
||
1 | 2 | 3 | 5 |
|
||
1 | 2 | 3 | 6 |
|
||
1 | 2 | 3 | 7 |
|
||
1 | 2 | 3 | 8 |
|
||
1 | 2 | 3 | 9 |
|
||
1 | 2 | 4 | 4 |
|
||
1 | 2 | 4 | 5 |
|
||
1 | 2 | 4 | 6 |
|
||
1 | 2 | 4 | 7 |
|
||
1 | 2 | 4 | 8 |
|
||
1 | 2 | 4 | 9 |
|
||
1 | 2 | 5 | 5 |
|
||
1 | 2 | 5 | 6 |
|
||
1 | 2 | 5 | 7 |
|
||
1 | 2 | 5 | 8 |
|
||
1 | 2 | 5 | 9 |
|
||
1 | 2 | 6 | 6 |
|
||
1 | 2 | 6 | 7 |
|
||
1 | 2 | 6 | 8 |
|
||
1 | 2 | 6 | 9 |
|
||
1 | 2 | 7 | 7 |
|
||
1 | 2 | 7 | 8 |
|
||
1 | 2 | 7 | 9 |
|
||
1 | 2 | 8 | 9 |
|
||
1 | 2 | 9 | 9 |
|
||
1 | 3 | 3 | 4 |
|
||
1 | 3 | 3 | 5 |
|
||
1 | 3 | 3 | 8 |
|
||
1 | 3 | 3 | 9 |
|
||
1 | 3 | 4 | 4 |
|
||
1 | 3 | 4 | 5 |
|
||
1 | 3 | 4 | 6 |
|
||
1 | 3 | 4 | 7 |
|
||
1 | 3 | 4 | 8 |
|
||
1 | 3 | 4 | 9 |
|
||
1 | 3 | 5 | 5 |
|
||
1 | 3 | 5 | 6 |
|
||
1 | 3 | 5 | 7 |
|
||
1 | 3 | 5 | 8 |
|
||
1 | 3 | 5 | 9 |
|
||
1 | 3 | 6 | 6 |
|
||
1 | 3 | 6 | 8 |
|
||
1 | 3 | 6 | 9 |
|
||
1 | 3 | 7 | 7 |
|
||
1 | 3 | 7 | 8 |
|
||
1 | 3 | 7 | 9 |
|
||
1 | 3 | 8 | 8 |
|
||
1 | 3 | 8 | 9 |
|
||
1 | 3 | 9 | 9 |
|
||
1 | 4 | 4 | 4 |
|
||
1 | 4 | 4 | 5 |
|
||
1 | 4 | 4 | 6 |
|
||
1 | 4 | 4 | 7 |
|
||
1 | 4 | 4 | 8 |
|
||
1 | 4 | 4 | 9 |
|
||
1 | 4 | 5 | 5 |
|
||
1 | 4 | 5 | 6 |
|
||
1 | 4 | 5 | 7 |
|
||
1 | 4 | 5 | 8 |
|
||
1 | 4 | 5 | 9 |
|
||
1 | 4 | 6 | 6 |
|
||
1 | 4 | 6 | 7 |
|
||
1 | 4 | 6 | 8 |
|
||
1 | 4 | 6 | 9 |
|
||
1 | 4 | 7 | 7 |
|
||
1 | 4 | 7 | 8 |
|
||
1 | 4 | 7 | 9 |
|
||
1 | 4 | 8 | 8 |
|
||
1 | 4 | 8 | 9 |
|
||
1 | 4 | 9 | 9 |
|
||
1 | 5 | 5 | 5 |
|
||
1 | 5 | 5 | 6 |
|
||
1 | 5 | 5 | 7 |
|
||
1 | 5 | 5 | 8 |
|
||
1 | 5 | 5 | 9 |
|
||
1 | 5 | 6 | 6 |
|
||
1 | 5 | 6 | 7 |
|
||
1 | 5 | 6 | 8 |
|
||
1 | 5 | 6 | 9 |
|
||
1 | 5 | 7 | 7 |
|
||
1 | 5 | 7 | 8 |
|
||
1 | 5 | 7 | 9 |
|
||
1 | 5 | 8 | 8 |
|
||
1 | 5 | 8 | 9 |
|
||
1 | 5 | 9 | 9 |
|
||
1 | 6 | 6 | 8 |
|
||
1 | 6 | 6 | 9 |
|
||
1 | 6 | 7 | 7 |
|
||
1 | 6 | 7 | 8 |
|
||
1 | 6 | 7 | 9 |
|
||
1 | 6 | 8 | 8 |
|
||
1 | 6 | 8 | 9 |
|
||
1 | 6 | 9 | 9 |
|
||
1 | 7 | 7 | 9 |
|
||
1 | 7 | 8 | 8 |
|
||
1 | 7 | 8 | 9 |
|
||
1 | 7 | 9 | 9 |
|
||
1 | 8 | 8 | 9 |
|
||
1 | 8 | 9 | 9 |
|
||
1 | 9 | 9 | 9 |
|
||
2 | 2 | 2 | 9 |
|
||
2 | 2 | 3 | 3 |
|
||
2 | 2 | 3 | 4 |
|
||
2 | 2 | 3 | 5 |
|
||
2 | 2 | 3 | 8 |
|
||
2 | 2 | 3 | 9 |
|
||
2 | 2 | 4 | 4 |
|
||
2 | 2 | 4 | 5 |
|
||
2 | 2 | 4 | 6 |
|
||
2 | 2 | 4 | 7 |
|
||
2 | 2 | 4 | 8 |
|
||
2 | 2 | 4 | 9 |
|
||
2 | 2 | 5 | 6 |
|
||
2 | 2 | 5 | 7 |
|
||
2 | 2 | 5 | 8 |
|
||
2 | 2 | 5 | 9 |
|
||
2 | 2 | 6 | 7 |
|
||
2 | 2 | 6 | 8 |
|
||
2 | 2 | 6 | 9 |
|
||
2 | 2 | 7 | 8 |
|
||
2 | 2 | 7 | 9 |
|
||
2 | 2 | 8 | 9 |
|
||
2 | 2 | 9 | 9 |
|
||
2 | 3 | 3 | 3 |
|
||
2 | 3 | 3 | 4 |
|
||
2 | 3 | 3 | 5 |
|
||
2 | 3 | 3 | 6 |
|
||
2 | 3 | 3 | 7 |
|
||
2 | 3 | 3 | 8 |
|
||
2 | 3 | 3 | 9 |
|
||
2 | 3 | 4 | 4 |
|
||
2 | 3 | 4 | 5 |
|
||
2 | 3 | 4 | 6 |
|
||
2 | 3 | 4 | 7 |
|
||
2 | 3 | 4 | 8 |
|
||
2 | 3 | 4 | 9 |
|
||
2 | 3 | 5 | 5 |
|
||
2 | 3 | 5 | 6 |
|
||
2 | 3 | 5 | 7 |
|
||
2 | 3 | 5 | 8 |
|
||
2 | 3 | 5 | 9 |
|
||
2 | 3 | 6 | 7 |
|
||
2 | 3 | 6 | 9 |
|
||
2 | 3 | 7 | 7 |
|
||
2 | 3 | 7 | 8 |
|
||
2 | 3 | 7 | 9 |
|
||
2 | 3 | 8 | 8 |
|
||
2 | 3 | 8 | 9 |
|
||
2 | 3 | 9 | 9 |
|
||
2 | 4 | 4 | 4 |
|
||
2 | 4 | 4 | 5 |
|
||
2 | 4 | 4 | 6 |
|
||
2 | 4 | 4 | 7 |
|
||
2 | 4 | 4 | 8 |
|
||
2 | 4 | 4 | 9 |
|
||
2 | 4 | 5 | 5 |
|
||
2 | 4 | 5 | 6 |
|
||
2 | 4 | 5 | 7 |
|
||
2 | 4 | 5 | 8 |
|
||
2 | 4 | 5 | 9 |
|
||
2 | 4 | 6 | 6 |
|
||
2 | 4 | 6 | 7 |
|
||
2 | 4 | 6 | 8 |
|
||
2 | 4 | 6 | 9 |
|
||
2 | 4 | 7 | 7 |
|
||
2 | 4 | 7 | 8 |
|
||
2 | 4 | 7 | 9 |
|
||
2 | 4 | 8 | 8 |
|
||
2 | 4 | 8 | 9 |
|
||
2 | 4 | 9 | 9 |
|
||
2 | 5 | 5 | 6 |
|
||
2 | 5 | 5 | 7 |
|
||
2 | 5 | 5 | 8 |
|
||
2 | 5 | 5 | 9 |
|
||
2 | 5 | 6 | 6 |
|
||
2 | 5 | 6 | 7 |
|
||
2 | 5 | 6 | 8 |
|
||
2 | 5 | 6 | 9 |
|
||
2 | 5 | 7 | 7 |
|
||
2 | 5 | 7 | 8 |
|
||
2 | 5 | 7 | 9 |
|
||
2 | 5 | 8 | 9 |
|
||
2 | 5 | 9 | 9 |
|
||
2 | 6 | 6 | 9 |
|
||
2 | 6 | 7 | 8 |
|
||
2 | 6 | 7 | 9 |
|
||
2 | 6 | 8 | 8 |
|
||
2 | 6 | 8 | 9 |
|
||
2 | 6 | 9 | 9 |
|
||
2 | 7 | 7 | 9 |
|
||
2 | 7 | 8 | 8 |
|
||
2 | 7 | 8 | 9 |
|
||
2 | 7 | 9 | 9 |
|
||
2 | 8 | 8 | 9 |
|
||
2 | 8 | 9 | 9 |
|
||
2 | 9 | 9 | 9 |
|
||
3 | 3 | 3 | 4 |
|
||
3 | 3 | 3 | 8 |
|
||
3 | 3 | 3 | 9 |
|
||
3 | 3 | 4 | 4 |
|
||
3 | 3 | 4 | 5 |
|
||
3 | 3 | 4 | 6 |
|
||
3 | 3 | 4 | 7 |
|
||
3 | 3 | 4 | 8 |
|
||
3 | 3 | 4 | 9 |
|
||
3 | 3 | 5 | 5 |
|
||
3 | 3 | 5 | 6 |
|
||
3 | 3 | 5 | 7 |
|
||
3 | 3 | 5 | 8 |
|
||
3 | 3 | 5 | 9 |
|
||
3 | 3 | 6 | 8 |
|
||
3 | 3 | 6 | 9 |
|
||
3 | 3 | 7 | 8 |
|
||
3 | 3 | 7 | 9 |
|
||
3 | 3 | 8 | 8 |
|
||
3 | 3 | 8 | 9 |
|
||
3 | 3 | 9 | 9 |
|
||
3 | 4 | 4 | 4 |
|
||
3 | 4 | 4 | 5 |
|
||
3 | 4 | 4 | 6 |
|
||
3 | 4 | 4 | 7 |
|
||
3 | 4 | 4 | 8 |
|
||
3 | 4 | 4 | 9 |
|
||
3 | 4 | 5 | 5 |
|
||
3 | 4 | 5 | 6 |
|
||
3 | 4 | 5 | 7 |
|
||
3 | 4 | 5 | 8 |
|
||
3 | 4 | 5 | 9 |
|
||
3 | 4 | 6 | 6 |
|
||
3 | 4 | 6 | 7 |
|
||
3 | 4 | 6 | 8 |
|
||
3 | 4 | 6 | 9 |
|
||
3 | 4 | 7 | 7 |
|
||
3 | 4 | 7 | 8 |
|
||
3 | 4 | 7 | 9 |
|
||
3 | 4 | 8 | 8 |
|
||
3 | 4 | 8 | 9 |
|
||
3 | 4 | 9 | 9 |
|
||
3 | 5 | 5 | 6 |
|
||
3 | 5 | 5 | 7 |
|
||
3 | 5 | 5 | 8 |
|
||
3 | 5 | 5 | 9 |
|
||
3 | 5 | 6 | 6 |
|
||
3 | 5 | 6 | 7 |
|
||
3 | 5 | 6 | 8 |
|
||
3 | 5 | 6 | 9 |
|
||
3 | 5 | 7 | 7 |
|
||
3 | 5 | 7 | 8 |
|
||
3 | 5 | 7 | 9 |
|
||
3 | 5 | 8 | 8 |
|
||
3 | 5 | 8 | 9 |
|
||
3 | 5 | 9 | 9 |
|
||
3 | 6 | 6 | 8 |
|
||
3 | 6 | 6 | 9 |
|
||
3 | 6 | 7 | 8 |
|
||
3 | 6 | 7 | 9 |
|
||
3 | 6 | 8 | 8 |
|
||
3 | 6 | 8 | 9 |
|
||
3 | 6 | 9 | 9 |
|
||
3 | 7 | 7 | 8 |
|
||
3 | 7 | 7 | 9 |
|
||
3 | 7 | 8 | 8 |
|
||
3 | 7 | 8 | 9 |
|
||
3 | 7 | 9 | 9 |
|
||
3 | 8 | 8 | 9 |
|
||
3 | 8 | 9 | 9 |
|
||
3 | 9 | 9 | 9 |
|
||
4 | 4 | 4 | 4 |
|
||
4 | 4 | 4 | 5 |
|
||
4 | 4 | 4 | 6 |
|
||
4 | 4 | 4 | 7 |
|
||
4 | 4 | 4 | 8 |
|
||
4 | 4 | 4 | 9 |
|
||
4 | 4 | 5 | 5 |
|
||
4 | 4 | 5 | 6 |
|
||
4 | 4 | 5 | 7 |
|
||
4 | 4 | 5 | 8 |
|
||
4 | 4 | 5 | 9 |
|
||
4 | 4 | 6 | 6 |
|
||
4 | 4 | 6 | 7 |
|
||
4 | 4 | 6 | 8 |
|
||
4 | 4 | 6 | 9 |
|
||
4 | 4 | 7 | 7 |
|
||
4 | 4 | 7 | 8 |
|
||
4 | 4 | 7 | 9 |
|
||
4 | 4 | 8 | 8 |
|
||
4 | 4 | 8 | 9 |
|
||
4 | 4 | 9 | 9 |
|
||
4 | 5 | 5 | 5 |
|
||
4 | 5 | 5 | 6 |
|
||
4 | 5 | 5 | 7 |
|
||
4 | 5 | 5 | 8 |
|
||
4 | 5 | 5 | 9 |
|
||
4 | 5 | 6 | 6 |
|
||
4 | 5 | 6 | 7 |
|
||
4 | 5 | 6 | 8 |
|
||
4 | 5 | 6 | 9 |
|
||
4 | 5 | 7 | 7 |
|
||
4 | 5 | 7 | 8 |
|
||
4 | 5 | 7 | 9 |
|
||
4 | 5 | 8 | 8 |
|
||
4 | 5 | 8 | 9 |
|
||
4 | 5 | 9 | 9 |
|
||
4 | 6 | 6 | 7 |
|
||
4 | 6 | 6 | 8 |
|
||
4 | 6 | 6 | 9 |
|
||
4 | 6 | 7 | 8 |
|
||
4 | 6 | 7 | 9 |
|
||
4 | 6 | 8 | 9 |
|
||
4 | 6 | 9 | 9 |
|
||
4 | 7 | 7 | 8 |
|
||
4 | 7 | 7 | 9 |
|
||
4 | 7 | 8 | 8 |
|
||
4 | 7 | 8 | 9 |
|
||
4 | 7 | 9 | 9 |
|
||
4 | 8 | 8 | 8 |
|
||
4 | 8 | 8 | 9 |
|
||
4 | 8 | 9 | 9 |
|
||
4 | 9 | 9 | 9 |
|
||
5 | 5 | 5 | 6 |
|
||
5 | 5 | 5 | 7 |
|
||
5 | 5 | 5 | 8 |
|
||
5 | 5 | 5 | 9 |
|
||
5 | 5 | 6 | 7 |
|
||
5 | 5 | 6 | 8 |
|
||
5 | 5 | 6 | 9 |
|
||
5 | 5 | 7 | 8 |
|
||
5 | 5 | 7 | 9 |
|
||
5 | 5 | 8 | 8 |
|
||
5 | 5 | 8 | 9 |
|
||
5 | 5 | 9 | 9 |
|
||
5 | 6 | 6 | 7 |
|
||
5 | 6 | 6 | 8 |
|
||
5 | 6 | 6 | 9 |
|
||
5 | 6 | 7 | 7 |
|
||
5 | 6 | 7 | 8 |
|
||
5 | 6 | 7 | 9 |
|
||
5 | 6 | 8 | 8 |
|
||
5 | 6 | 8 | 9 |
|
||
5 | 6 | 9 | 9 |
|
||
5 | 7 | 7 | 8 |
|
||
5 | 7 | 7 | 9 |
|
||
5 | 7 | 8 | 8 |
|
||
5 | 7 | 8 | 9 |
|
||
5 | 7 | 9 | 9 |
|
||
5 | 8 | 8 | 8 |
|
||
5 | 8 | 8 | 9 |
|
||
5 | 8 | 9 | 9 |
|
||
5 | 9 | 9 | 9 |
|
||
6 | 6 | 7 | 8 |
|
||
6 | 6 | 7 | 9 |
|
||
6 | 6 | 8 | 9 |
|
||
6 | 7 | 7 | 8 |
|
||
6 | 7 | 7 | 9 |
|
||
6 | 7 | 8 | 8 |
|
||
6 | 7 | 8 | 9 |
|
||
6 | 7 | 9 | 9 |
|
||
6 | 8 | 8 | 9 |
|
||
6 | 8 | 9 | 9 |
|
||
6 | 9 | 9 | 9 |
|
||
7 | 7 | 8 | 9 |
|
||
7 | 7 | 9 | 9 |
|
||
7 | 8 | 8 | 9 |
|
||
7 | 8 | 9 | 9 |
|
||
7 | 9 | 9 | 9 |
|
||
8 | 8 | 9 | 9 |
|
||
8 | 9 | 9 | 9 |
|
||
9 | 9 | 9 | 9 |
|
Hugh
Quote from dandl on February 15, 2020, 11:22 pmQuote from Hugh on February 15, 2020, 1:10 pmQuote from dandl on February 15, 2020, 10:34 amSome time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Re strategy 1, I don't really understand "bundle up attributes that might go missing" but in any case I can't see how you think this solves the problem with the likes of AVG.
In the Supplier database, if you join S to P via SP, supplier S5 will be 'joined out' because there is no S5 in SP. A naive aggregation of QTY will omit S5, when our desired result is zero. SUMMARIZE PER constructs an RVA behind the scenes associating S5 with an empty list of parts supplied. The aggregation comes out right provided the aggregating function delivers the desired result when passed an empty list. In the image relation approach in DTATRM, that use of an RVA is explicit.
And no, it doesn't solve the problem of AVG unless AVG is defined to return a value (such as zero or NaN) for an empty list.
Re strategy 2, what do you mean by "right answer"?
It seems to me that all cases need the use of r1 UNION r2 in which one of the operands (r2, say) is an ad hoc arrangement generating tuples of the same type as those of r1.
Agreed. That's exactly what I meant by Strategy 2. This can solve the AVG problem by inserting a preset value.
Here's my attempt to grapple with the AVG problem:
(EXTEND (SUMMARIZE r1 PER(r2 MATCHING r1) : {x := AVG(y)}) : {rx := rel{tup{x x}}}){ALL BUT x}
UNION
EXTEND (r2 NOT MATCHING r1) : {rx := REL{x RAT}{}}Is that the kind of thing you meant?
I would expect SUMMARIZE PER to deliver the desired result on its own. In this construct I would expect to see SUMMARIZE BY with NOT MATCHING, EXTEND, UNION.
Quote from Hugh on February 15, 2020, 1:10 pmQuote from dandl on February 15, 2020, 10:34 amSome time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Re strategy 1, I don't really understand "bundle up attributes that might go missing" but in any case I can't see how you think this solves the problem with the likes of AVG.
In the Supplier database, if you join S to P via SP, supplier S5 will be 'joined out' because there is no S5 in SP. A naive aggregation of QTY will omit S5, when our desired result is zero. SUMMARIZE PER constructs an RVA behind the scenes associating S5 with an empty list of parts supplied. The aggregation comes out right provided the aggregating function delivers the desired result when passed an empty list. In the image relation approach in DTATRM, that use of an RVA is explicit.
And no, it doesn't solve the problem of AVG unless AVG is defined to return a value (such as zero or NaN) for an empty list.
Re strategy 2, what do you mean by "right answer"?
It seems to me that all cases need the use of r1 UNION r2 in which one of the operands (r2, say) is an ad hoc arrangement generating tuples of the same type as those of r1.
Agreed. That's exactly what I meant by Strategy 2. This can solve the AVG problem by inserting a preset value.
Here's my attempt to grapple with the AVG problem:
(EXTEND (SUMMARIZE r1 PER(r2 MATCHING r1) : {x := AVG(y)}) : {rx := rel{tup{x x}}}){ALL BUT x}
UNION
EXTEND (r2 NOT MATCHING r1) : {rx := REL{x RAT}{}}Is that the kind of thing you meant?
I would expect SUMMARIZE PER to deliver the desired result on its own. In this construct I would expect to see SUMMARIZE BY with NOT MATCHING, EXTEND, UNION.
Quote from Dave Voorhis on February 15, 2020, 11:47 pmQuote from dandl on February 15, 2020, 11:22 pmQuote from Hugh on February 15, 2020, 1:10 pmQuote from dandl on February 15, 2020, 10:34 amSome time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Re strategy 1, I don't really understand "bundle up attributes that might go missing" but in any case I can't see how you think this solves the problem with the likes of AVG.
In the Supplier database, if you join S to P via SP, supplier S5 will be 'joined out' because there is no S5 in SP. A naive aggregation of QTY will omit S5, when our desired result is zero. SUMMARIZE PER constructs an RVA behind the scenes associating S5 with an empty list of parts supplied. The aggregation comes out right provided the aggregating function delivers the desired result when passed an empty list. In the image relation approach in DTATRM, that use of an RVA is explicit.
And no, it doesn't solve the problem of AVG unless AVG is defined to return a value (such as zero or NaN) for an empty list.
Zero? Zero?
NaN, sure. Undefined, where the type is a union of RATIONAL and Undefined, sure.
But zero?!?
Abomination.
Quote from dandl on February 15, 2020, 11:22 pmQuote from Hugh on February 15, 2020, 1:10 pmQuote from dandl on February 15, 2020, 10:34 amSome time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Re strategy 1, I don't really understand "bundle up attributes that might go missing" but in any case I can't see how you think this solves the problem with the likes of AVG.
In the Supplier database, if you join S to P via SP, supplier S5 will be 'joined out' because there is no S5 in SP. A naive aggregation of QTY will omit S5, when our desired result is zero. SUMMARIZE PER constructs an RVA behind the scenes associating S5 with an empty list of parts supplied. The aggregation comes out right provided the aggregating function delivers the desired result when passed an empty list. In the image relation approach in DTATRM, that use of an RVA is explicit.
And no, it doesn't solve the problem of AVG unless AVG is defined to return a value (such as zero or NaN) for an empty list.
Zero? Zero?
NaN, sure. Undefined, where the type is a union of RATIONAL and Undefined, sure.
But zero?!?
Abomination.
Quote from dandl on February 16, 2020, 12:44 amQuote from Dave Voorhis on February 15, 2020, 11:47 pmQuote from dandl on February 15, 2020, 11:22 pmQuote from Hugh on February 15, 2020, 1:10 pmQuote from dandl on February 15, 2020, 10:34 amSome time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Re strategy 1, I don't really understand "bundle up attributes that might go missing" but in any case I can't see how you think this solves the problem with the likes of AVG.
In the Supplier database, if you join S to P via SP, supplier S5 will be 'joined out' because there is no S5 in SP. A naive aggregation of QTY will omit S5, when our desired result is zero. SUMMARIZE PER constructs an RVA behind the scenes associating S5 with an empty list of parts supplied. The aggregation comes out right provided the aggregating function delivers the desired result when passed an empty list. In the image relation approach in DTATRM, that use of an RVA is explicit.
And no, it doesn't solve the problem of AVG unless AVG is defined to return a value (such as zero or NaN) for an empty list.
Zero? Zero?
NaN, sure. Undefined, where the type is a union of RATIONAL and Undefined, sure.
But zero?!?
Abomination.
Not my problem, but at least the choice is up front.
But I still don't have an answer: apart from temp RVA and back fill UNION, is there another strategy for doing what SQL might handle with an outer join? I thought we discussed one, but I can't find anything.
Maybe it wasn't aggregation? DIVIDE? Should supplier S5 be treated differently in a query about "suppliers who supply every purple part"? The strategy in the PER form is just to take a different path (via S rather than SP).
I have a note to myself about 'outer join for substitution' but I don't know what it refers to. Any idea?
Quote from Dave Voorhis on February 15, 2020, 11:47 pmQuote from dandl on February 15, 2020, 11:22 pmQuote from Hugh on February 15, 2020, 1:10 pmQuote from dandl on February 15, 2020, 10:34 amSome time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Re strategy 1, I don't really understand "bundle up attributes that might go missing" but in any case I can't see how you think this solves the problem with the likes of AVG.
In the Supplier database, if you join S to P via SP, supplier S5 will be 'joined out' because there is no S5 in SP. A naive aggregation of QTY will omit S5, when our desired result is zero. SUMMARIZE PER constructs an RVA behind the scenes associating S5 with an empty list of parts supplied. The aggregation comes out right provided the aggregating function delivers the desired result when passed an empty list. In the image relation approach in DTATRM, that use of an RVA is explicit.
And no, it doesn't solve the problem of AVG unless AVG is defined to return a value (such as zero or NaN) for an empty list.
Zero? Zero?
NaN, sure. Undefined, where the type is a union of RATIONAL and Undefined, sure.
But zero?!?
Abomination.
Not my problem, but at least the choice is up front.
But I still don't have an answer: apart from temp RVA and back fill UNION, is there another strategy for doing what SQL might handle with an outer join? I thought we discussed one, but I can't find anything.
Maybe it wasn't aggregation? DIVIDE? Should supplier S5 be treated differently in a query about "suppliers who supply every purple part"? The strategy in the PER form is just to take a different path (via S rather than SP).
I have a note to myself about 'outer join for substitution' but I don't know what it refers to. Any idea?
Quote from Dave Voorhis on February 16, 2020, 9:38 amQuote from dandl on February 16, 2020, 12:44 amQuote from Dave Voorhis on February 15, 2020, 11:47 pmQuote from dandl on February 15, 2020, 11:22 pmQuote from Hugh on February 15, 2020, 1:10 pmQuote from dandl on February 15, 2020, 10:34 amSome time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Re strategy 1, I don't really understand "bundle up attributes that might go missing" but in any case I can't see how you think this solves the problem with the likes of AVG.
In the Supplier database, if you join S to P via SP, supplier S5 will be 'joined out' because there is no S5 in SP. A naive aggregation of QTY will omit S5, when our desired result is zero. SUMMARIZE PER constructs an RVA behind the scenes associating S5 with an empty list of parts supplied. The aggregation comes out right provided the aggregating function delivers the desired result when passed an empty list. In the image relation approach in DTATRM, that use of an RVA is explicit.
And no, it doesn't solve the problem of AVG unless AVG is defined to return a value (such as zero or NaN) for an empty list.
Zero? Zero?
NaN, sure. Undefined, where the type is a union of RATIONAL and Undefined, sure.
But zero?!?
Abomination.
Not my problem, but at least the choice is up front.
But I still don't have an answer: apart from temp RVA and back fill UNION, is there another strategy for doing what SQL might handle with an outer join? I thought we discussed one, but I can't find anything.
Maybe it wasn't aggregation? DIVIDE? Should supplier S5 be treated differently in a query about "suppliers who supply every purple part"? The strategy in the PER form is just to take a different path (via S rather than SP).
I have a note to myself about 'outer join for substitution' but I don't know what it refers to. Any idea?
When this came up before, I think it's where we deviated into a discussion about how outer join could be implemented in Tutorial D.
I've always used a union of {the tuples with x → result, the tuples without x → indicator } where result & indicator are either an RVA, or a union type of result | indicator.
Quote from dandl on February 16, 2020, 12:44 amQuote from Dave Voorhis on February 15, 2020, 11:47 pmQuote from dandl on February 15, 2020, 11:22 pmQuote from Hugh on February 15, 2020, 1:10 pmQuote from dandl on February 15, 2020, 10:34 amSome time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Re strategy 1, I don't really understand "bundle up attributes that might go missing" but in any case I can't see how you think this solves the problem with the likes of AVG.
In the Supplier database, if you join S to P via SP, supplier S5 will be 'joined out' because there is no S5 in SP. A naive aggregation of QTY will omit S5, when our desired result is zero. SUMMARIZE PER constructs an RVA behind the scenes associating S5 with an empty list of parts supplied. The aggregation comes out right provided the aggregating function delivers the desired result when passed an empty list. In the image relation approach in DTATRM, that use of an RVA is explicit.
And no, it doesn't solve the problem of AVG unless AVG is defined to return a value (such as zero or NaN) for an empty list.
Zero? Zero?
NaN, sure. Undefined, where the type is a union of RATIONAL and Undefined, sure.
But zero?!?
Abomination.
Not my problem, but at least the choice is up front.
But I still don't have an answer: apart from temp RVA and back fill UNION, is there another strategy for doing what SQL might handle with an outer join? I thought we discussed one, but I can't find anything.
Maybe it wasn't aggregation? DIVIDE? Should supplier S5 be treated differently in a query about "suppliers who supply every purple part"? The strategy in the PER form is just to take a different path (via S rather than SP).
I have a note to myself about 'outer join for substitution' but I don't know what it refers to. Any idea?
When this came up before, I think it's where we deviated into a discussion about how outer join could be implemented in Tutorial D.
I've always used a union of {the tuples with x → result, the tuples without x → indicator } where result & indicator are either an RVA, or a union type of result | indicator.
Quote from Hugh on February 16, 2020, 12:11 pmQuote from dandl on February 15, 2020, 11:22 pmQuote from Hugh on February 15, 2020, 1:10 pmQuote from dandl on February 15, 2020, 10:34 amSome time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Re strategy 1, I don't really understand "bundle up attributes that might go missing" but in any case I can't see how you think this solves the problem with the likes of AVG.
In the Supplier database, if you join S to P via SP, supplier S5 will be 'joined out' because there is no S5 in SP. A naive aggregation of QTY will omit S5, when our desired result is zero. SUMMARIZE PER constructs an RVA behind the scenes associating S5 with an empty list of parts supplied. The aggregation comes out right provided the aggregating function delivers the desired result when passed an empty list. In the image relation approach in DTATRM, that use of an RVA is explicit.
And no, it doesn't solve the problem of AVG unless AVG is defined to return a value (such as zero or NaN) for an empty list.
Re strategy 2, what do you mean by "right answer"?
It seems to me that all cases need the use of r1 UNION r2 in which one of the operands (r2, say) is an ad hoc arrangement generating tuples of the same type as those of r1.
Agreed. That's exactly what I meant by Strategy 2. This can solve the AVG problem by inserting a preset value.
Here's my attempt to grapple with the AVG problem:
(EXTEND (SUMMARIZE r1 PER(r2 MATCHING r1) : {x := AVG(y)}) : {rx := rel{tup{x x}}}){ALL BUT x}
UNION
EXTEND (r2 NOT MATCHING r1) : {rx := REL{x RAT}{}}Is that the kind of thing you meant?
I would expect SUMMARIZE PER to deliver the desired result on its own. In this construct I would expect to see SUMMARIZE BY with NOT MATCHING, EXTEND, UNION.
Please explain how you expect SUMMARIZE PER to do that on its own.
Yes, I could have use BY in the first operand but you were concerned with the PER variety and I wanted a more general solution.
Hugh
Quote from dandl on February 15, 2020, 11:22 pmQuote from Hugh on February 15, 2020, 1:10 pmQuote from dandl on February 15, 2020, 10:34 amSome time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Re strategy 1, I don't really understand "bundle up attributes that might go missing" but in any case I can't see how you think this solves the problem with the likes of AVG.
In the Supplier database, if you join S to P via SP, supplier S5 will be 'joined out' because there is no S5 in SP. A naive aggregation of QTY will omit S5, when our desired result is zero. SUMMARIZE PER constructs an RVA behind the scenes associating S5 with an empty list of parts supplied. The aggregation comes out right provided the aggregating function delivers the desired result when passed an empty list. In the image relation approach in DTATRM, that use of an RVA is explicit.
And no, it doesn't solve the problem of AVG unless AVG is defined to return a value (such as zero or NaN) for an empty list.
Re strategy 2, what do you mean by "right answer"?
It seems to me that all cases need the use of r1 UNION r2 in which one of the operands (r2, say) is an ad hoc arrangement generating tuples of the same type as those of r1.
Agreed. That's exactly what I meant by Strategy 2. This can solve the AVG problem by inserting a preset value.
Here's my attempt to grapple with the AVG problem:
(EXTEND (SUMMARIZE r1 PER(r2 MATCHING r1) : {x := AVG(y)}) : {rx := rel{tup{x x}}}){ALL BUT x}
UNION
EXTEND (r2 NOT MATCHING r1) : {rx := REL{x RAT}{}}Is that the kind of thing you meant?
I would expect SUMMARIZE PER to deliver the desired result on its own. In this construct I would expect to see SUMMARIZE BY with NOT MATCHING, EXTEND, UNION.
Please explain how you expect SUMMARIZE PER to do that on its own.
Yes, I could have use BY in the first operand but you were concerned with the PER variety and I wanted a more general solution.
Hugh
Quote from Dave Voorhis on February 16, 2020, 3:52 pmQuote from Dave Voorhis on February 16, 2020, 9:38 amQuote from dandl on February 16, 2020, 12:44 amQuote from Dave Voorhis on February 15, 2020, 11:47 pmQuote from dandl on February 15, 2020, 11:22 pmQuote from Hugh on February 15, 2020, 1:10 pmQuote from dandl on February 15, 2020, 10:34 amSome time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Re strategy 1, I don't really understand "bundle up attributes that might go missing" but in any case I can't see how you think this solves the problem with the likes of AVG.
In the Supplier database, if you join S to P via SP, supplier S5 will be 'joined out' because there is no S5 in SP. A naive aggregation of QTY will omit S5, when our desired result is zero. SUMMARIZE PER constructs an RVA behind the scenes associating S5 with an empty list of parts supplied. The aggregation comes out right provided the aggregating function delivers the desired result when passed an empty list. In the image relation approach in DTATRM, that use of an RVA is explicit.
And no, it doesn't solve the problem of AVG unless AVG is defined to return a value (such as zero or NaN) for an empty list.
Zero? Zero?
NaN, sure. Undefined, where the type is a union of RATIONAL and Undefined, sure.
But zero?!?
Abomination.
Not my problem, but at least the choice is up front.
But I still don't have an answer: apart from temp RVA and back fill UNION, is there another strategy for doing what SQL might handle with an outer join? I thought we discussed one, but I can't find anything.
Maybe it wasn't aggregation? DIVIDE? Should supplier S5 be treated differently in a query about "suppliers who supply every purple part"? The strategy in the PER form is just to take a different path (via S rather than SP).
I have a note to myself about 'outer join for substitution' but I don't know what it refers to. Any idea?
When this came up before, I think it's where we deviated into a discussion about how outer join could be implemented in Tutorial D.
I've always used a union of {the tuples with x → result, the tuples without x → indicator } where result & indicator are either an RVA, or a union type of result | indicator.
Also... There's a natural inclination to try to meaningfully combine valid aggregate results, like averages, with aggregate results where a result is meaningless -- like an average of nothing.
Perhaps it shouldn't be a natural inclination.
When I seem to be heading in that direction, I reexamine the requirements that lead to trying to combine meaningful results with meaningless results. The result is often a new UX design that completely separates the average-ables from the non-average-ables. That often turns out to be much more pleasant to use, because it emphasises a natural distinction between the two groups that is otherwise undesirably blurred by combining them.
Quote from Dave Voorhis on February 16, 2020, 9:38 amQuote from dandl on February 16, 2020, 12:44 amQuote from Dave Voorhis on February 15, 2020, 11:47 pmQuote from dandl on February 15, 2020, 11:22 pmQuote from Hugh on February 15, 2020, 1:10 pmQuote from dandl on February 15, 2020, 10:34 amSome time back we had some discussion around this topic. The question is how best to deal with queries where particular tuples are 'projected out' or 'joined out', but we want those lost tuples to contribute to the final result. SQL allows preserving the rows but filling in with NULLs; we don't allow that.
What I recall was broadly two main strategies. I had a feeling there were more, but I can't find them.
- Bundle up attributes that might go missing into an RVA, so missing data can be represented as an empty relation.
- Back fill the final query with the missing rows and pre-computed data.
Strategy 1 covers image relations. It also covers SUMMARIZE PER, which generates a temporary RVA behind the scenes. You can see this directly in the expansion offered by DBE. This one suffers from the difficulty that the aggregating function is required to return an acceptable value when passed an empty relation (what's the AVERAGE value of an empty argument list?). [SQL has a slightly different problem: the aggregating function may have to deal with NULLs.]
Strategy 2 is awkward to code, but gets the right result and is immune to the empty relation problem. I've seen variants of this used in SQL.
But are there others? I thought there were, but memory fails me.
As a side note, App-A glosses over the issue of SUMMARIZE, but again whatever the relcon looks like, it must have an RVA as per Strategy 1.
Re strategy 1, I don't really understand "bundle up attributes that might go missing" but in any case I can't see how you think this solves the problem with the likes of AVG.
In the Supplier database, if you join S to P via SP, supplier S5 will be 'joined out' because there is no S5 in SP. A naive aggregation of QTY will omit S5, when our desired result is zero. SUMMARIZE PER constructs an RVA behind the scenes associating S5 with an empty list of parts supplied. The aggregation comes out right provided the aggregating function delivers the desired result when passed an empty list. In the image relation approach in DTATRM, that use of an RVA is explicit.
And no, it doesn't solve the problem of AVG unless AVG is defined to return a value (such as zero or NaN) for an empty list.
Zero? Zero?
NaN, sure. Undefined, where the type is a union of RATIONAL and Undefined, sure.
But zero?!?
Abomination.
Not my problem, but at least the choice is up front.
But I still don't have an answer: apart from temp RVA and back fill UNION, is there another strategy for doing what SQL might handle with an outer join? I thought we discussed one, but I can't find anything.
Maybe it wasn't aggregation? DIVIDE? Should supplier S5 be treated differently in a query about "suppliers who supply every purple part"? The strategy in the PER form is just to take a different path (via S rather than SP).
I have a note to myself about 'outer join for substitution' but I don't know what it refers to. Any idea?
When this came up before, I think it's where we deviated into a discussion about how outer join could be implemented in Tutorial D.
I've always used a union of {the tuples with x → result, the tuples without x → indicator } where result & indicator are either an RVA, or a union type of result | indicator.
Also... There's a natural inclination to try to meaningfully combine valid aggregate results, like averages, with aggregate results where a result is meaningless -- like an average of nothing.
Perhaps it shouldn't be a natural inclination.
When I seem to be heading in that direction, I reexamine the requirements that lead to trying to combine meaningful results with meaningless results. The result is often a new UX design that completely separates the average-ables from the non-average-ables. That often turns out to be much more pleasant to use, because it emphasises a natural distinction between the two groups that is otherwise undesirably blurred by combining them.
Quote from dandl on February 16, 2020, 11:39 pmHere's my attempt to grapple with the AVG problem:
(EXTEND (SUMMARIZE r1 PER(r2 MATCHING r1) : {x := AVG(y)}) : {rx := rel{tup{x x}}}){ALL BUT x}
UNION
EXTEND (r2 NOT MATCHING r1) : {rx := REL{x RAT}{}}Is that the kind of thing you meant?
I would expect SUMMARIZE PER to deliver the desired result on its own. In this construct I would expect to see SUMMARIZE BY with NOT MATCHING, EXTEND, UNION.
Please explain how you expect SUMMARIZE PER to do that on its own.
Yes, I could have use BY in the first operand but you were concerned with the PER variety and I wanted a more general solution.
My poor phrasing: I meant I would have expected to see either the PER version on its own (because AVG already does the right thing) or the BY version plus UNION (because it does not). I don't understand the PER version used with UNION, and would appreciate some clarification.
Here's my attempt to grapple with the AVG problem:
(EXTEND (SUMMARIZE r1 PER(r2 MATCHING r1) : {x := AVG(y)}) : {rx := rel{tup{x x}}}){ALL BUT x}
UNION
EXTEND (r2 NOT MATCHING r1) : {rx := REL{x RAT}{}}Is that the kind of thing you meant?
I would expect SUMMARIZE PER to deliver the desired result on its own. In this construct I would expect to see SUMMARIZE BY with NOT MATCHING, EXTEND, UNION.
Please explain how you expect SUMMARIZE PER to do that on its own.
Yes, I could have use BY in the first operand but you were concerned with the PER variety and I wanted a more general solution.
My poor phrasing: I meant I would have expected to see either the PER version on its own (because AVG already does the right thing) or the BY version plus UNION (because it does not). I don't understand the PER version used with UNION, and would appreciate some clarification.
Quote from Hugh on February 17, 2020, 12:43 pmQuote from dandl on February 16, 2020, 11:39 pmHere's my attempt to grapple with the AVG problem:
(EXTEND (SUMMARIZE r1 PER(r2 MATCHING r1) : {x := AVG(y)}) : {rx := rel{tup{x x}}}){ALL BUT x}
UNION
EXTEND (r2 NOT MATCHING r1) : {rx := REL{x RAT}{}}Is that the kind of thing you meant?
I would expect SUMMARIZE PER to deliver the desired result on its own. In this construct I would expect to see SUMMARIZE BY with NOT MATCHING, EXTEND, UNION.
Please explain how you expect SUMMARIZE PER to do that on its own.
Yes, I could have use BY in the first operand but you were concerned with the PER variety and I wanted a more general solution.
My poor phrasing: I meant I would have expected to see either the PER version on its own (because AVG already does the right thing) or the BY version plus UNION (because it does not). I don't understand the PER version used with UNION, and would appreciate some clarification.
SUMMARIZE r1 PER(r2 MATCHING r1) : {x := AVG(y)} takes the average y value for all r1 tuples for which r2 has a match. It is equivalent to SUMMARIZE r1 BY attrs in the case where every r1 tuple matches some r2 tuple, otherwise not equivalent. Call this expression t1. Then:
EXTEND t1 : {rx := rel{tup{x x}}}){ALL BUT x} adds attribute rx of type RELATION{x RATIONAL} to t1. The rx value in each tuple of the result is the singleton relation whose x value is the x value in that tuple of t1. The projection {ALL BUT x} removes the x attribute. The result, t2, has heading the attributes of r2 plus rx of the type I gave.
EXTEND (r2 NOT MATCHING r1) : {rx := REL{x RAT}{}} takes those r2 tuples that have no counterpart in t2 and extends them with an rx value of the same type as rx in t2, that value being an empty relation. Call this t3. Then t2 UNION t3 shows all the average y values that are shown in t2 plus all the cases where no such value exists.
What more can I say? Well this: I have no idea what you mean by "the PER version on its own (because AVG already does the right thing)". In my book, AVG does the "right thing" to a nonempty list of numbers but not to the empty list because it will raise an exception. The BY version of TD's SUMMARIZE is irrelevant.
Hugh
Quote from dandl on February 16, 2020, 11:39 pmHere's my attempt to grapple with the AVG problem:
(EXTEND (SUMMARIZE r1 PER(r2 MATCHING r1) : {x := AVG(y)}) : {rx := rel{tup{x x}}}){ALL BUT x}
UNION
EXTEND (r2 NOT MATCHING r1) : {rx := REL{x RAT}{}}Is that the kind of thing you meant?
I would expect SUMMARIZE PER to deliver the desired result on its own. In this construct I would expect to see SUMMARIZE BY with NOT MATCHING, EXTEND, UNION.
Please explain how you expect SUMMARIZE PER to do that on its own.
Yes, I could have use BY in the first operand but you were concerned with the PER variety and I wanted a more general solution.
My poor phrasing: I meant I would have expected to see either the PER version on its own (because AVG already does the right thing) or the BY version plus UNION (because it does not). I don't understand the PER version used with UNION, and would appreciate some clarification.
SUMMARIZE r1 PER(r2 MATCHING r1) : {x := AVG(y)} takes the average y value for all r1 tuples for which r2 has a match. It is equivalent to SUMMARIZE r1 BY attrs in the case where every r1 tuple matches some r2 tuple, otherwise not equivalent. Call this expression t1. Then:
EXTEND t1 : {rx := rel{tup{x x}}}){ALL BUT x} adds attribute rx of type RELATION{x RATIONAL} to t1. The rx value in each tuple of the result is the singleton relation whose x value is the x value in that tuple of t1. The projection {ALL BUT x} removes the x attribute. The result, t2, has heading the attributes of r2 plus rx of the type I gave.
EXTEND (r2 NOT MATCHING r1) : {rx := REL{x RAT}{}} takes those r2 tuples that have no counterpart in t2 and extends them with an rx value of the same type as rx in t2, that value being an empty relation. Call this t3. Then t2 UNION t3 shows all the average y values that are shown in t2 plus all the cases where no such value exists.
What more can I say? Well this: I have no idea what you mean by "the PER version on its own (because AVG already does the right thing)". In my book, AVG does the "right thing" to a nonempty list of numbers but not to the empty list because it will raise an exception. The BY version of TD's SUMMARIZE is irrelevant.
Hugh