Generalized transitive closure explanations?
Quote from johnwcowan on October 8, 2019, 7:51 pmOther than what's in VSS 6, is there any informal explanation of generalized transitive closure with clear examples? All I can find is incomprehensibly formal papers, which may help if you already have an informal understanding — but I don't.
Thanks.
Other than what's in VSS 6, is there any informal explanation of generalized transitive closure with clear examples? All I can find is incomprehensibly formal papers, which may help if you already have an informal understanding — but I don't.
Thanks.
Quote from AntC on October 8, 2019, 8:50 pmIn general [**], each of the Pres/Pros/VSSs are discussed/expanded in a corresponding Section in DTATRM. For VSS 6 it's in Chapter 10 starting p236 in the online PDF. With a clear example.
[**] you have to be a little careful, because DTATRM's discussion is around an out-of-date version of the TTM doco. Some of the wording has changed since, and indeed whole numbered points moved/renumbered. VSS 6 seems to have remained stable.
There's been plenty of discussion of Generalised Transitive Closure on the forum over the years. Alack I fear it's all in the write-only archive. Informally, what characterises GTC over Tutorial D's
TCLOSE
is the possibility of generating fresh tuples with fresh attribute values not originally in the based-on relations, then needing to fold those into the result recursively.
In general [**], each of the Pres/Pros/VSSs are discussed/expanded in a corresponding Section in DTATRM. For VSS 6 it's in Chapter 10 starting p236 in the online PDF. With a clear example.
[**] you have to be a little careful, because DTATRM's discussion is around an out-of-date version of the TTM doco. Some of the wording has changed since, and indeed whole numbered points moved/renumbered. VSS 6 seems to have remained stable.
There's been plenty of discussion of Generalised Transitive Closure on the forum over the years. Alack I fear it's all in the write-only archive. Informally, what characterises GTC over Tutorial D's TCLOSE
is the possibility of generating fresh tuples with fresh attribute values not originally in the based-on relations, then needing to fold those into the result recursively.
Quote from johnwcowan on October 8, 2019, 8:53 pmI did in fact read that as closely as I could, but it's not clear enough for me. Further explanations would be very helpful.
I did in fact read that as closely as I could, but it's not clear enough for me. Further explanations would be very helpful.
Quote from Vadim on October 8, 2019, 10:06 pmhttps://vadimtropashko.files.wordpress.com/2014/01/book_sql_chap6_v1.pdf, page 25?
https://vadimtropashko.files.wordpress.com/2014/01/book_sql_chap6_v1.pdf, page 25?
Quote from dandl on October 8, 2019, 11:24 pmGTC is simple enough. It's simply a combination of the TC in Tutorial D with aggregation, allowing one to compute the 'value' of a path through a suitably constructed relational value. Here is what I wrote previously:
GTC is more than TC and less than DTARM.
DTATRM p 239 proposes that GTC requires two relations and two operators. IMO the minimum required for GTC is one relation and one operator:
- An input relation R with 3 attributes { X T, Y T, Z C } where
- X and Y of the same type represent a path from X to Y
- Z is the ‘value’ of that path.
- A dyadic operator F over type C that computes a new value for Z as two paths are combined to add a new one.
The output is R with new tuples added. The final summation of paths is easily handled by generalised aggregation over R. The second operator in DTATRM is not needed as part of GTC.
IMO what is presented as GTC is not very generalised and in point of fact not all that useful. [Which may be why TD does not implement it.] There is a far more useful construct called
while
which is described in some detail in Alice, and implemented in Andl.The
while
operator is the one major feature that can be added to the first order RA without the need for a type system. I'm surprised it's not better known. Essentially it introduces fixpoint recursion to allow new tuples to be added to a relation expanding paths represented by self-joins. Both TC and GTC can be expressed as special cases ofwhile.
GTC is simple enough. It's simply a combination of the TC in Tutorial D with aggregation, allowing one to compute the 'value' of a path through a suitably constructed relational value. Here is what I wrote previously:
GTC is more than TC and less than DTARM.
DTATRM p 239 proposes that GTC requires two relations and two operators. IMO the minimum required for GTC is one relation and one operator:
- An input relation R with 3 attributes { X T, Y T, Z C } where
- X and Y of the same type represent a path from X to Y
- Z is the ‘value’ of that path.
- A dyadic operator F over type C that computes a new value for Z as two paths are combined to add a new one.
The output is R with new tuples added. The final summation of paths is easily handled by generalised aggregation over R. The second operator in DTATRM is not needed as part of GTC.
IMO what is presented as GTC is not very generalised and in point of fact not all that useful. [Which may be why TD does not implement it.] There is a far more useful construct called while
which is described in some detail in Alice, and implemented in Andl.
The while
operator is the one major feature that can be added to the first order RA without the need for a type system. I'm surprised it's not better known. Essentially it introduces fixpoint recursion to allow new tuples to be added to a relation expanding paths represented by self-joins. Both TC and GTC can be expressed as special cases of while.
Quote from AntC on October 9, 2019, 3:30 amQuote from dandl on October 8, 2019, 11:24 pm...
IMO what is presented as GTC is not very generalised and in point of fact not all that useful.
Indeed. If we're thinking outside of TTM, I'm of the school of: let's use the right tool for the job. Expressing not-very-GTC in a D " is like a dog's walking on his hind legs. It is not done well; but you are surprised to find it done at all." [Dr Johnson]
If you have a tool like Google's map-reduce and/or lambda expressions and/or maps and folds, use the database engine to give you a stream of raw tuples from the database, do the rest natively in some Higher-Order language. For full functionality, you'll probably need the credit card transform aka Tying the knot.
I've a strong suspicion such a technique was anticipated in Hall, Hitchcock, Todd 1975 -- but probably they kept it sotto voce for fear of scaring the horses.
Quote from dandl on October 8, 2019, 11:24 pm...
IMO what is presented as GTC is not very generalised and in point of fact not all that useful.
Indeed. If we're thinking outside of TTM, I'm of the school of: let's use the right tool for the job. Expressing not-very-GTC in a D " is like a dog's walking on his hind legs. It is not done well; but you are surprised to find it done at all." [Dr Johnson]
If you have a tool like Google's map-reduce and/or lambda expressions and/or maps and folds, use the database engine to give you a stream of raw tuples from the database, do the rest natively in some Higher-Order language. For full functionality, you'll probably need the credit card transform aka Tying the knot.
I've a strong suspicion such a technique was anticipated in Hall, Hitchcock, Todd 1975 -- but probably they kept it sotto voce for fear of scaring the horses.
Quote from AntC on October 9, 2019, 4:54 amQuote from johnwcowan on October 8, 2019, 8:53 pmI did in fact read that as closely as I could, but it's not clear enough for me. Further explanations would be very helpful.
Take the airline routes example from the end of that section in DTATRM. To clarify what it's saying about the relvars: "... the arcs denote direct flights, ...".
We want to travel from A to D, but there's no direct flight. We can go via B or via C. It's possible the via-B route has the same flying time, same departure and arrival times as the via-C route. So as well as joining the A-B sector to the B-D sector to get total flight time and depart-A time and arrive-D time; and similarly joining A-C to C-D sectors, we need to generate some extra attribute that distinguishes those routes. It's not good enough to capture the intermediate node (via-B or via-C) because in general there'll be an arbitrary number of intermediates. Note that with
TCLOSE
all we could get would be a result tuple with A-D nodes; we couldn't compute total flying time; nor overall departure and arrival times, because all we can do isJOIN
tuples from the base relvars.There's also possibly routes A-B-C-D, A-E-F-G-D, etc. Probably they're slower/later/more costly than either A-B-D or A-C-D, but we don't know until we've formed the transitive closure with those aggregates. To form those possible routes, we need to join A-B-C (which is not a direct flight), then treat that as a tuple to join to C-D.
Quote from johnwcowan on October 8, 2019, 8:53 pmI did in fact read that as closely as I could, but it's not clear enough for me. Further explanations would be very helpful.
Take the airline routes example from the end of that section in DTATRM. To clarify what it's saying about the relvars: "... the arcs denote direct flights, ...".
We want to travel from A to D, but there's no direct flight. We can go via B or via C. It's possible the via-B route has the same flying time, same departure and arrival times as the via-C route. So as well as joining the A-B sector to the B-D sector to get total flight time and depart-A time and arrive-D time; and similarly joining A-C to C-D sectors, we need to generate some extra attribute that distinguishes those routes. It's not good enough to capture the intermediate node (via-B or via-C) because in general there'll be an arbitrary number of intermediates. Note that with TCLOSE
all we could get would be a result tuple with A-D nodes; we couldn't compute total flying time; nor overall departure and arrival times, because all we can do is JOIN
tuples from the base relvars.
There's also possibly routes A-B-C-D, A-E-F-G-D, etc. Probably they're slower/later/more costly than either A-B-D or A-C-D, but we don't know until we've formed the transitive closure with those aggregates. To form those possible routes, we need to join A-B-C (which is not a direct flight), then treat that as a tuple to join to C-D.
Quote from Dave Voorhis on October 9, 2019, 6:12 amQuote from dandl on October 8, 2019, 11:24 pmGTC is simple enough. It's simply a combination of the TC in Tutorial D with aggregation, allowing one to compute the 'value' of a path through a suitably constructed relational value. Here is what I wrote previously:
GTC is more than TC and less than DTARM.
DTATRM p 239 proposes that GTC requires two relations and two operators. IMO the minimum required for GTC is one relation and one operator:
- An input relation R with 3 attributes { X T, Y T, Z C } where
- X and Y of the same type represent a path from X to Y
- Z is the ‘value’ of that path.
- A dyadic operator F over type C that computes a new value for Z as two paths are combined to add a new one.
The output is R with new tuples added. The final summation of paths is easily handled by generalised aggregation over R. The second operator in DTATRM is not needed as part of GTC.
IMO what is presented as GTC is not very generalised and in point of fact not all that useful. [Which may be why TD does not implement it.] There is a far more useful construct called
while
which is described in some detail in Alice, and implemented in Andl.The
while
operator is the one major feature that can be added to the first order RA without the need for a type system. I'm surprised it's not better known. Essentially it introduces fixpoint recursion to allow new tuples to be added to a relation expanding paths represented by self-joins. Both TC and GTC can be expressed as special cases ofwhile.
In a conventional imperative language like Tutorial D, being able to define conventional recursive operators obviates the need for
while
. Likewise for general transitive closure.
Quote from dandl on October 8, 2019, 11:24 pmGTC is simple enough. It's simply a combination of the TC in Tutorial D with aggregation, allowing one to compute the 'value' of a path through a suitably constructed relational value. Here is what I wrote previously:
GTC is more than TC and less than DTARM.
DTATRM p 239 proposes that GTC requires two relations and two operators. IMO the minimum required for GTC is one relation and one operator:
- An input relation R with 3 attributes { X T, Y T, Z C } where
- X and Y of the same type represent a path from X to Y
- Z is the ‘value’ of that path.
- A dyadic operator F over type C that computes a new value for Z as two paths are combined to add a new one.
The output is R with new tuples added. The final summation of paths is easily handled by generalised aggregation over R. The second operator in DTATRM is not needed as part of GTC.
IMO what is presented as GTC is not very generalised and in point of fact not all that useful. [Which may be why TD does not implement it.] There is a far more useful construct called
while
which is described in some detail in Alice, and implemented in Andl.The
while
operator is the one major feature that can be added to the first order RA without the need for a type system. I'm surprised it's not better known. Essentially it introduces fixpoint recursion to allow new tuples to be added to a relation expanding paths represented by self-joins. Both TC and GTC can be expressed as special cases ofwhile.
In a conventional imperative language like Tutorial D, being able to define conventional recursive operators obviates the need for while
. Likewise for general transitive closure.
Quote from dandl on October 9, 2019, 7:26 amQuote from Dave Voorhis on October 9, 2019, 6:12 amQuote from dandl on October 8, 2019, 11:24 pmGTC is simple enough. It's simply a combination of the TC in Tutorial D with aggregation, allowing one to compute the 'value' of a path through a suitably constructed relational value. Here is what I wrote previously:
GTC is more than TC and less than DTARM.
DTATRM p 239 proposes that GTC requires two relations and two operators. IMO the minimum required for GTC is one relation and one operator:
- An input relation R with 3 attributes { X T, Y T, Z C } where
- X and Y of the same type represent a path from X to Y
- Z is the ‘value’ of that path.
- A dyadic operator F over type C that computes a new value for Z as two paths are combined to add a new one.
The output is R with new tuples added. The final summation of paths is easily handled by generalised aggregation over R. The second operator in DTATRM is not needed as part of GTC.
IMO what is presented as GTC is not very generalised and in point of fact not all that useful. [Which may be why TD does not implement it.] There is a far more useful construct called
while
which is described in some detail in Alice, and implemented in Andl.The
while
operator is the one major feature that can be added to the first order RA without the need for a type system. I'm surprised it's not better known. Essentially it introduces fixpoint recursion to allow new tuples to be added to a relation expanding paths represented by self-joins. Both TC and GTC can be expressed as special cases ofwhile.
In a conventional imperative language like Tutorial D, being able to define conventional recursive operators obviates the need for
while
. Likewise for general transitive closure.
In a conventional imperative language like Tutorial D, being able to define conventional recursive operators provides a useful alternative to
while
, with many of the benefits, some particular strengths but also some shortcomings. Likewise for general transitive closure.TD itself can only express recursion explicitly, by coding an algorithm in terms of explicit use of the type system, so in principle placing some query optimisations out of reach.
It cannot code for any kind of generalised operator, or generalised aggregation in conjunction with transitive closure., or AFAIK queries that are provably safe (guaranteed to terminate). A more powerful Industrial D might, but does not yet exist.
Quote from Dave Voorhis on October 9, 2019, 6:12 amQuote from dandl on October 8, 2019, 11:24 pmGTC is simple enough. It's simply a combination of the TC in Tutorial D with aggregation, allowing one to compute the 'value' of a path through a suitably constructed relational value. Here is what I wrote previously:
GTC is more than TC and less than DTARM.
DTATRM p 239 proposes that GTC requires two relations and two operators. IMO the minimum required for GTC is one relation and one operator:
- An input relation R with 3 attributes { X T, Y T, Z C } where
- X and Y of the same type represent a path from X to Y
- Z is the ‘value’ of that path.
- A dyadic operator F over type C that computes a new value for Z as two paths are combined to add a new one.
The output is R with new tuples added. The final summation of paths is easily handled by generalised aggregation over R. The second operator in DTATRM is not needed as part of GTC.
IMO what is presented as GTC is not very generalised and in point of fact not all that useful. [Which may be why TD does not implement it.] There is a far more useful construct called
while
which is described in some detail in Alice, and implemented in Andl.The
while
operator is the one major feature that can be added to the first order RA without the need for a type system. I'm surprised it's not better known. Essentially it introduces fixpoint recursion to allow new tuples to be added to a relation expanding paths represented by self-joins. Both TC and GTC can be expressed as special cases ofwhile.
In a conventional imperative language like Tutorial D, being able to define conventional recursive operators obviates the need for
while
. Likewise for general transitive closure.
In a conventional imperative language like Tutorial D, being able to define conventional recursive operators provides a useful alternative to while
, with many of the benefits, some particular strengths but also some shortcomings. Likewise for general transitive closure.
TD itself can only express recursion explicitly, by coding an algorithm in terms of explicit use of the type system, so in principle placing some query optimisations out of reach.
It cannot code for any kind of generalised operator, or generalised aggregation in conjunction with transitive closure., or AFAIK queries that are provably safe (guaranteed to terminate). A more powerful Industrial D might, but does not yet exist.
Quote from Dave Voorhis on October 9, 2019, 7:36 amQuote from dandl on October 9, 2019, 7:26 amQuote from Dave Voorhis on October 9, 2019, 6:12 amQuote from dandl on October 8, 2019, 11:24 pmGTC is simple enough. It's simply a combination of the TC in Tutorial D with aggregation, allowing one to compute the 'value' of a path through a suitably constructed relational value. Here is what I wrote previously:
GTC is more than TC and less than DTARM.
DTATRM p 239 proposes that GTC requires two relations and two operators. IMO the minimum required for GTC is one relation and one operator:
- An input relation R with 3 attributes { X T, Y T, Z C } where
- X and Y of the same type represent a path from X to Y
- Z is the ‘value’ of that path.
- A dyadic operator F over type C that computes a new value for Z as two paths are combined to add a new one.
The output is R with new tuples added. The final summation of paths is easily handled by generalised aggregation over R. The second operator in DTATRM is not needed as part of GTC.
IMO what is presented as GTC is not very generalised and in point of fact not all that useful. [Which may be why TD does not implement it.] There is a far more useful construct called
while
which is described in some detail in Alice, and implemented in Andl.The
while
operator is the one major feature that can be added to the first order RA without the need for a type system. I'm surprised it's not better known. Essentially it introduces fixpoint recursion to allow new tuples to be added to a relation expanding paths represented by self-joins. Both TC and GTC can be expressed as special cases ofwhile.
In a conventional imperative language like Tutorial D, being able to define conventional recursive operators obviates the need for
while
. Likewise for general transitive closure.
In a conventional imperative language like Tutorial D, being able to define conventional recursive operators provides a useful alternative to
while
, with many of the benefits, some particular strengths but also some shortcomings. Likewise for general transitive closure.TD itself can only express recursion explicitly, by coding an algorithm in terms of explicit use of the type system, so in principle placing some query optimisations out of reach.
It cannot code for any kind of generalised operator, or generalised aggregation in conjunction with transitive closure., or AFAIK queries that are provably safe (guaranteed to terminate). A more powerful Industrial D might, but does not yet exist.
These are always tradeoffs. For example, queries that provably terminate either undesirably (for some) constrains the language or requires an (unsolvable) solution to the halting problem.
Take your pick. Etc.
Quote from dandl on October 9, 2019, 7:26 amQuote from Dave Voorhis on October 9, 2019, 6:12 amQuote from dandl on October 8, 2019, 11:24 pmGTC is simple enough. It's simply a combination of the TC in Tutorial D with aggregation, allowing one to compute the 'value' of a path through a suitably constructed relational value. Here is what I wrote previously:
GTC is more than TC and less than DTARM.
DTATRM p 239 proposes that GTC requires two relations and two operators. IMO the minimum required for GTC is one relation and one operator:
- An input relation R with 3 attributes { X T, Y T, Z C } where
- X and Y of the same type represent a path from X to Y
- Z is the ‘value’ of that path.
- A dyadic operator F over type C that computes a new value for Z as two paths are combined to add a new one.
The output is R with new tuples added. The final summation of paths is easily handled by generalised aggregation over R. The second operator in DTATRM is not needed as part of GTC.
IMO what is presented as GTC is not very generalised and in point of fact not all that useful. [Which may be why TD does not implement it.] There is a far more useful construct called
while
which is described in some detail in Alice, and implemented in Andl.The
while
operator is the one major feature that can be added to the first order RA without the need for a type system. I'm surprised it's not better known. Essentially it introduces fixpoint recursion to allow new tuples to be added to a relation expanding paths represented by self-joins. Both TC and GTC can be expressed as special cases ofwhile.
In a conventional imperative language like Tutorial D, being able to define conventional recursive operators obviates the need for
while
. Likewise for general transitive closure.
In a conventional imperative language like Tutorial D, being able to define conventional recursive operators provides a useful alternative to
while
, with many of the benefits, some particular strengths but also some shortcomings. Likewise for general transitive closure.TD itself can only express recursion explicitly, by coding an algorithm in terms of explicit use of the type system, so in principle placing some query optimisations out of reach.
It cannot code for any kind of generalised operator, or generalised aggregation in conjunction with transitive closure., or AFAIK queries that are provably safe (guaranteed to terminate). A more powerful Industrial D might, but does not yet exist.
These are always tradeoffs. For example, queries that provably terminate either undesirably (for some) constrains the language or requires an (unsolvable) solution to the halting problem.
Take your pick. Etc.