The Forum for Discussion about The Third Manifesto and Related Matters

Please or Register to create posts and topics.

Does TRANCLO have to be recursive?

Reading TRANCLO in DTATRM I realised that this is tail recursion. Surely it's better to write it iteratively?

I don't know the syntax well enough, but as a C for-loop it would look something like this:

FOR (TTT := XY UNION ( ( XY RENAME ( Y AS Z ) ) COMPOSE (XY RENAME ( X AS Z ) ) ) ; TTT != XY; XY := TTT) {}
RETURN XY;

 

Andl - A New Database Language - andl.org
Quote from dandl on June 10, 2020, 10:38 am

Reading TRANCLO in DTATRM I realised that this is tail recursion. Surely it's better to write it iteratively?

I don't know the syntax well enough, but as a C for-loop it would look something like this:

FOR (TTT := XY UNION ( ( XY RENAME ( Y AS Z ) ) COMPOSE (XY RENAME ( X AS Z ) ) ) ; TTT != XY; XY := TTT) {}
RETURN XY;

Rel user donated this, some time ago:

OPERATOR TCLOSE(r RELATION {x INT, y INT}) RETURNS SAME_TYPE_AS(r);
  VAR result PRIVATE INIT(r) KEY {x, y};
  VAR work PRIVATE INIT(r) KEY {x, y};
  WHILE NOT IS_EMPTY(work);
    work := ((r RENAME {y AS z}) COMPOSE (work RENAME {x AS z})) MINUS result;
    INSERT result work;
  END WHILE;
  RETURN result;
END OPERATOR;

(I hope he doesn't mind my pasting it here without attribution to preserve privacy, which I'm happy to add upon request.)

There was a time when every undergraduate computer science student would be asked to do exercises around deciding whether tail-call recursion should be left or converted to an iterative equivalent -- sometimes using transitive closure as an example -- and to defend which they thought better. Full marks were only available for those who took care to define "better", of course, with added marks to those who pointed out that better readability might not be compatible with better performance.

If the host language supports tail-call optimisation, you needn't dispense with recursion if it's otherwise acceptably performant and more readable.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on June 10, 2020, 12:56 pm
Quote from dandl on June 10, 2020, 10:38 am

Reading TRANCLO in DTATRM I realised that this is tail recursion. Surely it's better to write it iteratively?

I don't know the syntax well enough, but as a C for-loop it would look something like this:

FOR (TTT := XY UNION ( ( XY RENAME ( Y AS Z ) ) COMPOSE (XY RENAME ( X AS Z ) ) ) ; TTT != XY; XY := TTT) {}
RETURN XY;

Rel user donated this, some time ago:

OPERATOR TCLOSE(r RELATION {x INT, y INT}) RETURNS SAME_TYPE_AS(r);
VAR result PRIVATE INIT(r) KEY {x, y};
VAR work PRIVATE INIT(r) KEY {x, y};
WHILE NOT IS_EMPTY(work);
work := ((r RENAME {y AS z}) COMPOSE (work RENAME {x AS z})) MINUS result;
INSERT result work;
END WHILE;
RETURN result;
END OPERATOR;
OPERATOR TCLOSE(r RELATION {x INT, y INT}) RETURNS SAME_TYPE_AS(r); VAR result PRIVATE INIT(r) KEY {x, y}; VAR work PRIVATE INIT(r) KEY {x, y}; WHILE NOT IS_EMPTY(work); work := ((r RENAME {y AS z}) COMPOSE (work RENAME {x AS z})) MINUS result; INSERT result work; END WHILE; RETURN result; END OPERATOR;
OPERATOR TCLOSE(r RELATION {x INT, y INT}) RETURNS SAME_TYPE_AS(r);
  VAR result PRIVATE INIT(r) KEY {x, y};
  VAR work PRIVATE INIT(r) KEY {x, y};
  WHILE NOT IS_EMPTY(work);
    work := ((r RENAME {y AS z}) COMPOSE (work RENAME {x AS z})) MINUS result;
    INSERT result work;
  END WHILE;
  RETURN result;
END OPERATOR;

(I hope he doesn't mind my pasting it here without attribution to preserve privacy, which I'm happy to add upon request.)

There was a time when every undergraduate computer science student would be asked to do exercises around deciding whether tail-call recursion should be left or converted to an iterative equivalent -- sometimes using transitive closure as an example -- and to defend which they thought better. Full marks were only available for those who took care to define "better", of course, with added marks to those who pointed out that better readability might not be compatible with better performance.

If the host language supports tail-call optimisation, you needn't dispense with recursion if it's otherwise acceptably performant and more readable.

Thanks, that's helpful. [Are you sure it's correct? Should the r RENAME be work RENAME? Does it make a difference? How would you test it?]

I've nothing against recursion, and often it makes for better code. However here it looks like it could run into problems with copying large relation values, which an iterative solution like this one will naturally avoid.

Out of curiosity, how is it implemented in Rel? Using this algorithm, or something at a lower level?

Andl - A New Database Language - andl.org
Quote from dandl on June 10, 2020, 1:52 pm
Quote from Dave Voorhis on June 10, 2020, 12:56 pm
Quote from dandl on June 10, 2020, 10:38 am

Reading TRANCLO in DTATRM I realised that this is tail recursion. Surely it's better to write it iteratively?

I don't know the syntax well enough, but as a C for-loop it would look something like this:

FOR (TTT := XY UNION ( ( XY RENAME ( Y AS Z ) ) COMPOSE (XY RENAME ( X AS Z ) ) ) ; TTT != XY; XY := TTT) {}
RETURN XY;

Rel user donated this, some time ago:

OPERATOR TCLOSE(r RELATION {x INT, y INT}) RETURNS SAME_TYPE_AS(r);
VAR result PRIVATE INIT(r) KEY {x, y};
VAR work PRIVATE INIT(r) KEY {x, y};
WHILE NOT IS_EMPTY(work);
work := ((r RENAME {y AS z}) COMPOSE (work RENAME {x AS z})) MINUS result;
INSERT result work;
END WHILE;
RETURN result;
END OPERATOR;
OPERATOR TCLOSE(r RELATION {x INT, y INT}) RETURNS SAME_TYPE_AS(r); VAR result PRIVATE INIT(r) KEY {x, y}; VAR work PRIVATE INIT(r) KEY {x, y}; WHILE NOT IS_EMPTY(work); work := ((r RENAME {y AS z}) COMPOSE (work RENAME {x AS z})) MINUS result; INSERT result work; END WHILE; RETURN result; END OPERATOR;
OPERATOR TCLOSE(r RELATION {x INT, y INT}) RETURNS SAME_TYPE_AS(r);
  VAR result PRIVATE INIT(r) KEY {x, y};
  VAR work PRIVATE INIT(r) KEY {x, y};
  WHILE NOT IS_EMPTY(work);
    work := ((r RENAME {y AS z}) COMPOSE (work RENAME {x AS z})) MINUS result;
    INSERT result work;
  END WHILE;
  RETURN result;
END OPERATOR;

(I hope he doesn't mind my pasting it here without attribution to preserve privacy, which I'm happy to add upon request.)

There was a time when every undergraduate computer science student would be asked to do exercises around deciding whether tail-call recursion should be left or converted to an iterative equivalent -- sometimes using transitive closure as an example -- and to defend which they thought better. Full marks were only available for those who took care to define "better", of course, with added marks to those who pointed out that better readability might not be compatible with better performance.

If the host language supports tail-call optimisation, you needn't dispense with recursion if it's otherwise acceptably performant and more readable.

Thanks, that's helpful. [Are you sure it's correct? Should the r RENAME be work RENAME? Does it make a difference? How would you test it?]

I'm not sure it's correct. I dimly recall trying it after I was sent it, but I don't recall whether I had to fix it or not.

It's straightforward to create test data to run it.

I've nothing against recursion, and often it makes for better code.

I'm going to have to deduct marks -- you haven't defined "better".

However here it looks like it could run into problems with copying large relation values, which an iterative solution like this one will naturally avoid.

Out of curiosity, how is it implemented in Rel? Using this algorithm, or something at a lower level?

In Rel, it's recursive, based on code from DTATRM, and implemented by the tclose Java method from Rel's relational engine:

private static ValueRelation tclose(Generator generator, JoinMap joinMap, AttributeMap projectMap, ValueRelation xy) {
  ValueRelation ttt = xy.union((ValueRelation) xy.join(generator.getDatabase(), joinMap, xy).project(projectMap));
  return (ttt.eq(xy)) ? ttt : tclose(generator, joinMap, projectMap, ttt);
}

public Value tclose() {
  Type attributeType = TypeInteger.getInstance(); // type is irrelevant here; any will do
  Heading left = new Heading();
  left.add("X", attributeType);
  left.add("LINK", attributeType);
  Heading right = new Heading();
  right.add("LINK", attributeType);
  right.add("Y", attributeType);
  Heading joinTarget = new Heading();
  joinTarget.add("X", attributeType);
  joinTarget.add("LINK", attributeType);
  joinTarget.add("Y", attributeType);
  Heading projectTarget = new Heading();
  projectTarget.add("X", attributeType);
  projectTarget.add("Y", attributeType);
  JoinMap joinMap = new JoinMap(joinTarget, left, right);
  AttributeMap projectMap = new AttributeMap(projectTarget, joinTarget);
  return tclose(getGenerator(), joinMap, projectMap, this);
}

You'll note the similarities between it and the example definition of TCLOSE called TRANCLO in Chapter 6 of DTATRM:

OPERATOR TRANCLO ( XY RELATION { X P#, Y P# } ) RETURNS RELATION { X P#, Y P# }; 
   RETURN ( WITH ( XY UNION ( ( XY RENAME (Y AS Z) ) COMPOSE ( XY RENAME (X AS Z) ))) AS TTT: 
      IF TTT = XY THEN TTT ELSE TRANCLO ( TTT ) END IF ); 
END OPERATOR;

 

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on June 10, 2020, 3:59 pm
Quote from dandl on June 10, 2020, 1:52 pm
Quote from Dave Voorhis on June 10, 2020, 12:56 pm
Quote from dandl on June 10, 2020, 10:38 am

Reading TRANCLO in DTATRM I realised that this is tail recursion. Surely it's better to write it iteratively?

I don't know the syntax well enough, but as a C for-loop it would look something like this:

FOR (TTT := XY UNION ( ( XY RENAME ( Y AS Z ) ) COMPOSE (XY RENAME ( X AS Z ) ) ) ; TTT != XY; XY := TTT) {}
RETURN XY;

Rel user donated this, some time ago:

OPERATOR TCLOSE(r RELATION {x INT, y INT}) RETURNS SAME_TYPE_AS(r);
VAR result PRIVATE INIT(r) KEY {x, y};
VAR work PRIVATE INIT(r) KEY {x, y};
WHILE NOT IS_EMPTY(work);
work := ((r RENAME {y AS z}) COMPOSE (work RENAME {x AS z})) MINUS result;
INSERT result work;
END WHILE;
RETURN result;
END OPERATOR;
OPERATOR TCLOSE(r RELATION {x INT, y INT}) RETURNS SAME_TYPE_AS(r); VAR result PRIVATE INIT(r) KEY {x, y}; VAR work PRIVATE INIT(r) KEY {x, y}; WHILE NOT IS_EMPTY(work); work := ((r RENAME {y AS z}) COMPOSE (work RENAME {x AS z})) MINUS result; INSERT result work; END WHILE; RETURN result; END OPERATOR;
OPERATOR TCLOSE(r RELATION {x INT, y INT}) RETURNS SAME_TYPE_AS(r);
  VAR result PRIVATE INIT(r) KEY {x, y};
  VAR work PRIVATE INIT(r) KEY {x, y};
  WHILE NOT IS_EMPTY(work);
    work := ((r RENAME {y AS z}) COMPOSE (work RENAME {x AS z})) MINUS result;
    INSERT result work;
  END WHILE;
  RETURN result;
END OPERATOR;

(I hope he doesn't mind my pasting it here without attribution to preserve privacy, which I'm happy to add upon request.)

There was a time when every undergraduate computer science student would be asked to do exercises around deciding whether tail-call recursion should be left or converted to an iterative equivalent -- sometimes using transitive closure as an example -- and to defend which they thought better. Full marks were only available for those who took care to define "better", of course, with added marks to those who pointed out that better readability might not be compatible with better performance.

If the host language supports tail-call optimisation, you needn't dispense with recursion if it's otherwise acceptably performant and more readable.

Thanks, that's helpful. [Are you sure it's correct? Should the r RENAME be work RENAME? Does it make a difference? How would you test it?]

I'm not sure it's correct. I dimly recall trying it after I was sent it, but I don't recall whether I had to fix it or not.

It's straightforward to create test data to run it.

Really? If I'm right and that's a bug, it will only fail on data constructed in a particular way, and s far I can't figure out how. Perhaps you can help.

I've nothing against recursion, and often it makes for better code.

I'm going to have to deduct marks -- you haven't defined "better".

It probably isn't that sensitive to the particular definition, but FWIW mine is: (a) correct (b) readable so I can be sure of (a) and (c) satisfies other externally imposed constraints such as on time, within budget, performance, quality/coding standards. Occasionally (c) takes priority over (b).

However here it looks like it could run into problems with copying large relation values, which an iterative solution like this one will naturally avoid.

Out of curiosity, how is it implemented in Rel? Using this algorithm, or something at a lower level?

In Rel, it's recursive, based on code from DTATRM, and implemented by the tclose Java method from Rel's relational engine:

<snip>

I thought you might. I was trying to work out how to do a TCLOSE in a way that performs well on large relation values.  AFAICT the DTATRM solution requires:

  • 3 passes through XY per iteration (2 for COMPOSE, 1 for UNION, ignoring the cost of new tuples)
  • 1 copy of TTT for each iteration except the last (similar to XY)
  • 1 pass each of XY and TTT for the final equality test.

That looks expensive. The relvar example could (but doesn't) do it in:

  • 1 pass through XY to set up the relvar and its index
  • 1 pass through XY per iteration.

For that difference in performance, I prefer to write code that is correct, readable and faster. I was hoping you might have already done that.

 

Andl - A New Database Language - andl.org
Quote from dandl on June 11, 2020, 12:46 am
Quote from Dave Voorhis on June 10, 2020, 3:59 pm
Quote from dandl on June 10, 2020, 1:52 pm
Quote from Dave Voorhis on June 10, 2020, 12:56 pm
Quote from dandl on June 10, 2020, 10:38 am

Reading TRANCLO in DTATRM I realised that this is tail recursion. Surely it's better to write it iteratively?

I don't know the syntax well enough, but as a C for-loop it would look something like this:

FOR (TTT := XY UNION ( ( XY RENAME ( Y AS Z ) ) COMPOSE (XY RENAME ( X AS Z ) ) ) ; TTT != XY; XY := TTT) {}
RETURN XY;

Rel user donated this, some time ago:

OPERATOR TCLOSE(r RELATION {x INT, y INT}) RETURNS SAME_TYPE_AS(r);
VAR result PRIVATE INIT(r) KEY {x, y};
VAR work PRIVATE INIT(r) KEY {x, y};
WHILE NOT IS_EMPTY(work);
work := ((r RENAME {y AS z}) COMPOSE (work RENAME {x AS z})) MINUS result;
INSERT result work;
END WHILE;
RETURN result;
END OPERATOR;
OPERATOR TCLOSE(r RELATION {x INT, y INT}) RETURNS SAME_TYPE_AS(r); VAR result PRIVATE INIT(r) KEY {x, y}; VAR work PRIVATE INIT(r) KEY {x, y}; WHILE NOT IS_EMPTY(work); work := ((r RENAME {y AS z}) COMPOSE (work RENAME {x AS z})) MINUS result; INSERT result work; END WHILE; RETURN result; END OPERATOR;
OPERATOR TCLOSE(r RELATION {x INT, y INT}) RETURNS SAME_TYPE_AS(r);
  VAR result PRIVATE INIT(r) KEY {x, y};
  VAR work PRIVATE INIT(r) KEY {x, y};
  WHILE NOT IS_EMPTY(work);
    work := ((r RENAME {y AS z}) COMPOSE (work RENAME {x AS z})) MINUS result;
    INSERT result work;
  END WHILE;
  RETURN result;
END OPERATOR;

(I hope he doesn't mind my pasting it here without attribution to preserve privacy, which I'm happy to add upon request.)

There was a time when every undergraduate computer science student would be asked to do exercises around deciding whether tail-call recursion should be left or converted to an iterative equivalent -- sometimes using transitive closure as an example -- and to defend which they thought better. Full marks were only available for those who took care to define "better", of course, with added marks to those who pointed out that better readability might not be compatible with better performance.

If the host language supports tail-call optimisation, you needn't dispense with recursion if it's otherwise acceptably performant and more readable.

Thanks, that's helpful. [Are you sure it's correct? Should the r RENAME be work RENAME? Does it make a difference? How would you test it?]

I'm not sure it's correct. I dimly recall trying it after I was sent it, but I don't recall whether I had to fix it or not.

It's straightforward to create test data to run it.

Really? If I'm right and that's a bug, it will only fail on data constructed in a particular way, and s far I can't figure out how. Perhaps you can help.

Years ago, I created multiple transitive closure examples and test data. I'll try to find them.

I've nothing against recursion, and often it makes for better code.

I'm going to have to deduct marks -- you haven't defined "better".

It probably isn't that sensitive to the particular definition, but FWIW mine is: (a) correct (b) readable so I can be sure of (a) and (c) satisfies other externally imposed constraints such as on time, within budget, performance, quality/coding standards. Occasionally (c) takes priority over (b).

A classic example of recursion tradeoffs which illustrate the challenge of "better" is a Fibonacci sequence generator. With recursion:

int fib(int n) {
    return 
       (n <= 1) ? n
                : fib(n - 1) + fib(n - 2);
}

Without:

int fib(int n) {
    if (n <= 1)
        return n;
    int fib = 1;
    int previous = 1;
    for (int i = 2; i < n; i++) {
        int tmp = fib;
        fib += previous;
        previous = tmp;
    }
    return fib;
}

The former is very readable but inefficient. The latter at least more efficient, but less readable (though when this illustration is given, there's always some wag who claims the latter is easier to read.)

Which is better?

However here it looks like it could run into problems with copying large relation values, which an iterative solution like this one will naturally avoid.

Out of curiosity, how is it implemented in Rel? Using this algorithm, or something at a lower level?

In Rel, it's recursive, based on code from DTATRM, and implemented by the tclose Java method from Rel's relational engine:

<snip>

I thought you might.

You thought I might what?

I was trying to work out how to do a TCLOSE in a way that performs well on large relation values.  AFAICT the DTATRM solution requires:

  • 3 passes through XY per iteration (2 for COMPOSE, 1 for UNION, ignoring the cost of new tuples)
  • 1 copy of TTT for each iteration except the last (similar to XY)
  • 1 pass each of XY and TTT for the final equality test.

That looks expensive. The relvar example could (but doesn't) do it in:

  • 1 pass through XY to set up the relvar and its index
  • 1 pass through XY per iteration.

For that difference in performance, I prefer to write code that is correct, readable and faster. I was hoping you might have already done that.

I did, at one point. I'll try to find the examples.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

Thanks, that's helpful. [Are you sure it's correct? Should the r RENAME be work RENAME? Does it make a difference? How would you test it?]

I'm not sure it's correct. I dimly recall trying it after I was sent it, but I don't recall whether I had to fix it or not.

It's straightforward to create test data to run it.

Really? If I'm right and that's a bug, it will only fail on data constructed in a particular way, and s far I can't figure out how. Perhaps you can help.

Years ago, I created multiple transitive closure examples and test data. I'll try to find them.

Thanks. But the question was: how to create test case(s) for this implementation.

I've nothing against recursion, and often it makes for better code.

I'm going to have to deduct marks -- you haven't defined "better".

It probably isn't that sensitive to the particular definition, but FWIW mine is: (a) correct (b) readable so I can be sure of (a) and (c) satisfies other externally imposed constraints such as on time, within budget, performance, quality/coding standards. Occasionally (c) takes priority over (b).

A classic example of recursion tradeoffs which illustrate the challenge of "better" is a Fibonacci sequence generator. With recursion:

<snip>

Here's my take:

var trip = Tuple.Create(0m, 1m, 1m);
  for (int i = 0; i < n; ++i) {
  trip = Tuple.Create(trip.Item2, trip.Item3, trip.Item2 + trip.Item3);
}
Console.WriteLine($"Fib term {n} is {trip.Item3}");

Output: Fib term 100 is 927372692193078999176

 

The former is very readable but inefficient. The latter at least more efficient, but less readable (though when this illustration is given, there's always some wag who claims the latter is easier to read.)

Which is better?

I prefer mine. It runs out of decimal precision at N=135, so in a sense it doesn't really matter, but the performance will always beat recursive.

However here it looks like it could run into problems with copying large relation values, which an iterative solution like this one will naturally avoid.

Out of curiosity, how is it implemented in Rel? Using this algorithm, or something at a lower level?

In Rel, it's recursive, based on code from DTATRM, and implemented by the tclose Java method from Rel's relational engine:

<snip>

I thought you might.

You thought I might what?

Have a higher performance implementation.

I was trying to work out how to do a TCLOSE in a way that performs well on large relation values.  AFAICT the DTATRM solution requires:

  • 3 passes through XY per iteration (2 for COMPOSE, 1 for UNION, ignoring the cost of new tuples)
  • 1 copy of TTT for each iteration except the last (similar to XY)
  • 1 pass each of XY and TTT for the final equality test.

That looks expensive. The relvar example could (but doesn't) do it in:

  • 1 pass through XY to set up the relvar and its index
  • 1 pass through XY per iteration.

For that difference in performance, I prefer to write code that is correct, readable and faster. I was hoping you might have already done that.

I did, at one point. I'll try to find the examples.

Thanks. And GTC?

Andl - A New Database Language - andl.org
Quote from dandl on June 11, 2020, 9:04 am

Thanks, that's helpful. [Are you sure it's correct? Should the r RENAME be work RENAME? Does it make a difference? How would you test it?]

I'm not sure it's correct. I dimly recall trying it after I was sent it, but I don't recall whether I had to fix it or not.

It's straightforward to create test data to run it.

Really? If I'm right and that's a bug, it will only fail on data constructed in a particular way, and s far I can't figure out how. Perhaps you can help.

Years ago, I created multiple transitive closure examples and test data. I'll try to find them.

Thanks. But the question was: how to create test case(s) for this implementation.

I've nothing against recursion, and often it makes for better code.

I'm going to have to deduct marks -- you haven't defined "better".

It probably isn't that sensitive to the particular definition, but FWIW mine is: (a) correct (b) readable so I can be sure of (a) and (c) satisfies other externally imposed constraints such as on time, within budget, performance, quality/coding standards. Occasionally (c) takes priority over (b).

A classic example of recursion tradeoffs which illustrate the challenge of "better" is a Fibonacci sequence generator. With recursion:

<snip>

Here's my take:

var trip = Tuple.Create(0m, 1m, 1m);
for (int i = 0; i < n; ++i) {
trip = Tuple.Create(trip.Item2, trip.Item3, trip.Item2 + trip.Item3);
}
Console.WriteLine($"Fib term {n} is {trip.Item3}");
Output: Fib term 100 is 927372692193078999176
var trip = Tuple.Create(0m, 1m, 1m); for (int i = 0; i < n; ++i) { trip = Tuple.Create(trip.Item2, trip.Item3, trip.Item2 + trip.Item3); } Console.WriteLine($"Fib term {n} is {trip.Item3}"); Output: Fib term 100 is 927372692193078999176
var trip = Tuple.Create(0m, 1m, 1m);
  for (int i = 0; i < n; ++i) {
  trip = Tuple.Create(trip.Item2, trip.Item3, trip.Item2 + trip.Item3);
}
Console.WriteLine($"Fib term {n} is {trip.Item3}");

Output: Fib term 100 is 927372692193078999176

 

The former is very readable but inefficient. The latter at least more efficient, but less readable (though when this illustration is given, there's always some wag who claims the latter is easier to read.)

Which is better?

I prefer mine. It runs out of decimal precision at N=135, so in a sense it doesn't really matter, but the performance will always beat recursive.

That's better performance. The recursive version has better readability.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org