The Forum for Discussion about The Third Manifesto and Related Matters

You need to log in to create posts and topics.

Anonymous operators revisited

12
Quote from dandl on March 29, 2020, 1:05 am
Quote from AntC on March 28, 2020, 1:58 pm
Quote from dandl on March 28, 2020, 1:10 pm

In passing:the RA and TTM/D implementations typically treat the argument of WHERE/EXTEND or equivalent much like a lambda, with the arguments being the attributes of the 'current tuple'.

Then what type to they infer for the lambda expression? Must have attribute foo at type CHAR; might have other attributes at types unknown. I guess they infer a distinct type at each WHERE/EXTEND/etc expression:

S WHERE S# = S#('S3')
SP WHERE S# = S#('S3')

Those two S# = S#('S3') are distinct lambda expressions, despite their superficial identicality.

Yes they are. But there would seem to be nothing wrong with:

S WHERE FX
LET FX := T => T.S# = S#('S3')

I don't think that's answering the point. What's T? I presume you mean it ranges over tuples; but what's the type of T? That is, if we presume FX can appear on RHS of a WHERE providing the LHS relation's type is 'anything with attribute S# at type S#, and possibly other attributes at type unknown'. In technical language, I'm looking for the 'most general unifier' for the type of T, and I need to infer it from the context of usage of T appearing in the declaration for FX. Put it another way, what's the signature for FX?

What I like about the Appendix A treatment is that it transforms WHERE to <AND>, and transforms the 'Open expression' to a relation literal (or an algorithmic relation, in HHT 1975 terms). OK a relation literal is first-class, a very familiar beast.

IMO HHT is closer than App-A. Those are not relation literals, those are relation functions. Something like

TIMES{A,B}) => { A,B,C := A*B }

Hugh has remarked: those are still relations, whatever their internal structure. I guess "are" is vague, perhaps: evaluate to relations.

And what I like about Tropashko's treatment on top of Appendix A is that it transforms all RA operators to take only relations as operands and return only relations as result. No more of those weird subscripty-things in angle brackets that are in Codd's RA. And we can in effect treat attribute-comma-lists as first-class, manipulate them as sets of attributes, etc.

That would certainly be my preference. I wasn't aware that his was a completed work.

His set of operators is completed work. Or to be precise there's various choices of sets of operators; which sets are each expressively equivalent. So we can regard QBQL as complete, with various ways to express algorithmic relations á la HHT.

The bit that's incomplete, and AFAICT never completable is an axiomatisation of those operators into FOL such that we can prove the equivalence of expressions using them.

Quote from Dave Voorhis on March 28, 2020, 2:33 pm
Quote from AntC on March 28, 2020, 1:58 pm
Quote from dandl on March 28, 2020, 1:10 pm

In passing:the RA and TTM/D implementations typically treat the argument of WHERE/EXTEND or equivalent much like a lambda, with the arguments being the attributes of the 'current tuple'.

Then what type to they infer for the lambda expression? Must have attribute foo at type CHAR; might have other attributes at types unknown. I guess they infer a distinct type at each WHERE/EXTEND/etc expression:

S WHERE S# = S#('S3')
SP WHERE S# = S#('S3')

Those two S# = S#('S3') are distinct lambda expressions, despite their superficial identicality.

What I like about the Appendix A treatment is that it transforms WHERE to <AND>, and transforms the 'Open expression' to a relation literal (or an algorithmic relation, in HHT 1975 terms). OK a relation literal is first-class, a very familiar beast.

And what I like about Tropashko's treatment on top of Appendix A is that it transforms all RA operators to take only relations as operands and return only relations as result. No more of those weird subscripty-things in angle brackets that are in Codd's RA. And we can in effect treat attribute-comma-lists as first-class, manipulate them as sets of attributes, etc.

At the expense of giving up a certain degree of static type safety.

Tradeoffs, as usual.

How so/what are we giving up?

S3 := S WHERE S# = S#('S3');

===> S3 := S <AND> REL{TUP{S# S#('S3')}};

S4 := S WHERE S# = S#('S4') OR CITY = 'Ely';

===> (S <AND> REL{TUP{S# S#('S4')}}) InnerUnion (S <AND> REL{TUP{CITY 'Ely'}});

S5 := S {S#, SNAME};
===> S5 := S InnerUnion (REL{S# S#, SNAME CHAR} {});

S6 := S {ALL BUT CITY};
===> S6 := S Remove (REL{CITY CHAR} {});

There's no extra challenges to static type inference: the type (heading) of <AND>'s result is the union of headings of the operands; the type (heading) of InnerUnion's result is the intersection of headings of the operands. We need an operator per Appendix A's <REMOVE> , repurposed to take a relation as r.h. operand, the type (heading) of its result is the heading of l.h. operand set-minus the heading of the r.h. operand.

In particular, we don't need the hand-wavy funny business in Appendix A that fails to explain how "[<REMOVE>] is equivalent to projecting the relation in question over all of its attributes except the one specified." or "a "macro" operator called <COMPOSE>."

By "no extra challenges" I'm not saying there's no challenges; but the challenges are the the same in nature as type inference is already facing for JOIN aka <AND>.

Quote from AntC on March 29, 2020, 3:20 am
Quote from dandl on March 29, 2020, 1:05 am
Quote from AntC on March 28, 2020, 1:58 pm
Quote from dandl on March 28, 2020, 1:10 pm

In passing:the RA and TTM/D implementations typically treat the argument of WHERE/EXTEND or equivalent much like a lambda, with the arguments being the attributes of the 'current tuple'.

Then what type to they infer for the lambda expression? Must have attribute foo at type CHAR; might have other attributes at types unknown. I guess they infer a distinct type at each WHERE/EXTEND/etc expression:

S WHERE S# = S#('S3')
SP WHERE S# = S#('S3')

Those two S# = S#('S3') are distinct lambda expressions, despite their superficial identicality.

Yes they are. But there would seem to be nothing wrong with:

S WHERE FX
LET FX := T => T.S# = S#('S3')

I don't think that's answering the point. What's T? I presume you mean it ranges over tuples; but what's the type of T? That is, if we presume FX can appear on RHS of a WHERE providing the LHS relation's type is 'anything with attribute S# at type S#, and possibly other attributes at type unknown'. In technical language, I'm looking for the 'most general unifier' for the type of T, and I need to infer it from the context of usage of T appearing in the declaration for FX. Put it another way, what's the signature for FX?

T is the single argument for the lambda FX. Its type is (as per TTM) precisely TUPLE H, where H is the heading of S. I have introduced dot notation, but in TD syntax it would be S# FROM T.

No, I did not intend this as generic, but that would be up to the compiler. I did expect the compiler to infer the type signature, which is TUPLE H RETURNING BOOLEAN.

If this is taken as generic, then the compiler might auto-generate a type-specific instance for each time FX is used. Nothing devious intended.

What I like about the Appendix A treatment is that it transforms WHERE to <AND>, and transforms the 'Open expression' to a relation literal (or an algorithmic relation, in HHT 1975 terms). OK a relation literal is first-class, a very familiar beast.

IMO HHT is closer than App-A. Those are not relation literals, those are relation functions. Something like

TIMES{A,B}) => { A,B,C := A*B }

Hugh has remarked: those are still relations, whatever their internal structure. I guess "are" is vague, perhaps: evaluate to relations.

The Very Important Point here is that there are two distinct kinds of relation:

  • Fact relations contain tuples, each of which is a fact asserted in accordance with a business-oriented predicate
  • Function relations implement a function, which takes relations as inputs/arguments and produces relations as outputs/return values.

S, SP and P are fact relations. Join, Project and TIMES are function relations.

Despite what I wrote above, I prefer to see Selection (WHERE) as equivalent to a Semijoin and a Function relation implementing a Boolean test. WHERE is a shortcut for the two steps: function and join.

I like it because it's 'relations all the way down' and it avoids messy issues of nested scope, but there are still plenty of details to sort out. Like how to avoid wall-to-wall renames.

And what I like about Tropashko's treatment on top of Appendix A is that it transforms all RA operators to take only relations as operands and return only relations as result. No more of those weird subscripty-things in angle brackets that are in Codd's RA. And we can in effect treat attribute-comma-lists as first-class, manipulate them as sets of attributes, etc.

That would certainly be my preference. I wasn't aware that his was a completed work.

His set of operators is completed work. Or to be precise there's various choices of sets of operators; which sets are each expressively equivalent. So we can regard QBQL as complete, with various ways to express algorithmic relations á la HHT.

The bit that's incomplete, and AFAICT never completable is an axiomatisation of those operators into FOL such that we can prove the equivalence of expressions using them.

I must have another look.

Are you saying that you expect function relations to be expressible in FOL? Or what?

Andl - A New Database Language - andl.org
Quote from dandl on March 29, 2020, 5:11 am
Quote from AntC on March 29, 2020, 3:20 am
Quote from dandl on March 29, 2020, 1:05 am
Quote from AntC on March 28, 2020, 1:58 pm
Quote from dandl on March 28, 2020, 1:10 pm

In passing:the RA and TTM/D implementations typically treat the argument of WHERE/EXTEND or equivalent much like a lambda, with the arguments being the attributes of the 'current tuple'.

Then what type to they infer for the lambda expression? Must have attribute foo at type CHAR; might have other attributes at types unknown. I guess they infer a distinct type at each WHERE/EXTEND/etc expression:

S WHERE S# = S#('S3')
SP WHERE S# = S#('S3')

Those two S# = S#('S3') are distinct lambda expressions, despite their superficial identicality.

Yes they are. But there would seem to be nothing wrong with:

S WHERE FX
LET FX := T => T.S# = S#('S3')

I don't think that's answering the point. What's T? I presume you mean it ranges over tuples; but what's the type of T? That is, if we presume FX can appear on RHS of a WHERE providing the LHS relation's type is 'anything with attribute S# at type S#, and possibly other attributes at type unknown'. In technical language, I'm looking for the 'most general unifier' for the type of T, and I need to infer it from the context of usage of T appearing in the declaration for FX. Put it another way, what's the signature for FX?

T is the single argument for the lambda FX. Its type is (as per TTM) precisely TUPLE H, where H is the heading of S.

No that won't work. In the declaration/assignment to FX there's no mention of S. What's the most general unifier for T within FX? You can say that in S WHERE FX, the mgu is combined with the heading for S to specialise T's heading to same as S. Whatever you're inferring for the bare decl FX must also allow for SP WHERE FX, in which T's heading gets specialised to same as SP. This is standard polymorphic type inference.

I have introduced dot notation, but in TD syntax it would be S# FROM T.

No, I did not intend this as generic, but that would be up to the compiler.

You've separated the decl/assignment to FX from the use of FX. So it must be generic.

I did expect the compiler to infer the type signature, which is TUPLE H RETURNING BOOLEAN.

No not adequate; the type must express that usage of FX is only type-safe where T includes attribute S# at type S#. (I mean statically type-safe. Of course if you're interpreting rather than compiling with dynamic typing as per SQL, you can get away with it. Then the decl for FX is more of "a "macro" operator" [per Appendix A] that gets expanded at interpretation time.)

If this is taken as generic, then the compiler might auto-generate a type-specific instance for each time FX is used. Nothing devious intended.

What I like about the Appendix A treatment is that it transforms WHERE to <AND>, and transforms the 'Open expression' to a relation literal (or an algorithmic relation, in HHT 1975 terms). OK a relation literal is first-class, a very familiar beast.

IMO HHT is closer than App-A. Those are not relation literals, those are relation functions. Something like

TIMES{A,B}) => { A,B,C := A*B }

Hugh has remarked: those are still relations, whatever their internal structure. I guess "are" is vague, perhaps: evaluate to relations.

The Very Important Point here is that there are two distinct kinds of relation:

  • Fact relations contain tuples, each of which is a fact asserted in accordance with a business-oriented predicate
  • Function relations implement a function, which takes relations as inputs/arguments and produces relations as outputs/return values.

If you're following HHT, please use their terminology: 'stored' vs 'algorithmic' relations [in the Abstract, and top of p3. in Hugh's annotated version]. All relations assert facts (under a "business-oriented predicate"); any relation with a Functional Dependency "implements a function".

S, SP and P are fact relations. Join, Project and TIMES are function relations.

TIMES is algorithmic, yes. If you're going to represent Join, Project as algorithmic relations, what are their headings, and how do you 'apply' them to operands in the way that Appendix A applies PLUS? Answer: Appendix A applies PLUS using Join (also it needs RENAME if the other operand doesn't include attributes X, Y -- which is a bit that Appendix A skirts round). So you're going to 'apply' Join using Join? That looks like an infinite regress.

Despite what I wrote above, I prefer to see Selection (WHERE) as equivalent to a Semijoin and a Function relation implementing a Boolean test. WHERE is a shortcut for the two steps: function and join.

I like it because it's 'relations all the way down' and it avoids messy issues of nested scope, but there are still plenty of details to sort out. Like how to avoid wall-to-wall renames.

I expected you to say a rename {X := A, Y := B} is a function/algorithmic relation; that gets COMPOSED rather than Joined. But that would lead to another infinite regress.

And what I like about Tropashko's treatment on top of Appendix A is that it transforms all RA operators to take only relations as operands and return only relations as result. No more of those weird subscripty-things in angle brackets that are in Codd's RA. And we can in effect treat attribute-comma-lists as first-class, manipulate them as sets of attributes, etc.

That would certainly be my preference. I wasn't aware that his was a completed work.

His set of operators is completed work. Or to be precise there's various choices of sets of operators; which sets are each expressively equivalent. So we can regard QBQL as complete, with various ways to express algorithmic relations á la HHT.

The bit that's incomplete, and AFAICT never completable is an axiomatisation of those operators into FOL such that we can prove the equivalence of expressions using them.

I must have another look.

There's two really unconnected parts to Tropashko's work. (I expect Vadim would disagree with my "unconnected"; IMO Vadim himself doesn't understand how unconnected they are, which is why it's hard work to figure it out from his writings.)

QBQL is an implementation of an executable algebra/set of operators -- much as there might be an implementation of Appendix A operators. It also includes an implementation of (some) algorithmic relations, for simple comparisons and arithmetic. (And I believe you can extend those algorithmic relations by supplying compiled code.)

Are you saying that you expect function relations to be expressible in FOL? Or what?

The unconnected part of Tropashko is FOL reasoning over the lattice structure induced from expressions using those operators. You can type those expressions into the QBQL command prompt; you can mention in the expressions relvars (to get their headings) that you've declared on the operator-implementation side.

What you can't do is any FOL reasoning about the 'domain-specific' characteristics of the mentioned relations, because the FOL knows nothing about comparisons or arithmetic. The FOL doesn't even know about attribute names, so you can't express that a Functional Dependency holds for a relation [**] -- whether algorithmic or stored. So you can mention an algorithmic relation in the FOL, you can't express that it is algorithmic.

[**] I have a 'gut feel' (no more than that) that you can somewhat express in FOL what it is for a FD to hold. But it would rely on formidable machinery, needing nested Existential quantification and negation s.t. inference becomes intractable. Or rather I should say: providing extra ways in which the already-intractable inference becomes even less tractable.

Then what type to they infer for the lambda expression? Must have attribute foo at type CHAR; might have other attributes at types unknown. I guess they infer a distinct type at each WHERE/EXTEND/etc expression:

S WHERE S# = S#('S3')
SP WHERE S# = S#('S3')

Those two S# = S#('S3') are distinct lambda expressions, despite their superficial identicality.

Yes they are. But there would seem to be nothing wrong with:

S WHERE FX
LET FX := T => T.S# = S#('S3')

I don't think that's answering the point. What's T? I presume you mean it ranges over tuples; but what's the type of T? That is, if we presume FX can appear on RHS of a WHERE providing the LHS relation's type is 'anything with attribute S# at type S#, and possibly other attributes at type unknown'. In technical language, I'm looking for the 'most general unifier' for the type of T, and I need to infer it from the context of usage of T appearing in the declaration for FX. Put it another way, what's the signature for FX?

T is the single argument for the lambda FX. Its type is (as per TTM) precisely TUPLE H, where H is the heading of S.

No that won't work. In the declaration/assignment to FX there's no mention of S. What's the most general unifier for T within FX? You can say that in S WHERE FX, the mgu is combined with the heading for S to specialise T's heading to same as S. Whatever you're inferring for the bare decl FX must also allow for SP WHERE FX, in which T's heading gets specialised to same as SP. This is standard polymorphic type inference.

I'm assuming the compiler treats FX as the declaration of a function with signature TUPLE H RETURNING BOOLEAN, and then checks the definition of FX against that declaration. But if you do this:

S WHERE FX
SP WHERE FX
LET FX := T => T.S# = S#('S3')

Now you get a compile error on the second line, because FX is already declared (but not yet defined). In the generic case the compiler allows the multiple declarations, and resolves the definition by instantiating two copies of the function, one for each declared type signature.

What I like about the Appendix A treatment is that it transforms WHERE to <AND>, and transforms the 'Open expression' to a relation literal (or an algorithmic relation, in HHT 1975 terms). OK a relation literal is first-class, a very familiar beast.

IMO HHT is closer than App-A. Those are not relation literals, those are relation functions. Something like

TIMES{A,B}) => { A,B,C := A*B }

Hugh has remarked: those are still relations, whatever their internal structure. I guess "are" is vague, perhaps: evaluate to relations.

The Very Important Point here is that there are two distinct kinds of relation:

  • Fact relations contain tuples, each of which is a fact asserted in accordance with a business-oriented predicate
  • Function relations implement a function, which takes relations as inputs/arguments and produces relations as outputs/return values.

If you're following HHT, please use their terminology: 'stored' vs 'algorithmic' relations [in the Abstract, and top of p3. in Hugh's annotated version]. All relations assert facts (under a "business-oriented predicate"); any relation with a Functional Dependency "implements a function".

The choice of wording is not accidental. A good part of HHT attempts to generalise across a spectrum of relations using this concept of 'effectiveness', and they go into lots of places I'm not interested in following. I'm just arguing against App-A and its constant literal and possibly infinite relations, which I think are a mistake. It's a function, because it has arguments and a return value. But algorithmic would work too.

What I say is: TIMES is a function relation. Every function relation is invoked in the same way: like RENAME and projection, you have to provide attribute names for its arguments and return value. If we invoke TIMES with suitably named attributes we can use it to calculate the total weight of any given shipment in SP. Something like:

(SP JOIN P{P#,WEIGHT})JOIN TIMES{QTY,WEIGHT,TOTWT} => {S#,P#,QTY,WEIGHT,TOTWT} 

S, SP and P are fact relations. Join, Project and TIMES are function relations.

TIMES is algorithmic, yes. If you're going to represent Join, Project as algorithmic relations, what are their headings, and how do you 'apply' them to operands in the way that Appendix A applies PLUS? Answer: Appendix A applies PLUS using Join (also it needs RENAME if the other operand doesn't include attributes X, Y -- which is a bit that Appendix A skirts round). So you're going to 'apply' Join using Join? That looks like an infinite regress.

No, my intention is simply to point out that Join is a relational function, but I agree, it doesn't really lead anywhere useful.

Despite what I wrote above, I prefer to see Selection (WHERE) as equivalent to a Semijoin and a Function relation implementing a Boolean test. WHERE is a shortcut for the two steps: function and join.

I like it because it's 'relations all the way down' and it avoids messy issues of nested scope, but there are still plenty of details to sort out. Like how to avoid wall-to-wall renames.

I expected you to say a rename {X := A, Y := B} is a function/algorithmic relation; that gets COMPOSED rather than Joined. But that would lead to another infinite regress.

And what I like about Tropashko's treatment on top of Appendix A is that it transforms all RA operators to take only relations as operands and return only relations as result. No more of those weird subscripty-things in angle brackets that are in Codd's RA. And we can in effect treat attribute-comma-lists as first-class, manipulate them as sets of attributes, etc.

That would certainly be my preference. I wasn't aware that his was a completed work.

His set of operators is completed work. Or to be precise there's various choices of sets of operators; which sets are each expressively equivalent. So we can regard QBQL as complete, with various ways to express algorithmic relations á la HHT.

The bit that's incomplete, and AFAICT never completable is an axiomatisation of those operators into FOL such that we can prove the equivalence of expressions using them.

I must have another look.

There's two really unconnected parts to Tropashko's work. (I expect Vadim would disagree with my "unconnected"; IMO Vadim himself doesn't understand how unconnected they are, which is why it's hard work to figure it out from his writings.)

QBQL is an implementation of an executable algebra/set of operators -- much as there might be an implementation of Appendix A operators. It also includes an implementation of (some) algorithmic relations, for simple comparisons and arithmetic. (And I believe you can extend those algorithmic relations by supplying compiled code.)

As should be. That's the bit of interest.

Are you saying that you expect function relations to be expressible in FOL? Or what?

The unconnected part of Tropashko is FOL reasoning over the lattice structure induced from expressions using those operators. You can type those expressions into the QBQL command prompt; you can mention in the expressions relvars (to get their headings) that you've declared on the operator-implementation side.

What you can't do is any FOL reasoning about the 'domain-specific' characteristics of the mentioned relations, because the FOL knows nothing about comparisons or arithmetic. The FOL doesn't even know about attribute names, so you can't express that a Functional Dependency holds for a relation [**] -- whether algorithmic or stored. So you can mention an algorithmic relation in the FOL, you can't express that it is algorithmic.

[**] I have a 'gut feel' (no more than that) that you can somewhat express in FOL what it is for a FD to hold. But it would rely on formidable machinery, needing nested Existential quantification and negation s.t. inference becomes intractable. Or rather I should say: providing extra ways in which the already-intractable inference becomes even less tractable.

OK, I get it. Not really my thing.

 

Andl - A New Database Language - andl.org
12