The Forum for Discussion about The Third Manifesto and Related Matters

Help sought to fix RM Pre 21 (multiple assignment)

Quote from Hugh on November 27, 2019, 11:59 am
Quote from Dave Voorhis on November 26, 2019, 5:06 pm
Quote from Hugh on November 26, 2019, 3:43 pm
Quote from Dave Voorhis on November 25, 2019, 3:09 pm
Quote from Hugh on November 25, 2019, 2:33 pm

Also, we believe very strongly that assignment to a possrep component is merely shorthand (as in OO languages) and we think that is the right way to express the semantics.

The TTM approach -- which defines assignment to a possrep component as shorthand for invoking a selector -- has no equivalent in typical object-oriented languages.

In typical object-oriented languages, the closest equivalent to assignment to a possrep component is either invocation of a method that changes instance state (such a method is often called a 'setter', because it sets the value of an instance's member variable) or direct assignment to an instance's member variable. I'm not aware of any object oriented language in which assignment to a member variable is shorthand for invoking a constructor (which is the nearest equivalent to a selector), though an object oriented language could conceivably be made that would do that.

Or did you mean something different by "as in OO languages"?

Well, for what it's worth, the SQL standard used a similar definition in the 1998 edition and I'm not aware of that having changed since.

Let P be a POINT object in some OO language, such that P.X := 3.0 results in P's X value becoming 3.0 while its Y value remains unchanged.  Couldn't that be short for P := POINT(3.0, P.Y)?  I understand that the RHS there is an invocation of a constructor, but does that make a material difference?  (Please excuse my ignorance.)

Hugh

It depends on how "material difference" is defined.

Given, say, variables P and Q which are references to distinct instances of POINT(2.0, 4.0), executing P.X := 3.0 and Q := new POINT(3.0, Q.Y) will result in P.X and Q.X both being equal to 3.0, and P.Y and Q.Y both remaining equal to 4.0.

From that point of view, they are not materially different.

However, after the assignments, P references the same instance it did before but Q references a different instance. Memory was allocated for Q's new instance of POINT, but no new memory was allocated for P.X := 3.0. Furthermore, the accessible methods of a class instance provide an interface to what might be encapsulated (and possibly hidden) additional functionality. For example, POINT might internally maintain a historical log of every assigned X or Y value (the history presumably being accessible via some POINT method). In that case, P might reference an instance with a lengthy history of assignments to X and Y.  Being newly constructed, the instance referenced by Q would contain no such history.

From that point of view, they are materially different.
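To make the two points of view concrete, here is a minimal Python sketch. The MutablePoint class and its history list are hypothetical, invented purely for illustration, and only assignments to x are logged to keep it short.

```python
class MutablePoint:
    """Hypothetical OO-style point that privately logs assignments to x."""

    def __init__(self, x, y):
        self._x = x
        self.y = y
        self.history = [x]  # encapsulated state; survives in-place mutation

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value
        self.history.append(value)


p = MutablePoint(2.0, 4.0)
q = MutablePoint(2.0, 4.0)
p_before, q_before = p, q

p.x = 3.0                    # setter: same instance, history grows
q = MutablePoint(3.0, q.y)   # constructor: new instance, fresh history

same_values = (p.x, p.y) == (q.x, q.y)   # observably equal coordinates
p_same_instance = p is p_before          # P still references its old instance
q_same_instance = q is q_before          # Q now references a different one
```

The coordinates agree, but instance identity and the hidden history do not, which is the "material difference" at issue.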

Thank you, Dave.  Twenty years ago I sort of knew all that but I suppose I haven't retained it because I never actually used such languages.

So OO assignment to component isn't equivalent to the proposed longhand, and that's because of an ingredient that TTM abjures (pointers).

Hugh

Yes, and not just TTM's avoidance of pointers... A hypothetical object-oriented language could, for example, have mutable instances without explicit pointers or references. You could, for instance, have P.X := 3.0 mutate the X variable component of an instance in variable P. That would be closer to TTM semantics -- in which P.X := 3.0 selects a new value for P -- but still quite different. Being able to mutate instance variables allows preservation of private state (like a history of assignments to X and Y, per my previous example) that in TTM would have to either be thrown away or explicitly exposed and passed to value selectors via parameters on each assignment.

There are pros and cons to both approaches.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Hugh on November 27, 2019, 12:13 pm
Quote from dandl on November 26, 2019, 10:54 pm
Quote from Hugh on November 26, 2019, 3:55 pm
Quote from dandl on November 26, 2019, 12:44 am

E' := ELLIPSE(THE_A(E), THE_A(E) + 1.0), E'' := ELLIPSE(THE_A(E') + 2.0, THE_B(E'));

The expression coloured red violates the type constraint for ELLIPSE, so if E' is of type ELLIPSE, then it is impossible for it to acquire or stand for the (non-existent) value of that expression.

This is merely an arrangement of symbols, part of the expansion of shorthands in step a. It will never be executed, so it cannot fail a type constraint.

Thank you.  So E' and E'' stand for text to be regarded as TD syntax.  That's not the intent of RM Pre 21.  TTM doesn't prescribe syntax.  Saying that something in some D is equivalent to something expressed in TD doesn't require that D to be TD.

There's no particular intention to prescribe any syntax: as per your frequent usage TD syntax is a convenient means of expression. The intent is the same as in the TD spec:

for some Ci, and no two distinct <possrep component assign>s specify the same target Ci. Then the original <scalar update> is equivalent to the <scalar assign>
ST := PR ( X1 , X2 , ... , Xn )
(PR here is the selector operator corresponding to the possrep with the same name.)

Assignments to separate components of a scalar variable can always be collected together into a single selector operator and single assignment to the variable. Sorry, but it seems trivially obvious to me. I'm really puzzled why you didn't quote this in your original question.
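A rough Python sketch of that collecting step, under stated assumptions: the Ellipse class, its a >= b > 0 constraint and the scalar_update helper are all hypothetical stand-ins, not anything taken from the TD spec.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Ellipse:
    a: float
    b: float

    def __post_init__(self):
        # Assumed type constraint: major semiaxis a is not less than minor b.
        if not (self.a >= self.b > 0):
            raise ValueError("ELLIPSE type constraint violated")


def scalar_update(current, assigns):
    """Collect possrep component assigns into ONE selector invocation."""
    targets = [name for name, _ in assigns]
    if len(set(targets)) != len(targets):
        raise ValueError("two component assigns specify the same target")
    components = {f.name: getattr(current, f.name) for f in fields(current)}
    components.update(dict(assigns))
    return type(current)(**components)  # ST := PR(X1, ..., Xn)


e = Ellipse(2.0, 1.0)
# THE_B(E) := THE_A(E) + 1.0 , THE_A(E) := THE_A(E) + 2.0
merged = scalar_update(e, [("b", e.a + 1.0), ("a", e.a + 2.0)])

# Applying the first assign on its own would need ELLIPSE(2.0, 3.0).
try:
    scalar_update(e, [("b", e.a + 1.0)])
    one_at_a_time_ok = True
except ValueError:
    one_at_a_time_ok = False
```

Here both right-hand sides are evaluated against the original value of E, so the invalid intermediate ellipse is never selected.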

Andl - A New Database Language - andl.org
Quote from dandl on November 27, 2019, 10:32 pm
Quote from Hugh on November 27, 2019, 12:13 pm
Quote from dandl on November 26, 2019, 10:54 pm
Quote from Hugh on November 26, 2019, 3:55 pm
Quote from dandl on November 26, 2019, 12:44 am

E' := ELLIPSE(THE_A(E), THE_A(E) + 1.0), E'' := ELLIPSE(THE_A(E') + 2.0, THE_B(E'));

The expression coloured red violates the type constraint for ELLIPSE , so if E' is of type ELLIPSE, then it is impossible for it to acquire or stand for the (non-existent) value of that expression.

This is merely an arrangement of symbols, part of the expansion of shorthands in step a. It will never be executed, so it cannot fail a type constraint.

Thank you.  So E' and E'' stand for text to be regarded as TD syntax.  That's not the intent of RM Pre 21.  TTM doesn't prescribe syntax.  Saying that something in some D is equivalent to something expressed in TD doesn't require that D to be TD.

There's no particular intention to prescribe any syntax: as per your frequent usage TD syntax is a convenient means of expression. The intent is the same as in the TD spec:

Hmm ref the side-discussion on the syntax THE_A(E), THE_B(E), ... := THE_A(E)+2.0, THE_A(E)+1.0, ... with Erwin's point to do with updating relvars:

  • There's a need (at least for relvars) to include multiple updates to the same relvar within the same MA;
  • Some have a preference to use INSERT/UPDATE/DELETE syntax between the commas of a Tutorial D/RM Pre 21 MA -- which won't fit with that balanced simultaneous update syntax.
  • RM Pre 21 is aiming to make a general approach to cover assignment/update to both 'top-level' vars (incl relvars base and virtual) and pseudo-var THE_x.

for some Ci, and no two distinct <possrep component assign>s specify the same target Ci. Then the original <scalar update> is equivalent to the <scalar assign>
ST := PR ( X1 , X2 , ... , Xn )
(PR here is the selector operator corresponding to the possrep with the same name.)

Assignments to separate components of a scalar variable can always be collected together into a single selector operator and single assignment to the variable.

I think not in general:

  • What if there's more than one assignment to the same component?
  • What if there's an assignment via a component of another PossRep whose effect updates more than one component in this PossRep, and then there's an assignment via one of this PossRep's components. For example THE_THETA(myPoint) := PI, THE_X(myPoint) := THE_X(myPoint) + 1.0;
  • (This is something like update via VIRTUAL affecting a BASE relvar that is also target of an update in the same MA.)
  • Note also that an update to a relvar (multiple tuples) could be updating pseudo-variable(s) of a scalar attribute.
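As a sketch of the two-PossRep case, assuming a hypothetical POINT with Cartesian (X, Y) and polar (R, THETA) possreps modelled as plain Python tuples, applying the two component updates one after the other gives a different point depending on which goes first:

```python
import math


def set_theta(point, theta):
    """THE_THETA(p) := theta, keeping the polar radius R unchanged."""
    x, y = point
    r = math.hypot(x, y)
    return (r * math.cos(theta), r * math.sin(theta))


def set_x(point, x):
    """THE_X(p) := x, keeping the Cartesian Y unchanged."""
    return (x, point[1])


start = (3.0, 4.0)

# THETA first, then X := X + 1.0 (X read from the intermediate point)
p1 = set_theta(start, math.pi)
p1 = set_x(p1, p1[0] + 1.0)

# X := X + 1.0 first, then THETA
p2 = set_x(start, start[0] + 1.0)
p2 = set_theta(p2, math.pi)

orders_agree = math.isclose(p1[0], p2[0]) and math.isclose(p1[1], p2[1], abs_tol=1e-9)
```

Neither result is obtained by dropping the two right-hand sides into a single selector of either possrep, since each update also disturbs a component of the other possrep.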

Sorry, but it seems trivially obvious to me. I'm really puzzled why you didn't quote this in your original question.

Sorry but it seems horribly complicated to me, and the ramifications in general very far from obvious.

Way back in the thread (relatively speaking) Dave started saying he has severe doubts about MAs with update via pseudo-variables. I'm seeing no reason to disagree with him. (Now that I understand the intent behind RM Pre 21 rather better.)

I continue to think we need a clear statement of the Prescription/principles that RM Pre 21 just does not give. And perhaps give discretion to an implementation to say that some forms of MA are just too hard.

Quote from AntC on November 28, 2019, 2:23 am

Hmm ref the side-discussion on the syntax THE_A(E), THE_B(E), ... := THE_A(E)+2.0, THE_A(E)+1.0, ... with Erwin's point to do with updating relvars:

  • There's a need (at least for relvars) to include multiple updates to the same relvar within the same MA;

There is no need to include multiple assignments. Updates in the form of shorthands such as INSERT/UPDATE/DELETE have to be expanded into assignments in step a, and then merged into a single assignment in step b.
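A loose Python sketch of that expand-then-merge idea, assuming a relvar value is modelled as a frozenset of tuples and that INSERT and DELETE expand to union and difference style assignments. The helper names are hypothetical.

```python
from functools import reduce


def insert(rows):
    """INSERT R rows, expanded to: R := R UNION rows."""
    return lambda r: r | frozenset(rows)


def delete(pred):
    """DELETE R WHERE pred, expanded to: R := R WHERE NOT pred."""
    return lambda r: frozenset(t for t in r if not pred(t))


def merge(*steps):
    """Fold several assignments to one target into a single expression;
    each later step sees the result of the earlier ones."""
    return lambda r: reduce(lambda acc, step: step(acc), steps, r)


R = frozenset({("S1", 10), ("S2", 20)})

# INSERT R {("S3", 30)} , DELETE R WHERE qty < 15 ;
combined = merge(insert({("S3", 30)}), delete(lambda t: t[1] < 15))
new_R = combined(R)  # the one assignment actually made to R
```

Only the final value is ever assigned to R, so any constraint checking would see new_R and not the intermediate result.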

  • Some have a preference to use INSERT/UPDATE/DELETE syntax between the commas of a Tutorial D/RM Pre 21 MA -- which won't fit with that balanced simultaneous update syntax.
  • RM Pre 21 is aiming to make a general approach to cover assignment/update to both 'top-level' vars (incl relvars base and virtual) and pseudo-var THE_x.

Yes, but also including assignments to variables of other types. But this is ultimately about assignments to variables, as step d makes clear.

for some Ci, and no two distinct <possrep component assign>s specify the same target Ci. Then the original <scalar update> is equivalent to the <scalar assign>
ST := PR ( X1 , X2 , ... , Xn )
(PR here is the selector operator corresponding to the possrep with the same name.)

Assignments to separate components of a scalar variable can always be collected together into a single selector operator and single assignment to the variable.

I think not in general:

  • What if there's more than one assignment to the same component?

Makes no difference, they can always be rolled up into a single assignment as per the TD spec p16. If you say otherwise, please provide an example.

  • What if there's an assignment via a component of another PossRep whose effect updates more than one component in this PossRep, and then there's an assignment via one of this PossRep's components. For example THE_THETA(myPoint) := PI, THE_X(myPoint) := THE_X(myPoint) + 1.0;

I won't comment or speculate on the use of multiple PossReps, or an MA that spans separate PossReps. These start to look a lot like side-effects, and probably should be banned.

  • (This is something like update via VIRTUAL affecting a BASE relvar that is also target of an update in the same MA.)
  • Note also that an update to a relvar (multiple tuples) could be updating pseudo-variable(s) of a scalar attribute.

Sorry, but it seems trivially obvious to me. I'm really puzzled why you didn't quote this in your original question.

Sorry but it seems horribly complicated to me, and the ramifications in general very far from obvious.

Way back in the thread (relatively speaking) Dave started saying he has severe doubts about MAs with update via pseudo-variables. I'm seeing no reason to disagree with him. (Now that I understand the intent behind RM Pre 21 rather better.)

I continue to think we need a clear statement of the Prescription/principles that RM Pre 21 just does not give. And perhaps give discretion to an implementation to say that some forms of MA are just too hard.

I am yet to see an MA that is 'too hard' (caveat multi PossReps). Please provide one.

I agree that the form of RM Pre 21 is very difficult, and I would certainly approach it differently. However, so far I am unable to find any flaw. In a language that has assignment to pseudo-variables it needs clarification, and TD spec p16 seeks to do that, again not quite the way I would do it, but well enough.

Andl - A New Database Language - andl.org
Quote from Erwin on November 25, 2019, 7:17 pm
Quote from Hugh on November 25, 2019, 2:33 pm
Quote from Erwin on November 24, 2019, 7:30 pm
Quote from Hugh on November 24, 2019, 3:23 pm

Having read the responses as far as AntC's, I now realise that Chris's description (to me) of the "broad idea" is not quite what we really had in mind.

Take the case where the MA contains two or more assigns to possrep components of the same variable.  Then these need to be collected together in advance of RM Pre 21a and the values of their RHSes collected into an ordered n-tuple whose components then become arguments to the selector invocation that is assigned to the parent variable.

But perhaps we should just go with Dave Voorhis and Erwin Smout and leave RM Pre 21 as-is.  I agree that it's not really flawed, but it does mean that an implementation that gets around the problem then becomes non-conforming (strictly speaking).

Hugh

Why "non-conforming" ?  E.g. what would make the process/procedure suggested by Antc (and me and no doubt Dandl even though he wasn't anywhere near as explicit as he should have been) be "non-conforming" ?  Because it won't throw the exception that your (plural) perception of the spec says should be thrown ?  That's odd because if you want the exception to be part of the spec then of course you can't come complaining it poses a problem ...

Yes, because it won't throw the exception that the spec says should be thrown, and we don't want that.  Also, we believe very strongly that assignment to a possrep component is merely shorthand (as in OO languages) and we think that is the right way to express the semantics.  But we have struggled in vain to correct the problem without abandoning the idea of expressing them with syntactic substitution and I just wondered if anybody here would be willing to have a go.  It seems not, and I don't want to participate in a long discussion about the merits and demerits of RM Pre 21's intention.

Hugh

Okay I'll have a go.  It's not your idea expressed in OP but it's a go.

Between b. and c., add e. :

For an assignment to a scalar variable of the form S := WITH SE1 AS S SE2 where SE2 holds THE_ operator invocations on S (*) and SE1 is a selector for S (**),
- if the S selector used in SE1 is for a different possrep than the THE_ operator in SE2, replace the THE_ operator invocation in SE2 with the equivalent formula/expression (***) expressed in THE_ operators stemming from the possrep that corresponds to the selector used in SE1 (****).  Repeat until all such THE_ invocations for "the wrong possrep" have been replaced.
- replace all the THE_ invocations on S in SE2 (or in the replacing expression that came out of the previous bullet) with the expression for the corresponding parameter used in the selector in SE1
- discard the WITH (unless there are still references left to the introduced name, in which case the whole replacement process must be cancelled/undone/abandoned) (*****)

Repeat for all scalar assignments until no WITH portions can be discarded.

(*) the "references to S" are thus actually references to the result produced by SE1 due to the shadowing by the WITH.
(**) Haven't thought it through that thoroughly, but imo there's little point in pursuing this exercise if SE1 is not a selector but just any arbitrary operator invocation producing a type-of-S value.
(***) And this obviously requires that the system is aware of the equivalences, that is, speaking very loosely, that the system has much more awareness of what goes on inside, say, THE_THETA(POINT) and that it no longer suffices to make it the responsibility of the types-and-operators implementer to see to it that those operators get defined properly and operate properly.
(****) As per my previous example, replace "THE_THETA(S)" with "ATAN2(THE_Y(S) , THE_X(S))".
(*****) This means that the precondition for starting this step is that ***all*** references to S in SE2 are through THE_ operator invocations.

I have checked this procedure for how it behaves when "nested" pseudovariables are used :

Type C_ELL possrep C_ELL (C COL , E ELLIPSE);

VAR C_ELL CE INIT C_ELL ( COL(...) , ELLIPSE (2.0, 1.0) );

THE_B(THE_E(CE)) := 3.0 , THE_A(THE_E(CE)) := 5.0 ;

CE' := C_ELL ( THE_C(CE) , ELLIPSE (THE_A(THE_E(CE)) , 3.0 ) ) , CE'' := C_ELL ( THE_C(CE') , ELLIPSE (5.0 , THE_B(THE_E(CE')) ) ) ;
CE' := C_ELL ( THE_C(CE) , ELLIPSE (THE_A(THE_E(CE)) , 3.0 ) ) , CE'' := C_ELL ( THE_C(CE) , ELLIPSE (5.0 , THE_B(ELLIPSE (THE_A(THE_E(CE)) , 3.0 )) ) ) ;

And here it turns out the exception-raising selector is still present (THE_B(ELLIPSE ...)).  An extra step is still needed :

  • replace all expressions of the form THE_X(SELECTOR(...)) where X is a component of a possrep of the type of the selector, by the expression appearing in the selector arguments for that component X. (Possibly once again the equivalences between the possreps might have to be used to "get to the right possrep selector".)

Only if we do that can we obtain

CE' := C_ELL ( THE_C(CE) , ELLIPSE (THE_A(THE_E(CE)) , 3.0 ) ) , CE'' := C_ELL ( THE_C(CE) , ELLIPSE (5.0 , 3.0 ) ) ;
CE'' := C_ELL ( THE_C(CE) , ELLIPSE (5.0 , 3.0 ) ) ;
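
The extra step can be sketched as a small term rewrite in Python. This is only an illustration under assumptions: expressions are modelled as nested tuples (operator, arg, ...), the POSSREPS table is a hypothetical catalogue of single-possrep types, and the cross-possrep equivalences are not handled.

```python
# Hypothetical catalogue: selector name -> its possrep component names, in order.
POSSREPS = {"ELLIPSE": ("A", "B"), "C_ELL": ("C", "E")}


def simplify(expr):
    """Rewrite THE_X(SELECTOR(...)) to the selector argument for component X."""
    if not isinstance(expr, tuple):
        return expr  # variable name or literal
    op, *args = expr
    args = [simplify(a) for a in args]  # rewrite bottom-up
    if op.startswith("THE_") and len(args) == 1 and isinstance(args[0], tuple):
        selector, *selector_args = args[0]
        component = op[len("THE_"):]
        if component in POSSREPS.get(selector, ()):
            return selector_args[POSSREPS[selector].index(component)]
    return (op, *args)


# RHS of CE'' after substituting CE':
# C_ELL(THE_C(CE), ELLIPSE(5.0, THE_B(ELLIPSE(THE_A(THE_E(CE)), 3.0))))
rhs = ("C_ELL",
       ("THE_C", "CE"),
       ("ELLIPSE", 5.0,
        ("THE_B", ("ELLIPSE", ("THE_A", ("THE_E", "CE")), 3.0))))

result = simplify(rhs)
```

The inner ELLIPSE selector is dropped without ever being evaluated, so its type constraint is never checked at run time.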

Quote from Erwin on November 28, 2019, 11:42 am

I have checked this procedure for how it behaves when "nested" pseudovariables are used :

Type C_ELL possrep C_ELL (C COL , E ELLIPSE);

VAR C_ELL CE INIT C_ELL ( COL(...) , ELLIPSE (2.0, 1.0) );

THE_B(THE_E(CE)) := 3.0 , THE_A(THE_E(CE)) := 5.0 ;

CE' := C_ELL ( THE_C(CE) , ELLIPSE (THE_A(THE_E(CE)) , 3.0 ) ) , CE'' := C_ELL ( THE_C(CE') , ELLIPSE (5.0 , THE_B(THE_E(CE')) ) ) ;
CE' := C_ELL ( THE_C(CE) , ELLIPSE (THE_A(THE_E(CE)) , 3.0 ) ) , CE'' := C_ELL ( THE_C(CE) , ELLIPSE (5.0 , THE_B(ELLIPSE (THE_A(THE_E(CE)) , 3.0 )) ) ) ;

And here it turns out the exception-raising selector is still present (THE_B(ELLIPSE ...)).  An extra step is still needed :

  • replace all expressions of the form THE_X(SELECTOR(...)) where X is a component of a possrep of the type of the selector, by the expression appearing in the selector arguments for that component X. (Possibly once again the equivalences between the possreps might have to be used to "get to the right possrep selector".)

I agree. This is a replacement of an expression by another equivalent expression, effectively a compile-time step, avoiding the need to evaluate (at run-time) an expression that may turn out to be invalid. Under suitable conditions and assumptions this can be proved to be correct. It will not work across multiple PossReps if the final value is a blend of both, but I guess that was undefined behaviour anyway. Example:

V := POLAR(...)

THE_X(V) := ...

Only if we do that can we obtain

CE' := C_ELL ( THE_C(CE) , ELLIPSE (THE_A(THE_E(CE)) , 3.0 ) ) , CE'' := C_ELL ( THE_C(CE) , ELLIPSE (5.0 , 3.0 ) ) ;
CE'' := C_ELL ( THE_C(CE) , ELLIPSE (5.0 , 3.0 ) ) ;

Correct.

Andl - A New Database Language - andl.org