The Forum for Discussion about The Third Manifesto and Related Matters

Help sought to fix RM Pre 21 (multiple assignment)

Quote from dandl on November 25, 2019, 1:49 am

Gospel: There is no assignment to a component. A component is not a variable. It's an illusion, propagated by Tutorial D which makes it appear so.

...And the type constraint is violated. But all this proves that multiple assignment was never needed or possible for this case. It is easy to see that this can be rewritten as a single assignment. A compiler would and should emit an error, not do the rewrite.

THE_ pseudovariables were meant as the counterpart to OO setters.  So the "makes it appear so" was very deliberate and on purpose.  Everything can always be rewritten in another form.  Programmers tend to prefer the shortest.

Quote from AntC on November 25, 2019, 1:29 am

Type constraints cannot be violated because they define the set of values of a type; other values do not exist and cannot be created. There are no type errors, they just never happen.

That might be true in the languages/type systems you're familiar with. I don't see it as true in TTM's system with declared constraints between components of a PossRep; nor with the ability in TTM to assign to components individually; nor with Multiple PossReps. Please give some example languages that support all three of those 'features'.

This has nothing to do with existing languages, it is purely based on RM Pre 1, where a type is a named set of values. There is nothing in TTM about implementation concerns such as constraints between components; the set of values is finite and pre-determined. A type is a named set of values, period. The values are the values. RM Pre 4a1 sets out a correspondence between parameter values and components. A successful invocation of selector S produces a value of type T. If the parameter values do not correspond to a value then the selector invocation does not succeed. Presumably then the sky falls in. A selector can never produce a 'value that is not a value'.

But I will concede one point: in the ELLIPSE case the compiler cannot determine the set of values that constitute a type, and that's not good. Elsewhere TTM requires strong typing to the point of obsession. RM Pre 1 and/or RM 23a might well add the sentence:

The set of values constituting a type shall be well defined and known to both the system and the user.

Consider these ELLIPSE values (can you say "they just never happen"?):

ELLIPSE(1.0, 2.0)    // not valid

You can write the code, but the selector invocation will not succeed. TTM does not define what happens next, so by definition this is undefined behaviour. Rainbows and unicorns. But no 'value that is not a value'. My proposed addition would avoid this.
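
A minimal Python sketch of "the selector invocation will not succeed". It assumes the ELLIPSE type constraint is a >= b, which is consistent with ELLIPSE(1.0, 2.0) being invalid but is not spelled out in this thread; the names Ellipse and TypeConstraintError are hypothetical. Raising an exception is only one possible choice for the behaviour the post says TTM leaves undefined.

```python
from dataclasses import dataclass


class TypeConstraintError(ValueError):
    """Selector arguments do not correspond to any value of the type."""


@dataclass(frozen=True)
class Ellipse:
    """Hypothetical ELLIPSE with possrep components a and b (assumed: a >= b)."""
    a: float
    b: float

    def __post_init__(self):
        # The check runs inside the selector, so a failing invocation
        # never hands back an Ellipse object at all.
        if not self.a >= self.b:
            raise TypeConstraintError(
                f"ELLIPSE({self.a}, {self.b}) is not a value of the type"
            )


ok = Ellipse(2.0, 1.0)        # succeeds

try:
    Ellipse(1.0, 2.0)         # the selector invocation does not succeed
except TypeConstraintError as err:
    print(err)
```

Either way, no 'value that is not a value' is ever produced: the failing call yields an error, not an object.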

Database constraints must be satisfied for an assignment to succeed.

All constraints must be satisfied for an assignment to succeed. The question is where we count the assignment as succeeding: at the comma or at the semicolon?

The assignment is the whole dingus, up to the semicolon. All the assignments happen at once, together. That's the whole point of RM Pre 21. If they fail because of a database constraint they all fail together. Presumably this links in some way with OO Pre 4 transactions, but that is not spelled out.

For a multiple assignment they succeed as a whole or none of them do (and presumably some other action is taken, such as an exception or error). There is no mechanism to test or violate a database constraint except 'at the semicolon'.

Nonsense: RM Pre 23 says the constraint "shall be satisfied if and only if that boolean expression evaluates to TRUE". We can evaluate a boolean expression at the comma. (Or we could change all the commas to semicolons, for the purposes of constraint-checking.)

Then you aren't reading RM Pre 21, where it says the MA is an 'atomic operation', and the note to RM Pre 23, where it explicitly bundles together compensating operations into a single MA and explicitly anticipates parts of the MA violating the DBC if executed in isolation.

To me these are simply straightforward readings of RM Pre 23.

No: look at the wording. You're importing a whole load of pre-judgments and expectations from other languages/type systems.

Far from it. My expectations would lead me to something very different. I'm just reading what's there. What are you reading?

At an implementation level, your example is bundled into a single set-oriented merge. The insertion of {1,2} into {3,4,5,6} succeeds; the insertion of {1,2} into {1,3,4,5} or {2,3,4,5} fails. There is no 'at the comma'.
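
A rough Python sketch of the "single set-oriented merge" reading. It assumes an INSERT fails when any tuple is already present, which is what the examples above imply; the function names are hypothetical and plain sets of integers stand in for relations.

```python
def insert_disjoint(current: frozenset, new: frozenset) -> frozenset:
    """INSERT that fails if any inserted element is already present (assumed rule)."""
    overlap = current & new
    if overlap:
        raise ValueError(f"INSERT failed, already present: {sorted(overlap)}")
    return current | new


def merged_insert(current, *inserts):
    """Bundle several INSERTs to the same target into one merged INSERT."""
    merged = frozenset().union(*inserts)   # {1} and {2} become {1, 2}
    return insert_disjoint(frozenset(current), merged)


print(sorted(merged_insert({3, 4, 5, 6}, {1}, {2})))

try:
    merged_insert({1, 3, 4, 5}, {1}, {2})
except ValueError as err:
    print(err)
```

The check happens once, against the merged set, so there is no intermediate state "at the comma" to inspect.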

Not adequate if there are other assignments appearing between Erwin's two INSERTs. See my example just posted. If Erwin is correctly interpreting the algorithm in RM Pre 21, then the algorithm is not adequate. I say "if" because I plain don't know; I don't get the intent behind RM Pre 21 for these tricky cases; is the intent valid but RM Pre 21's algorithm flawed/a poor realisation thereof? Is RM Pre 21's algorithm adequate but Erwin's interpretation not accurate?

Then I think you really don't get it. There are no other assignments 'between' anything. The effect of RM Pre 21 is to bundle all the updates of each single variable into one single merged assignment to that variable, each independent of the other, and to execute those assignments atomically, effectively all at the same time. We can agree to dislike the algorithm and the wording, but that's what it says.

Andl - A New Database Language - andl.org
Quote from Erwin on November 25, 2019, 12:24 pm
Quote from dandl on November 25, 2019, 1:49 am

Gospel: There is no assignment to a component. A component is not a variable. It's an illusion, propagated by Tutorial D which makes it appear so.

...And the type constraint is violated. But all this proves that multiple assignment was never needed or possible for this case. It is easy to see that this can be rewritten as a single assignment. A compiler would and should emit an error, not do the rewrite.

THE_ pseudovariables were meant as the counterpart to OO setters.  So the "makes it appear so" was very deliberate and on purpose.  Everything can always be rewritten in another form.  Programmers tend to prefer the shortest.

I agree. No argument whatsoever.

The point is that for the purposes of Hugh's example, THE() forms are not variables, so they must be considered as shorthands and eliminated in step RM Pre 21a. In step b when every LHS is a variable (of declared type ELLIPSE), the problem he refers to no longer exists. It evaporates.

Hugh's problem is caused by treating THE() forms as variables, which they are not.

Andl - A New Database Language - andl.org
Quote from Erwin on November 24, 2019, 7:30 pm
Quote from Hugh on November 24, 2019, 3:23 pm

Having read the responses as far as Antc's, I now realise that Chris's description (to me) of the "broad idea" is not quite what we really had in mind.

Take the case where the MA contains two or more assigns to possrep components of the same variable.  Then these need to be collected together in advance of RM Pre 21a and the values of their RHSes collected into an ordered n-tuple whose components then become arguments to the selector invocation that is assigned to the parent variable.
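
The difference between expanding each component assignment separately and collecting them first can be sketched in Python. This is an illustration only, not the RM Pre 21 algorithm: it assumes the ELLIPSE constraint is a >= b, treats the RHS values as already evaluated, and uses hypothetical names throughout.

```python
from dataclasses import dataclass, fields


class TypeConstraintError(ValueError):
    pass


@dataclass(frozen=True)
class Ellipse:
    a: float
    b: float

    def __post_init__(self):
        if not self.a >= self.b:      # assumed type constraint
            raise TypeConstraintError(f"ELLIPSE({self.a}, {self.b})")


def components(value):
    """Current possrep component values, keyed by component name."""
    return {f.name: getattr(value, f.name) for f in fields(value)}


def assign_one_at_a_time(value, updates):
    """Each THE_x(E) := v becomes its own selector invocation, in order."""
    for name, new in updates:
        args = components(value)
        args[name] = new
        value = type(value)(**args)   # may fail on an intermediate value
    return value


def assign_collected(value, updates):
    """Collect all component assignments, then invoke the selector once."""
    args = components(value)
    args.update(dict(updates))
    return type(value)(**args)


e = Ellipse(4.0, 3.0)
updates = [("b", 5.0), ("a", 6.0)]    # THE_B(E) := 5.0, THE_A(E) := 6.0

print(assign_collected(e, updates))
try:
    assign_one_at_a_time(e, updates)
except TypeConstraintError as err:
    print("intermediate value rejected:", err)
```

With these inputs the collected form yields ELLIPSE(6.0, 5.0), while the one-at-a-time form trips over the intermediate ELLIPSE(4.0, 5.0).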

But perhaps we should just go with Dave Voorhis and Erwin Smout and leave RM Pre 21 as-is.  I agree that it's not really flawed, but it does mean that an implementation that gets around the problem then becomes non-conforming (strictly speaking).

Hugh

Why "non-conforming" ?  E.g. what would make the process/procedure suggested by Antc (and me and no doubt Dandl even though he wasn't anywhere near as explicit as he should have been) be "non-conforming" ?  Because it won't throw the exception that your (plural) perception of the spec says should be thrown ?  That's odd because if you want the exception to be part of the spec then of course you can't come complaining it poses a problem ...

Yes, because it won't throw the exception that the spec says should be thrown, and we don't want that.  Also, we believe very strongly that assignment to a possrep component is merely shorthand (as in OO languages) and we think that is the right way to express the semantics.  But we have struggled in vain to correct the problem without abandoning the idea of expressing them with syntactic substitution and I just wondered if anybody here would be willing to have a go.  It seems not, and I don't want to participate in a long discussion about the merits and demerits of RM Pre 21's intention.

Hugh

Coauthor of The Third Manifesto and related books.
Quote from Hugh on November 25, 2019, 2:33 pm

Also, we believe very strongly that assignment to a possrep component is merely shorthand (as in OO languages) and we think that is the right way to express the semantics.

The TTM approach -- which defines assignment to a possrep component as shorthand for invoking a selector -- has no equivalent in typical object-oriented languages.

In typical object-oriented languages, the closest equivalent to assignment to a possrep component is either invocation of a method that changes instance state (such a method is often called a 'setter', because it sets the value of an instance's member variable) or direct assignment to an instance's member variable. I'm not aware of any object oriented language in which assignment to a member variable is shorthand for invoking a constructor (which is the nearest equivalent to a selector), though an object oriented language could conceivably be made that would do that.
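
The contrast can be sketched in Python, with hypothetical class names. `dataclasses.replace` builds a new object by calling the constructor with the changed field, which is close in spirit to the selector-based reading; it is an explicit function call rather than assignment syntax, so it does not contradict the observation above.

```python
from dataclasses import dataclass, replace


class MutablePoint:
    """Typical OO style: a setter changes instance state in place."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def set_x(self, x):
        self.x = x


@dataclass(frozen=True)
class Point:
    """Value style: 'updating' a component means constructing a new value."""
    x: float
    y: float


p = MutablePoint(1.0, 2.0)
alias = p
p.set_x(9.0)                 # alias observes the change: same object

q = Point(1.0, 2.0)
old = q
q = replace(q, x=9.0)        # constructor call; the variable q gets a new value

print(alias.x, old.x, q.x)
```

In the setter version the object itself changes; in the value version only the variable `q` changes, and the old value is untouched.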

Or did you mean something different by "as in OO languages"?

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Erwin on November 25, 2019, 12:13 pm
Quote from AntC on November 25, 2019, 1:29 am

Not adequate if there are other assignments appearing between Erwin's two INSERTs. See my example just posted. If Erwin is correctly interpreting the algorithm in RM Pre 21, then the algorithm is not adequate. I say "if" because I plain don't know; I don't get the intent behind RM Pre 21 for these tricky cases; is the intent valid but RM Pre 21's algorithm flawed/a poor realisation thereof? Is RM Pre 21's algorithm adequate but Erwin's interpretation not accurate?

["example just posted" copypasted here for convenience]

As it stands, because RM Pre 21 only gives an algorithm, I don't understand the intent. For example consider if there's relvar MYELLIPSES with attribute X of type ELLIPSE:

THE_B(E) := THE_A(E) + 1.0, UPDATE MYELLIPSES WHERE THE_A(X) = THE_A(E) : {THE_B(X) := THE_B(E)},   // comma here, so MA continues

THE_A(E) := THE_A(E) + 2.0, UPDATE MYELLIPSES WHERE THE_B(X) = THE_B(E) : {THE_A(X) := THE_A(E)};

(We'll conveniently ignore that the second UPDATE is probably not doing what the programmer intended.)

At the semicolon all constraints hold. At the first UPDATE, it is vital for that assignment to "see the state of affairs" in which the type constraint on E is violated -- ie THE_B(E) = THE_A(E) + 1.0. Furthermore there's no way to 'expand' those assignments to THE_xs in such a way as to get a single assignment to E and support the first UPDATE to apply as written.

Adrian's old way of explaining was "updates to the same target are done in sequence, updates to different targets are done in parallel".  Don't know if that helps,

That's a complete surprise. I can't take that interpretation out of the wording in RM Pre 21. If that's the (a?) correct interpretation, I feel justified in keeping on asking what is the intent behind RM Pre 21?

but what it means for your example is it won't work as you want.  You get two chainings : UPDATE MYELLIPSES ... , UPDATE MYELLIPSES ... ;   and the second   THE_B(E) := ... , THE_A(E) := ... ;

I would expect (from RM Pre 21) to get a chaining of WITHs, such that each individual assignment is visible to those to its right.

E := WITH (E := ELLIPSE(...)) (WITH (ELLIPSES := REL{...}) (ELLIPSE(...)));

ELLIPSES := (WITH (E := ELLIPSE(...)) (WITH (ELLIPSES := REL{...}) (WITH (E := ELLIPSE(...)) REL{...})));

In the first chaining, the MYELLIPSES value 'seen' by the first UPDATE is the pre-update one, the MYELLIPSES value 'seen' by the second UPDATE is the one left by the first.  But whatever is happening to E is never seen by this chain.  Any reference to E in this chain evaluates to the pre-update value, and this is contrary to your 'it is vital that ...'.
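
The "same target in sequence, different targets in parallel" reading described here can be sketched in Python. This is an interpretation for illustration, not the text of RM Pre 21; the function name and the toy variables E and R are hypothetical, with integers standing in for ellipse and relation values.

```python
from typing import Any, Callable, Dict, List, Tuple

Env = Dict[str, Any]
Assign = Tuple[str, Callable[[Env], Any]]


def multiple_assign_chained(state: Env, assigns: List[Assign]) -> Env:
    """Chain assignments per target; other targets keep pre-update values."""
    # Group the right-hand sides by target, preserving their order.
    chains: Dict[str, List[Callable[[Env], Any]]] = {}
    for target, rhs in assigns:
        chains.setdefault(target, []).append(rhs)

    new_values: Env = {}
    for target, rhs_list in chains.items():
        current = state[target]
        for rhs in rhs_list:
            env = dict(state)        # every other name: pre-update value
            env[target] = current    # own target: value left by the chain so far
            current = rhs(env)
        new_values[target] = current

    # Only now are the variables actually updated, all together.
    result = dict(state)
    result.update(new_values)
    return result


state = {"E": 1, "R": 10}
assigns = [
    ("E", lambda s: s["E"] + 1),
    ("R", lambda s: s["R"] + s["E"]),   # sees the pre-update E
    ("E", lambda s: s["E"] + 2),
    ("R", lambda s: s["R"] * 2),        # sees the R left by the first R update
]
print(multiple_assign_chained(state, assigns))
```

Under this reading the R chain never observes any change to E, which is the behaviour objected to in the reply that follows.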

Nope can't see that interpretation in RM Pre 21. Is that D&D's intent?

Quote from Hugh on November 25, 2019, 2:33 pm
Quote from Erwin on November 24, 2019, 7:30 pm
Quote from Hugh on November 24, 2019, 3:23 pm

Having read the responses as far as Antc's, I now realise that Chris's description (to me) of the "broad idea" is not quite what we really had in mind.

Take the case where the MA contains two or more assigns to possrep components of the same variable.  Then these need to be collected together in advance of RM Pre 21a and the values of their RHSes collected into an ordered n-tuple whose components then become arguments to the selector invocation that is assigned to the parent variable.

But perhaps we should just go with Dave Voorhis and Erwin Smout and leave RM Pre 21 as-is.  I agree that it's not really flawed, but it does mean that an implementation that gets around the problem then becomes non-conforming (strictly speaking).

Hugh

Why "non-conforming" ?  E.g. what would make the process/procedure suggested by Antc (and me and no doubt Dandl even though he wasn't anywhere near as explicit as he should have been) be "non-conforming" ?  Because it won't throw the exception that your (plural) perception of the spec says should be thrown ?  That's odd because if you want the exception to be part of the spec then of course you can't come complaining it poses a problem ...

Yes, because it won't throw the exception that the spec says should be thrown, and we don't want that.  Also, we believe very strongly that assignment to a possrep component is merely shorthand (as in OO languages) and we think that is the right way to express the semantics.  But we have struggled in vain to correct the problem without abandoning the idea of expressing them with syntactic substitution and I just wondered if anybody here would be willing to have a go.  It seems not, and I don't want to participate in a long discussion about the merits and demerits of RM Pre 21's intention.

Hugh

Okay I'll have a go.  It's not your idea expressed in OP but it's a go.

Between b. and c., add e. :

For an assignment to a scalar variable of the form S := WITH SE1 AS S SE2 where SE2 holds THE_ operator invocations on S (*) and SE1 is a selector for S (**),
- if the S selector used in SE1 is for a different possrep than the THE_ operator in SE2, replace the THE_ operator invocation in SE2 with the equivalent formula/expression (***) expressed in THE_ operators stemming from the possrep that corresponds to the selector used in SE1 (****).  Repeat until all such THE_ invocations for "the wrong possrep" have been replaced.
- replace all the THE_ invocations on S in SE2 (or in the replacing expression that came out of the previous bullet) with the expression for the corresponding parameter used in the selector in SE1
- discard the WITH (unless there are still references left to the introduced name, in which case the whole replacement process must be cancelled/undone/abandoned) (*****)

Repeat for all scalar assignments until no WITH portions can be discarded.

(*) the "references to S" are thus actually references to the result produced by SE1 due to the shadowing by the WITH.
(**) Haven't thought it through that thoroughly, but imo there's little point in pursuing this exercise if SE1 is not a selector but just any arbitrary operator invocation producing a type-of-S value.
(***) And this obviously requires that the system is aware of the equivalences, that is, speaking very loosely, that the system has much more awareness of what goes on inside, say, THE_THETA(POINT) and that it no longer suffices to make it the responsibility of the types-and-operators implementer to see to it that those operators get defined properly and operate properly.
(****) As per my previous example, replace "THE_THETA(S)" with "ATAN2(THE_Y(S) , THE_X(S))".
(*****) This means that the precondition for starting this step is that ***all*** references to S in SE2 are through THE_ operator invocations.
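
The equivalence in footnote (****) can be illustrated with a small Python sketch. It assumes a POINT type with a Cartesian and a polar possrep; the function the_r and the polar selector are hypothetical additions for illustration, while THE_X, THE_Y, THE_THETA and ATAN2 come from the footnote.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """Stored in Cartesian form; the polar possrep is derived from it."""
    x: float
    y: float

    @classmethod
    def polar(cls, r: float, theta: float) -> "Point":
        """Selector for the (assumed) polar possrep."""
        return cls(r * math.cos(theta), r * math.sin(theta))


def the_x(p: Point) -> float:
    return p.x


def the_y(p: Point) -> float:
    return p.y


def the_r(p: Point) -> float:
    # Hypothetical polar component, written in terms of the Cartesian ones.
    return math.hypot(the_x(p), the_y(p))


def the_theta(p: Point) -> float:
    # THE_THETA(S) rewritten as ATAN2(THE_Y(S), THE_X(S)).
    return math.atan2(the_y(p), the_x(p))


p = Point(0.0, 2.0)
print(the_r(p), the_theta(p))
```

Here the rewrite rule lives in ordinary function bodies. The proposal above needs the system itself to know such equivalences so that it can substitute them symbolically, which is the extra burden footnote (***) points out.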

Quote from AntC on November 25, 2019, 7:14 pm

I would expect (from RM Pre 21) to get a chaining of WITHs, such that each individual assignment is visible to those to its right.

...

In the first chaining, the MYELLIPSES value 'seen' by the first UPDATE is the pre-update one, the MYELLIPSES value 'seen' by the second UPDATE is the one left by the first.  But whatever is happening to E is never seen by this chain.  Any reference to E in this chain evaluates to the pre-update value, and this is contrary to your 'it is vital that ...'.

Nope can't see that interpretation in RM Pre 21. Is that D&D's intent?

"each individual assignment is visible to those to its right." is one of the two historical versions my recollection says Hugh once said were tried.  It's the one where they observed they could no longer swap using "A := B , B :=A" because the assignment to A was "visible" to the second one, effectively resulting in "A := B , B := B".

It's not interpretation.  It's the fact that step c. says "evaluate all the RHS" and all of that is done before any actual assignment gets done (those only happen in step d.).  Since all evaluation of RHS is done before any assignment is done, it must necessarily be the case that all references appearing anywhere in a RHS must necessarily evaluate to the pre-update value.  Well, all references whose "target" has not been "quietly" altered by the name shadowing technique from step b.
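
The swap example makes the difference concrete. Below is a minimal Python sketch with hypothetical function names; it assumes every target appears only once, i.e. that any merging of same-target assignments in step b has already happened.

```python
def assign_simultaneous(state, assigns):
    """Evaluate every RHS first (step c), then assign (step d)."""
    values = [(target, rhs(state)) for target, rhs in assigns]  # pre-update state
    new_state = dict(state)
    for target, value in values:
        new_state[target] = value
    return new_state


def assign_left_to_right(state, assigns):
    """The abandoned alternative: each assignment is visible to those on its right."""
    new_state = dict(state)
    for target, rhs in assigns:
        new_state[target] = rhs(new_state)
    return new_state


swap = [("A", lambda s: s["B"]), ("B", lambda s: s["A"])]   # A := B , B := A
before = {"A": 1, "B": 2}

print(assign_simultaneous(before, swap))    # values are exchanged
print(assign_left_to_right(before, swap))   # behaves like A := B , B := B
```

Only the first version swaps the two values; in the second, B ends up re-reading the already updated A.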

Quote from dandl on November 25, 2019, 12:57 pm

The point is that for the purposes of Hugh's example, THE() forms are not variables, so they must be considered as shorthands and eliminated in step RM Pre 21a. In step b when every LHS is a variable (of declared type ELLIPSE), the problem he refers to no longer exists. It evaporates.

Hugh's problem is caused by treating THE() forms as variables, which they are not.

I really don't understand how you can say that.  The logical conclusion of your "which they are not" would/should be "so stop people from assigning to THE_() forms".  That removes step a. from the prescription, and thus it also removes the point where the OP problem ***gets introduced***.  As opposed to your "no longer exists, evaporates".  If it really evaporated there would not have been a thread here.

FWIW, my understanding of Chris' career was that for a non-negligible period of time, he was an important PL/1 guy within IBM.  PL/1 is that language that supported (still does, by the way) assignments to SUBSTR(charvar,begin,end) and called that technique "pseudovariables".  I suspect he borrowed the idea from PL/1.  I suppose the temptation was somewhere along the lines of "if it can be defined to behave predictably there's no point in depriving people of it".

Quote from AntC on November 25, 2019, 7:14 pm
Quote from Erwin on November 25, 2019, 12:13 pm

Adrian's old way of explaining was "updates to the same target are done in sequence, updates to different targets are done in parallel".  Don't know if that helps,

That's a complete surprise. I can't take that interpretation out of the wording in RM Pre 21. If that's the (a?) correct interpretation, I feel justified in keeping on asking what is the intent behind RM Pre 21?

Step b. says quite literally "such that Vp and Vq are identical".  That's not the same thing as saying "q=p+1".