The Forum for Discussion about The Third Manifesto and Related Matters


Scope and lifetime

Scope/lifetime               | Scalar/tuple variable | Relational variable | Type | Operator | Constraint
Local                        | VAR                   | (PRIVATE?) VAR      | Q2   | No       | Q4
Global                       | VAR                   | PRIVATE VAR         | Q2   | Q3       | Q4
Database (global persistent) | Q1                    | REAL VAR            | TYPE | OPERATOR | CONSTRAINT

[much trouble formatting that table]

The above represents my understanding of TD syntax and the consequences for lifetime and scope. We've discussed Q1. Would it be reasonable/possible to also allow:

  • Q2: local or global non-persistent types?
  • Q3: global non-persistent operators? [Nested operators are something else again.]
  • Q4: local or global non-persistent constraints?

Note that local vs global is just a consequence of lexical placement, but global vs persistent requires an explicit syntax. The word PRIVATE could be adapted to the purpose.

I'm wrestling with this problem for Andl. Currently the limitations are similar to TD; I would prefer to ease them, but I'm troubled by the consequences.


Edit: I realise what the problem is: I really don't like the idea of a syntactic form controlling something that is not syntax. Persistence does not change the way the code executes, it only changes whether the data it accesses is external, in a database. I would prefer all globals (variables, types, etc) to be global or local according to syntax, but for persistence to be set by some other mechanism (such as metadata). So I don't like the REAL/PRIVATE feature in TD.
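To make the idea of persistence set by metadata a little more concrete, here is a minimal Java sketch. Everything in it is an assumption for illustration: `PersistenceMetadata`, `Lifetime` and `lifetimeOf` are hypothetical names, not part of TD or Andl. The point is only that a declaration looks the same either way, and a separate lookup decides whether the thing declared is persistent.

```java
import java.util.Map;

// Hypothetical sketch: lifetime is chosen by metadata, not by declaration syntax.
final class PersistenceMetadata {
    enum Lifetime { TRANSIENT, PERSISTENT }

    private final Map<String, Lifetime> overrides;

    PersistenceMetadata(Map<String, Lifetime> overrides) {
        this.overrides = Map.copyOf(overrides);
    }

    // Names not mentioned in the metadata default to transient.
    Lifetime lifetimeOf(String name) {
        return overrides.getOrDefault(name, Lifetime.TRANSIENT);
    }

    public static void main(String[] args) {
        PersistenceMetadata meta =
            new PersistenceMetadata(Map.of("Suppliers", Lifetime.PERSISTENT));
        System.out.println("Suppliers: " + meta.lifetimeOf("Suppliers"));
        System.out.println("Scratch: " + meta.lifetimeOf("Scratch"));
    }
}
```

The metadata could just as well come from a command line switch or a configuration file; the program text would not change.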

 

 

Andl - A New Database Language - andl.org
Quote from dandl on September 22, 2019, 4:17 am

...

I'm wrestling with this problem for Andl. Currently the limitations are similar to TD; I would prefer to ease them, but I'm troubled by the consequences.


Edit: I realise what the problem is: I really don't like the idea of a syntactic form controlling something that is not syntax. Persistence does not change the way the code executes, it only changes whether the data it accesses is external, in a database. I would prefer all globals (variables, types, etc) to be global or local according to syntax, but for persistence to be set by some other mechanism (such as metadata). So I don't like the REAL/PRIVATE feature in TD.

 

The terminology 'scope' vs 'extent' dates from ALGOL 60 OWN variables, IIRC. They turned out to be a mis-feature precisely because of this "syntactic form controlling something that is not syntax" (well put). Specifically, there was no syntactic form to control the initialisation of the variable.

So it wasn't until OOP figured out that the extent of every instance of a class is in principle the whole lifetime of the program, that the idea properly got going. And then, as Dave has argued, really encapsulating state bundled with functionality is another mis-feature.

If you want to get the syntax to parallel persistence, then you must treat the database and its schema as having a scope wider than any specific application. (Some applications, such as db servers, have a very long lifetime, but nothing as long as the database.) So each application imports the decls for the database/schema as if each application were embedded with local scopes inside the database decls.

Then I take your earlier point about actions that affect that wider scope -- such as DROP a relvar. In effect some local scope is changing the type of a global variable (the dbvar). Then one possible account of that semantics: a DROP is an assignment to a fresh dbvar (of a different type), that has the same name as the 'old' dbvar, therefore shadows the scope such that there is only one dbvar of that name in scope at any one point in the linear text of the program.
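A rough Java sketch of that account, under stated assumptions: `DbVarType`, `drop` and `relvarNames` are hypothetical names, and headings are reduced to maps of attribute name to type name. DROP does not mutate anything; it yields a fresh dbvar type, and rebinding the same name stands in for the shadowing.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: the "type" of a dbvar is its set of relvar names and headings.
final class DbVarType {
    // relvar name -> heading (attribute name -> type name)
    private final Map<String, Map<String, String>> relvars;

    DbVarType(Map<String, Map<String, String>> relvars) {
        this.relvars = Map.copyOf(relvars);
    }

    // DROP never mutates: it yields a fresh dbvar type without the named relvar.
    DbVarType drop(String relvarName) {
        Map<String, Map<String, String>> copy = new LinkedHashMap<>(relvars);
        if (copy.remove(relvarName) == null) {
            throw new IllegalArgumentException("No such relvar: " + relvarName);
        }
        return new DbVarType(copy);
    }

    Set<String> relvarNames() {
        return relvars.keySet();
    }

    public static void main(String[] args) {
        DbVarType db = new DbVarType(Map.of(
            "S", Map.of("SNO", "CHAR"),
            "P", Map.of("PNO", "CHAR")));
        // Rebinding the same name: from here on only the new type is in scope.
        db = db.drop("P");
        System.out.println(db.relvarNames());
    }
}
```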

Or ... we abandon the procedural programming model of TTM, and move to something more like 'Communicating Sequential Processes' [Hoare] or some 'Calculus of Communicating Systems' [Milner], specifically such as 'π-calculus' [Milner et al]. Essentially each application program interaction with the database is a message -- either sent or received. What those Process Calculi are doing is providing a way for the type system to make static guarantees that all message handling is well-typed. That is therefore very different from interaction via passing strings. I guess that's getting way too abstruse for the intended audience of TTM.

Quote from dandl on September 22, 2019, 4:17 am

...

I'm wrestling with this problem for Andl. Currently the limitations are similar to TD; I would prefer to ease them, but I'm troubled by the consequences.


Edit: I realise what the problem is: I really don't like the idea of a syntactic form controlling something that is not syntax. Persistence does not change the way the code executes, it only changes whether the data it accesses is external, in a database. I would prefer all globals (variables, types, etc) to be global or local according to syntax, but for persistence to be set by some other mechanism (such as metadata). So I don't like the REAL/PRIVATE feature in TD.

There's an argument here for having a separate data definition language -- akin to the DD statements of JCL on IBM mainframes -- that defines or declares available relvars and exposes them to the database language as no more than relvar names and lists of attribute names/types. This has the advantage of making database language scripts/applications abstract and decoupled from the actual implementations of the relvars they manipulate. (Aside: I think we've discussed this before. Maybe search the mailing list archives.) It means the database language scripts can't create or drop relvars, but that's arguably a (security and simplicity) benefit rather than a limitation.

Thinking out loud here, and perhaps as another aside, it strikes me that an application's view of a database is conceptually very similar to an application's view of a shared library, such that the approaches we currently take in terms of handling library dependencies, library changes, and exceptions due to library changes can be used equivalently with databases. In other words, at development time we link an application to a set of database definitions, the same way we link an application to a set of library definitions. At application runtime, we do minimal checks (e.g., expected version number and/or existence of relvar names and attribute names/types) to verify that those definitions haven't changed. If they have, abort run and throw it back to the developer to fix the application.
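As a rough illustration of the "minimal checks" at startup, here is a Java sketch that compares the relvar headings an application was built against with those the database reports. The names (`SchemaCheck`, `mismatches`) and the representation of a heading as a map of attribute name to type name are assumptions made for the example; they are not taken from Rel or Tutorial D.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names and representation are illustrative only.
final class SchemaCheck {
    // Compare the relvar headings the application was built against with
    // those the database reports; describe every mismatch found.
    static List<String> mismatches(Map<String, Map<String, String>> expected,
                                   Map<String, Map<String, String>> actual) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : expected.entrySet()) {
            Map<String, String> heading = actual.get(e.getKey());
            if (heading == null) {
                problems.add("missing relvar " + e.getKey());
            } else if (!heading.equals(e.getValue())) {
                problems.add("heading changed for relvar " + e.getKey());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> expected =
            Map.of("S", Map.of("SNO", "CHAR", "CITY", "CHAR"));
        Map<String, Map<String, String>> actual =
            Map.of("S", Map.of("SNO", "CHAR"));
        List<String> problems = mismatches(expected, actual);
        if (!problems.isEmpty()) {
            // A real application would abort here, before touching any data.
            System.err.println("Schema mismatch: " + problems);
        }
    }
}
```

Relvars the application does not use are deliberately ignored, in the same way that an application does not care about library symbols it never links against.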

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on September 22, 2019, 12:16 pm
Quote from dandl on September 22, 2019, 4:17 am

...

I'm wrestling with this problem for Andl. Currently the limitations are similar to TD; I would prefer to ease them, but I'm troubled by the consequences.


Edit: I realise what the problem is: I really don't like the idea of a syntactic form controlling something that is not syntax. Persistence does not change the way the code executes, it only changes whether the data it accesses is external, in a database. I would prefer all globals (variables, types, etc) to be global or local according to syntax, but for persistence to be set by some other mechanism (such as metadata). So I don't like the REAL/PRIVATE feature in TD.

There's an argument here for having a separate data definition language -- akin to the DD statements of JCL on IBM mainframes -- that defines or declares available relvars and exposes them to the database language as no more than relvar names and lists of attribute names/types.

Yes, I'm keen on that idea. But it cuts against Codd's Rule 5, the comprehensive data sublanguage rule; and I can't help feeling it would cut against the Pres/Pros in TTM, although I can't put my finger on it. How does it go with a program that wants to create a temp relvar, where its schema is set at run time from dynamic values?

This has the advantage of making database language scripts/applications abstract and decoupled from the actual implementations of the relvars they manipulate. (Aside: I think we've discussed this before. Maybe search the mailing list archives.) It means the database language scripts can't create or drop relvars, but that's arguably a (security and simplicity) benefit rather than a limitation.

Thinking out loud here, and perhaps as another aside, it strikes me that an application's view of a database is conceptually very similar to an application's view of a shared library, such that the approaches we currently take in terms of handling library dependencies, library changes, and exceptions due to library changes can be used equivalently with databases. In other words, at development time we link an application to a set of database definitions, the same way we link an application to a set of library definitions. At application runtime, we do minimal checks (e.g., expected version number and/or existence of relvar names and attribute names/types) to verify that those definitions haven't changed. If they have, abort run and throw it back to the developer to fix the application.

Yes, that's pretty much how the System/38 programming system worked. Data Decls were in a separate sourcefile (actually using a different language from the applications). You compiled that sourcefile to produce a skeleton/empty schema. You then compiled applications against that skeleton. The object code got an embedded hash/sourcefile timestamp of the schema.

At runtime you could point the application to some arbitrary database; on firing up the application, it checked that the tables' hash/timestamp agreed with what was compiled into the application. The check applied whether or not the application ever opened the tables. So if any check failed, the application got booted out, no harm done. (At least not to the database ;-)
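To show the general shape of such a check, here is a Java sketch that hashes a canonical rendering of a schema. It is not a description of what System/38 actually did: `SchemaFingerprint` is a hypothetical name, the canonical text format is invented for the example, and SHA-256 is simply a convenient choice of hash. Because only names and types go into the hash, any database with the same relvar names and headings gives the same fingerprint.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; the canonical format and the choice of SHA-256 are assumptions.
final class SchemaFingerprint {
    // Hash a sorted rendering of the schema so the result does not depend
    // on the order in which relvars or attributes happen to be listed.
    static String of(Map<String, Map<String, String>> schema) {
        StringBuilder canonical = new StringBuilder();
        for (Map.Entry<String, Map<String, String>> relvar : new TreeMap<>(schema).entrySet()) {
            canonical.append(relvar.getKey()).append('{');
            for (Map.Entry<String, String> attr : new TreeMap<>(relvar.getValue()).entrySet()) {
                canonical.append(attr.getKey()).append(':').append(attr.getValue()).append(';');
            }
            canonical.append('}');
        }
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(canonical.toString().getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        // compiledIn stands for the value embedded in the application at build time.
        String compiledIn = of(Map.of("S", Map.of("SNO", "CHAR")));
        String atRuntime = of(Map.of("S", Map.of("SNO", "CHAR")));
        System.out.println(compiledIn.equals(atRuntime) ? "schema matches" : "schema changed");
    }
}
```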

Quote from AntC on September 23, 2019, 6:06 am
Quote from Dave Voorhis on September 22, 2019, 12:16 pm
Quote from dandl on September 22, 2019, 4:17 am

...

I'm wrestling with this problem for Andl. Currently the limitations are similar to TD; I would prefer to ease them, but I'm troubled by the consequences.


Edit: I realise what the problem is: I really don't like the idea of a syntactic form controlling something that is not syntax. Persistence does not change the way the code executes, it only changes whether the data it accesses is external, in a database. I would prefer all globals (variables, types, etc) to be global or local according to syntax, but for persistence to be set by some other mechanism (such as metadata). So I don't like the REAL/PRIVATE feature in TD.

There's an argument here for having a separate data definition language -- akin to the DD statements of JCL on IBM mainframes -- that defines or declares available relvars and exposes them to the database language as no more than relvar names and lists of attribute names/types.

Yes, I'm keen on that idea. But it cuts against Codd's Rule 5, the comprehensive data sublanguage rule; and I can't help feeling it would cut against the Pres/Pros in TTM, although I can't put my finger on it. How does it go with a program that wants to create a temp relvar, where its schema is set at run time from dynamic values?

Codd's Rules are neither regulations nor standards, so I think we can take or leave them as logic dictates. I don't recall any of the pre/pro-scriptions precluding a separate DDL, and indeed the separate-ness may simply be differences in privilege rather than a separate language parser recognising distinct syntax and semantics.

For temporary relvars, I see no problem with tuple-valued and relation-valued variables, same as any scalar (or other type-) valued variable. These have transient scope and lifetime like any other program variable. Once variables belong to the database (aka persistent relvars), and have a wider scope and lifetime than an individual program/script, then they need to be defined by the separate DDL.
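A small Java sketch of the "differences in privilege" reading, in which one language serves both purposes and only the right to define persistent relvars is restricted. `Session`, `declareLocal` and `definePersistent` are hypothetical names, and headings are plain strings purely to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: names are illustrative, not taken from TTM, Rel, or Andl.
final class Session {
    private final boolean mayDefinePersistent;
    private final Map<String, String> catalog;                  // persistent relvars: name -> heading
    private final Map<String, String> locals = new HashMap<>(); // transient, discarded with the session

    Session(boolean mayDefinePersistent, Map<String, String> catalog) {
        this.mayDefinePersistent = mayDefinePersistent;
        this.catalog = catalog;
    }

    // Any program may declare a transient relation-valued variable.
    void declareLocal(String name, String heading) {
        locals.put(name, heading);
    }

    // Only a suitably privileged session may add a relvar to the database.
    void definePersistent(String name, String heading) {
        if (!mayDefinePersistent) {
            throw new SecurityException("Not permitted to define persistent relvar " + name);
        }
        catalog.put(name, heading);
    }

    boolean hasLocal(String name) {
        return locals.containsKey(name);
    }

    public static void main(String[] args) {
        Map<String, String> catalog = new HashMap<>();
        new Session(true, catalog).definePersistent("S", "{SNO CHAR}");
        Session app = new Session(false, catalog);
        app.declareLocal("Temp", "{X INTEGER}");
        System.out.println("catalog: " + catalog.keySet());
    }
}
```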

 

Quote from Dave Voorhis on September 23, 2019, 1:18 pm

Codd's Rules are neither regulations nor standards, so I think we can take or leave them as logic dictates.

I agree.  In fact, I think that TTM's proscription on statements like CREATE DATABASE would certainly be seen by Codd as violating his rule 5.

For temporary relvars, I see no problem with tuple-valued and relation-valued variables, same as any scalar (or other type-) valued variable. These have transient scope and lifetime like any other program variable. Once variables belong to the database (aka persistent relvars), and have a wider scope and lifetime than an individual program/script, then they need to be defined by the separate DDL.

I agree, and would extend exactly the same rules to persistent vs. transient types and operators as well.

Quote from AntC on September 23, 2019, 6:06 am

At runtime you could point the application to some arbitrary database; on firing up the application, it checked that the tables' hash/timestamp agreed with what was compiled into the application. The check applied whether or not the application ever opened the tables. So if any check failed, the application got booted out, no harm done. (At least not to the database ;-)

I don't understand how that worked.  "Arbitrary database" presumably means any database that conforms to the schema.  How can a mere timestamp check tell you whether a database conforms to a certain version of a schema or not?

Quote from johnwcowan on September 23, 2019, 1:47 pm
Quote from AntC on September 23, 2019, 6:06 am

At runtime you could point the application to some arbitrary database; on firing up the application, it checked that the tables' hash/timestamp agreed with what was compiled into the application. The check applied whether or not the application ever opened the tables. So if any check failed, the application got booted out, no harm done. (At least not to the database ;-)

I don't understand how that worked.  "Arbitrary database" presumably means any database that conforms to the schema.  How can a mere timestamp check tell you whether a database conforms to a certain version of a schema or not?

I don't recall what System/38 did -- my encounters with it were relatively brief and superficial -- but I could easily imagine some D application internal startup code that looks something like this:

URL dbURL = getLocalRecordOfDatabaseLocation();
GUID expectedDatabaseSerialNumber = getLocalRecordOfDatabaseSerialNumber();

Connection dbconnection = getConnectionToDatabase(dbURL);
if (dbconnection == null) {
  throw new ExceptionFatal("Unable to connect to database at " + dbURL);
}

// The database serial number is changed every time the schema is changed.
// It's a guid, so it also serves to make sure we've got the right database.
GUID actualDatabaseSerialNumber = dbconnection.getDatabaseSerialNumber();

// Compare by value; in Java-like languages != would only compare references.
if (!expectedDatabaseSerialNumber.equals(actualDatabaseSerialNumber)) {
  throw new ExceptionFatal("Database or its schema has changed. Expected database serial number " + expectedDatabaseSerialNumber + " but got " + actualDatabaseSerialNumber);
}

// ... use the database ...

 

Quote from Dave Voorhis on September 23, 2019, 2:02 pm

I don't recall what System/38 did -- my encounters with it were relatively brief and superficial -- but I could easily imagine some D application internal startup code that looks something like this:

URL dbURL = getLocalRecordOfDatabaseLocation();
GUID myRecordOfDatabaseSerialNumber = getLocalRecordOfDatabaseSerialNumber();

Connection dbconnection = getConnectionToDatabase(dbURL);
if (dbconnection == null) {
  throw new ExceptionFatal("Unable to connect to database at " + dbURL);
}

// The database serial number is changed every time the schema is changed.
// It's a guid, so it also serves to make sure we've got the right database.
GUID actualDatabaseSerialNumber = dbconnection.getDatabaseSerialNumber();

// Compare by value; in Java-like languages != would only compare references.
if (!myRecordOfDatabaseSerialNumber.equals(actualDatabaseSerialNumber)) {
  throw new ExceptionFatal("Database or its schema has changed. Expected database serial number " + myRecordOfDatabaseSerialNumber + " but got " + actualDatabaseSerialNumber);
}

// ... use the database ...

 

Well, if schemas stand in a 1-to-1 relation with databases, that is certainly correct.  But my understanding of "arbitrary database" was that many databases might have the same schema (that is, they all adhere to the same database-level constraints: what relvars with what names and types, etc.)  It seems pointless to say that you can point the application at an arbitrary database when only one can possibly work.

Quote from johnwcowan on September 23, 2019, 2:06 pm
Quote from Dave Voorhis on September 23, 2019, 2:02 pm

I don't recall what System/38 did -- my encounters with it were relatively brief and superficial -- but I could easily imagine some D application internal startup code that looks something like this:

...

 

Well, if schemas stand in a 1-to-1 relation with databases, that is certainly correct.  But my understanding of "arbitrary database" was that many databases might have the same schema (that is, they all adhere to the same database-level constraints: what relvars with what names and types, etc.)  It seems pointless to say that you can point the application at an arbitrary database when only one can possibly work.

I believe AntC meant the pointing is potentially arbitrary, not that it can connect to arbitrary databases. I.e., the application identifies its database via some potentially-changeable reference rather than some hard and immutable binding.

Quote from Dave Voorhis on September 23, 2019, 1:18 pm
Quote from AntC on September 23, 2019, 6:06 am
Quote from Dave Voorhis on September 22, 2019, 12:16 pm
Quote from dandl on September 22, 2019, 4:17 am

...

I'm wrestling with this problem for Andl. Currently the limitations are similar to TD; I would prefer to ease them, but I'm troubled by the consequences.


Edit: I realise what the problem is: I really don't like the idea of a syntactic form controlling something that is not syntax. Persistence does not change the way the code executes, it only changes whether the data it accesses is external, in a database. I would prefer all globals (variables, types, etc) to be global or local according to syntax, but for persistence to be set by some other mechanism (such as metadata). So I don't like the REAL/PRIVATE feature in TD.

There's an argument here for having a separate data definition language -- akin to the DD statements of JCL on IBM mainframes -- that defines or declares available relvars and exposes them to the database language as no more than relvar names and lists of attribute names/types.

Yes, I'm keen on that idea. But it cuts against Codd's Rule 5, the comprehensive data sublanguage rule; and I can't help feeling it would cut against the Pres/Pros in TTM, although I can't put my finger on it. How does it go with a program that wants to create a temp relvar, where its schema is set at run time from dynamic values?

Codd's Rules are neither regulations nor standards, so I think we can take or leave them as logic dictates. I don't recall any of the pre/pro-scriptions precluding a separate DDL, and indeed the separate-ness may simply be differences in privilege rather than a separate language parser recognising distinct syntax and semantics.

For temporary relvars, I see no problem with tuple-valued and relation-valued variables, same as any scalar (or other type-) valued variable. These have transient scope and lifetime like any other program variable. Once variables belong to the database (aka persistent relvars), and have a wider scope and lifetime than an individual program/script, then they need to be defined by the separate DDL.

Yes, this puts the finger on it. Creating variables in a database should be highly intentional, not just an accident of syntax. Ditto deleting. It's reasonably benign if a program connects to and consumes relvars in some database, not too troubling if it updates said database, and quite useful if it can connect to different databases at different times. But it's a real nuisance if the same program behaves differently because of the detritus left from a previous run. I've been having trouble writing test suites and useful programs, because of the need to guard against things that might or might not have been created on a previous run, and I think TD will have similar issues. [BTW types and operators actually cause more grief than relvars, because they can't be kept private.]

For Andl it is enough to add a couple of simple directives to the program, or allow the same to be set by command line switches. A real D should have something rather more substantial.
