The Forum for Discussion about The Third Manifesto and Related Matters

You need to log in to create posts and topics.

Updating through a view: a special case

Quote from johnwcowan on July 3, 2019, 10:55 pm
Quote from Dave Voorhis on July 3, 2019, 7:07 pm

Whenever "Rule 1" comes up it tends to gets interpreted in one of two ways. Depending on who you ask, it will either be, "All information in a relational data base is represented ... in exactly one way -- by values in tables. [Which includes] even table names, column names and domain names ... [i.e.] the built-in system catalog."

On a closer look, that is probably what Codd meant.  However, in D there seem to be many other persistent things: scalar types, keys, operators, and constraints at least.  It is hard to know what is meant by dropping these unless they are persistent: note that non-persistent variables cannot be dropped.

(It is also hard to understand what effects such drops have.  Does an expression mentioning a relvar or operator that has been dropped fail with an exception the next time it is evaluated?  What happens to a relation or tuple value with an attribute of a scalar type that has been dropped?  Or do drops perhaps only take effect after the D program terminates?  In that case, does existing code have to be recompiled?  The whole idea of dropping things inside D, as opposed to with some administrative action like that which creates and drops databases, seems incoherent to me.)

In Rel, all the persistent things -- types, keys, operators, and constraints -- are represented as values in (catalog) relvars. When a database is managed by a Rel DBMS, they are loaded into memory (compiling as required) from relvars, as needed.

Rel tracks dependencies and doesn't allow constructs that have dependencies to be dropped. Of course, this is a rather crude approach but workable enough for its purposes.

Better would be to warn what dependencies will break (request a DROP), allow the user to permit such breakage (demand a DROP) and then the system will clearly indicate all broken constructs via a flag associated with each construct definition -- which, since each is represented via a value in a relvar, can then be revised as needed and will change from broken to usable status once revised.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from johnwcowan on July 3, 2019, 10:55 pm
Quote from Dave Voorhis on July 3, 2019, 7:07 pm

In the grand scheme of things, I doubt the elegance of a database language would be compromised by supporting persistent scalar variables for global settings -- though they might simply be a syntactic shorthand for retrieving some scalar value from a relvar.

Quite the other way about, I think. Orthogonality is a substantial factor in elegance, and treating one variable type differently from all the rest looks quite inelegant to me.

I was thinking of a simple shorthand for using a relvar of configuration settings (call it Configuration). E.g., in Tutorial D, something like this:

VAR Configuration REAL RELATION {k CHAR, v CHAR} KEY {k};

OPERATOR GetConfigurationValue(lookup CHAR) RETURNS CHAR;
  RETURN v FROM TUPLE FROM (Configuration WHERE k = lookup);
END OPERATOR;

OPERATOR SetConfigurationValue(lookup CHAR, value CHAR);
  IF COUNT(Configuration WHERE k = lookup) = 1 THEN
    UPDATE Configuration WHERE k = lookup: {v := value};
  ELSE
    INSERT Configuration RELATION {TUPLE {k lookup, v value}};
  END IF;
END OPERATOR;

(Note that GetConfigurationValue will cause an error if the lookup value isn't found.)

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from johnwcowan on July 3, 2019, 3:13 pm

I have not yet fully grasped Appendix E, but I'm interested in the special case of inserting into a projection view where the removed attributes have known values.  This is mentioned in a footnote, but the question is, which values are known?  The three obvious cases are a restriction that equates the removed attribute with a constant expression (one that does not contain any variable names, though it may contain attribute names of the current relation), an equivalent general constraint, and an extension.  In order for such inserts to work, all of the removed variables must be reconstructible in this way.  I think everything will be fine even for non-constant expressions, provided there are no circular dependencies:  given a relvar with one integer attribute X currently holding an empty relation, what is the effect of INSERT INTO var {X := COUNT(var)}?

This should also apply to any other relop that removes columns as part of its work.  What I don't know is whether there are other ways to supply a known value.

(While I am at it, why are public variables restricted to being relvars?  It seems that it should be possible to make a variable of any type persistent.   Certainly you can extend a tuple to a relation of cardinality 1 or a scalar to a relation of degree 1 and cardinality 1, but the artifice is unlovely.)

Regarding your first question, the answer is a syntax error.  Assuming var is a relvar (as required by INSERT), the source for the assignment has to be a relation, not an integer.  In any case, the evaluation of COUNT(var) follows the normal rules for assignment sources.  It happens before anything is update.  Think of X := X + 1.

I see you have had lots of responses regarding the second question (which really doesn't belong in this thread).  I've skimmed some of them but prefer to add my own comment without regard to  others'.

TTM honours Codd's "Information Rule" because we agree with it 100%, as did the designers of SQL's research prototype, System/R, the ISBL team, and the BS12 team (all in Codd's own company at those times, IBM).  If every persistent variable is a relvar, then the update operators available for it and the read-only operators for operating on its value are just those defined for relvars and relations, respectively.  Nobody needs  to learn additional operators, nor investigate the system catalog to determine whether RELATION{t} would be needed because t is a tuple variable or RELATION{TUPLE{s s}} is needed because s is a scalar (whose name I assume would likely be used as attribute name).

I realise, of course, that relational operators such as extension, aggregation, and restriction require the user to use scalar read-only operators on attribute values, but when so much can be achieved with just numbers, character strings, dates, and times, that isn't too much of an extra burden.

I admit that alternative religions could be viable.  See OO databases but recall our motivation for writing what turned out to be Manifesto #3: #1 was about OO databases, #2 about SQL.

Hugh

Coauthor of The Third Manifesto and related books.
Quote from Hugh on July 4, 2019, 10:48 am

 

Regarding your first question, the answer is a syntax error.  Assuming var is a relvar (as required by INSERT), the source for the assignment has to be a relation, not an integer.

A trivial error: I left out the word REL and mangled the punctuation.  My grasp of TD is not all it should be, but what I meant was adding a single tuple.

In any case, the evaluation of COUNT(var) follows the normal rules for assignment sources.  It happens before anything is update.  Think of X := X + 1.

So I assumed, but thanks.

I see you have had lots of responses regarding the second question (which really doesn't belong in this thread).

Characteristically, my asides and minor errors soak up all the attention, and nobody addresses my main concern: what is a known value in the context of inserting into a projection?  At most I get praised (in another thread) for asking it.

Nobody needs  to learn additional operators, nor investigate the system catalog to determine whether RELATION{t} would be needed because t is a tuple variable or RELATION{TUPLE{s s}} is needed because s is a scalar (whose name I assume would likely be used as attribute name).

The same argument would apply with equal force to the provision of private variables of scalar or tuple type, yet they are mandated.   After all, you could deal solely in relations internally as well, so that there are no variables of any other type, only literals.

#1 was about OO databases, #2 about SQL.

Are these manifestos manifest anywhere on the Internet?  If not, can they be made so?  History too has its claims.

Quote from johnwcowan on July 4, 2019, 3:17 pm
Quote from Hugh on July 4, 2019, 10:48 am

Nobody needs  to learn additional operators, nor investigate the system catalog to determine whether RELATION{t} would be needed because t is a tuple variable or RELATION{TUPLE{s s}} is needed because s is a scalar (whose name I assume would likely be used as attribute name).

The same argument would apply with equal force to the provision of private variables of scalar or tuple type, yet they are mandated.   After all, you could deal solely in relations internally as well, so that there are no variables of any other type, only literals.

An imperative language like Tutorial D would be awkward and strange without private (local) variables of scalar or tuple type, but the absence of persistent1 global variables of scalar or tuple type isn't a problem.

--
1 Non-persistent global variables of scalar or tuple type are fine.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from johnwcowan on July 4, 2019, 3:17 pm
Quote from Hugh on July 4, 2019, 10:48 am

I see you have had lots of responses regarding the second question (which really doesn't belong in this thread).

Characteristically, my asides and minor errors soak up all the attention, and nobody addresses my main concern: what is a known value in the context of inserting into a projection?  At most I get praised (in another thread) for asking it.

I don't know about other threads where that may have happened, but in this one -- and speaking for myself, mainly -- I suspect we're trying to avoid the topic turning into the 14th annual launch of the update-through-views debate.

I'd argue (and I have in the past, if I recall correctly) that inserting into a projection view is one more nail (among many) in the coffin of update-through-views. I'm sure others can relate all manner of proposals for implementing update-through-views on a projection view (defining types and/or relvars to have default values that are used to populate absent attributes; defining update rules; defining this, that, or the other awkward machinery, etc.) along with various proposals for handling all the other problems with update-through-views...

...And all for what?

Nothing, other than meeting some ill-defined goal of "logical data independence", i.e., to be able to modify database schemas without modifying the programs that use them, for a vanishingly small number of use cases where it could work and, at best, might make a little sense if there weren't more reasonable alternatives like scrapping the notion of update-through-views and simply modifying the programs that access the database. It's not that difficult.

Indeed, considering how to build automated software engineering tools to track when database schema changes necessitate application program changes -- and reexamining coupling between applications and database management systems in general -- would be a far more productive expenditure of time and mental effort than yet another fruitless round of debating update-through-views.

Rant, rant, rant... :-)

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on July 4, 2019, 9:28 pm
Quote from johnwcowan on July 4, 2019, 3:17 pm
Quote from Hugh on July 4, 2019, 10:48 am

I see you have had lots of responses regarding the second question (which really doesn't belong in this thread).

Characteristically, my asides and minor errors soak up all the attention, and nobody addresses my main concern: what is a known value in the context of inserting into a projection?  At most I get praised (in another thread) for asking it.

I don't know about other threads where that may have happened, but in this one -- and speaking for myself, mainly -- I suspect we're trying to avoid the topic turning into the 14th annual launch of the update-through-views debate.

;-) Yes that.

John's title is "Updating", but his first q is about INSERT.

Then John should know there's a spectrum of opinions on the forum (and its predecessors) about updating-through-view in general. I would describe Chris Date's (many) positions as off the end of the spectrum.

There would be some support for UPDATE through a projection, on grounds you could leave the values untouched in the projected-out attributes (presuming you don't thereby break some constraints).

I don't think there'd be much support for INSERT through a projection -- even with all sorts of defaulting of projected-out attributes. Because Normalisation -- i.e. 6NF. And because RM Pro 4 (aka no Nulls):

4. D shall include no concept of a “relation” in which some “tuple” includes some “attribute” that does not
have a value.

If you omit some attribute's value, that's tantamount to saying 'not applicable' or 'not known'/'not recorded'. Then let's represent that explicitly by the absence of a tuple with any such value. If it is applicable/known but usually has some specific value for new inserts, then default it on the input screen, allow the user to override. If it's always some specific value for new inserts, then it's conveying no information at all. (There might be a corollary to the Information Rule 1: everything represented explicitly by values in tables must be information.)

That is, if a tuple makes business sense without those projected-out attributes, then that tuple alone should be the base schema. If later more information comes to hand that 'extends' that tuple, then that should be in a separate relvar with a Foreign Key constraint. (And probably a timestamp or audit trail to explain when/how the info came to hand.)

Rant, rant, rant... :-)

Yeah. There'll be a lot of that.

Quote from AntC on July 4, 2019, 11:40 pm

 

;-) Yes that.

John's title is "Updating", but his first q is about INSERT.

Update in lower case refers, as the TTM book says, to UPDATE, INSERT, and DELETE collectively.  I used it as a generic title.

Then John should know there's a spectrum of opinions on the forum (and its predecessors) about updating-through-view in general. I would describe Chris Date's (many) positions as off the end of the spectrum.

Good to know.

I don't think there'd be much support for INSERT through a projection -- even with all sorts of defaulting of projected-out attributes [...] because RM Pro 4 (aka no Nulls):

I am not talking about defaulted values, which of course will "work".  I am talking about known values, known because of the structure of the underlying relation which was projected as the last step in forming the view.  The footnote I mentioned is on p. 451 of the TTM book (PDF page 462), in Date's section of Appendix E, and reads in part:

For example, consider a view defined as the projection over all but STATUS of S WHERE STATUS = 20, for which a default STATUS value of 20 clearly makes sense, vs. a view defined as the projection over all but STATUS of S WHERE STATUS ≠ 20, for which a default STATUS value of 20 clearly does not. Consider also the fact that the projection over all but Z of EXTEND A ADD (z AS Z) is identically equal to A. Further investigation is obviously needed.

This is precisely the case I have in mind, though I didn't find this footnote till after I got interested.  Here 20 is the known value of STATUS.  I argue in my OP that known values include far more than simple literals, though equality is definitely a necessity.  It rather more than "makes sense", it is true by construction.  So if all the projected-out attributes have such known values, the insertion is legitimate.  (I reject any claim that the number of insertions to be made into the underlying relvar is unknown; in the absence of a join, it is exactly one.)

(There might be a corollary to the Information Rule 1: everything represented explicitly by values in tables must be information.)

Hardly.  If an extension of a relation with attributes A and B adds attribute C whose value is A+B, then C contains no information as such, but such an extension may be useful nonetheless, particularly if the extended relation is being consumed by something that does no computation itself, such as a screen or a plain HTML form.

That is, if a tuple makes business sense without those projected-out attributes, then that tuple alone should be the base schema. If later more information comes to hand that 'extends' that tuple, then that should be in a separate relvar with a Foreign Key constraint. (And probably a timestamp or audit trail to explain when/how the info came to hand.)

Please explain further.

 

Quote from Dave Voorhis on July 4, 2019, 9:28 pm

I don't know about other threads where that may have happened, but in this one -- and speaking for myself, mainly -- I suspect we're trying to avoid the topic turning into the 14th annual launch of the update-through-views debate.

My view is that application programs should deal only with abstractions of its external connections. Then the program can be connected to different concrete entities, or those entities may be modified, without changing the code.

A relvar then is an abstraction, and might actually be a view. So of course views are updateable (but not in the way Date talks about). The challenge then is to devise a suitable abstraction mechanism so that the systems designer (but not necessarily the application programmer) can specify the nature of that abstraction and the meaning of each type of CRUD operation on that entity (or pseudo variable).

TTM hints at understanding this concept with its application and virtual relvars, but falls well short of what I have in mind.

 

Andl - A New Database Language - andl.org
Quote from dandl on July 5, 2019, 2:29 am

My view is that application programs should deal only with abstractions of its external connections. Then the program can be connected to different concrete entities, or those entities may be modified, without changing the code.

That is the case in Cobol, where the various files to which a program refers are known only by internal identifiers, and the mapping between those and actual external files is itself external to the code properly so called; it appears in the environment-division, whose job it is to track such mappings.   Cobol programs even have an internal file called working-storage that is not mapped to anything external, but it can only have one record.  Such a mapping facility could be provided in connection with TD; it is already necessary to have an external binding of the unnamed singleton database of a TD program to some external persistent object.

In the pre-Cobol business language Fact, the mapping was entirely external, and there could be any number of internal-only files.  Fact also, unlike Cobol, allows a non-fixed number of repetitions of fields and groups of fields in its files.  I'm speculating about an implementation, probably the first one since 1960.  I'm thinking that Fact's data files will be in XML, its tables in SQLite, and its output reports in CSV.