TTM without tuples
Quote from David Livingstone on July 15, 2021, 10:03 pmQuote from dandl on July 14, 2021, 12:38 amThat wasn't my question. I want to know:
- Would a formal logical model include views?
- Do you have such a model you can show us?
- How would that resolve the problem of view updating?
IMO views are a kind of query expession, expressed as a WFF, not part of the underlying model. View updating is relational assignment, in which WFFs are evaluated and assigned to one or more relvars. The task of constructing those WFFs falls to the user/developer and cannot in general be derived mechanically from the logical model. Please confirm or refute.
-----------
I don't think it's as straightforward as "confirm or refute".
I don't think the relational model ON ITS OWN needs views with UPDATING. It would be sufficient to be able to assign algebraic expressions (WFFs) to relvar names. The value of a relvar is then determined by executing its relational expression. There is NO view UPDATING here.
However the original DATABASE MODEL required UPDATING. (This has long since been de facto abandoned by SQL etc). The original purpose of DBs was to be a solution to the problem of multiple applications having a lot of data in common. The intent of the solution was to store ONE copy of ALL the data used by ALL the applications. To achieve that, the DB had to comprise a coherent set of data, a coherent set of relvars if you're using the relational model. The consequence was that for each application, you had to map data back from its DB form to the form it was required by an application; for each application using the DB to store its data. The DBMS had to perform the back-mapping.
So if you want to design your DB, and the 'DB' of each application, in terms of relvars, then the relational model has to ADDITIONALLY incorporate a valid form of back-mapping for the DBMS to implement. The purpose of views was to provide the back-mapping, or view updating as we now call it. The snag is that we do not yet have a universally-agreed form of relational back-mapping/view updating. (Hence it's de facto abandonment by SQL etc).
The only solution that I know of is for each view to ADDITIONALLY have a set of update assignments, probably one per underlying relvar, such that when the view is updated, these assignments are automatically executed by the DBMS to update the underlying DB relvars appropriately. However it would be part of the DB designer's job to provide the update assignments, because these assignments would have to take account of the DESIGNS of the DB (DB Schema) and the designs of the application 'DBs' (DB Subschemas). Hence this is a manual job, not an automated job which consequently can be incorporated into the relational model. So it's not what people ideally want, and not what Chris Date wants.
That wasn't my question. I want to know:
- Would a formal logical model include views?
- Do you have such a model you can show us?
- How would that resolve the problem of view updating?
IMO views are a kind of query expession, expressed as a WFF, not part of the underlying model. View updating is relational assignment, in which WFFs are evaluated and assigned to one or more relvars. The task of constructing those WFFs falls to the user/developer and cannot in general be derived mechanically from the logical model. Please confirm or refute.
-----------
I don't think it's as straightforward as "confirm or refute".
I don't think the relational model ON ITS OWN needs views with UPDATING. It would be sufficient to be able to assign algebraic expressions (WFFs) to relvar names. The value of a relvar is then determined by executing its relational expression. There is NO view UPDATING here.
However the original DATABASE MODEL required UPDATING. (This has long since been de facto abandoned by SQL etc). The original purpose of DBs was to be a solution to the problem of multiple applications having a lot of data in common. The intent of the solution was to store ONE copy of ALL the data used by ALL the applications. To achieve that, the DB had to comprise a coherent set of data, a coherent set of relvars if you're using the relational model. The consequence was that for each application, you had to map data back from its DB form to the form it was required by an application; for each application using the DB to store its data. The DBMS had to perform the back-mapping.
So if you want to design your DB, and the 'DB' of each application, in terms of relvars, then the relational model has to ADDITIONALLY incorporate a valid form of back-mapping for the DBMS to implement. The purpose of views was to provide the back-mapping, or view updating as we now call it. The snag is that we do not yet have a universally-agreed form of relational back-mapping/view updating. (Hence it's de facto abandonment by SQL etc).
The only solution that I know of is for each view to ADDITIONALLY have a set of update assignments, probably one per underlying relvar, such that when the view is updated, these assignments are automatically executed by the DBMS to update the underlying DB relvars appropriately. However it would be part of the DB designer's job to provide the update assignments, because these assignments would have to take account of the DESIGNS of the DB (DB Schema) and the designs of the application 'DBs' (DB Subschemas). Hence this is a manual job, not an automated job which consequently can be incorporated into the relational model. So it's not what people ideally want, and not what Chris Date wants.
Quote from dandl on July 16, 2021, 12:50 amQuote from David Livingstone on July 15, 2021, 10:03 pmQuote from dandl on July 14, 2021, 12:38 amThat wasn't my question. I want to know:
- Would a formal logical model include views?
- Do you have such a model you can show us?
- How would that resolve the problem of view updating?
IMO views are a kind of query expession, expressed as a WFF, not part of the underlying model. View updating is relational assignment, in which WFFs are evaluated and assigned to one or more relvars. The task of constructing those WFFs falls to the user/developer and cannot in general be derived mechanically from the logical model. Please confirm or refute.
-----------
I don't think it's as straightforward as "confirm or refute".
I don't think the relational model ON ITS OWN needs views with UPDATING. It would be sufficient to be able to assign algebraic expressions (WFFs) to relvar names. The value of a relvar is then determined by executing its relational expression. There is NO view UPDATING here.
However the original DATABASE MODEL required UPDATING. (This has long since been de facto abandoned by SQL etc). The original purpose of DBs was to be a solution to the problem of multiple applications having a lot of data in common. The intent of the solution was to store ONE copy of ALL the data used by ALL the applications. To achieve that, the DB had to comprise a coherent set of data, a coherent set of relvars if you're using the relational model. The consequence was that for each application, you had to map data back from its DB form to the form it was required by an application; for each application using the DB to store its data. The DBMS had to perform the back-mapping.
So if you want to design your DB, and the 'DB' of each application, in terms of relvars, then the relational model has to ADDITIONALLY incorporate a valid form of back-mapping for the DBMS to implement. The purpose of views was to provide the back-mapping, or view updating as we now call it. The snag is that we do not yet have a universally-agreed form of relational back-mapping/view updating. (Hence it's de facto abandonment by SQL etc).
The only solution that I know of is for each view to ADDITIONALLY have a set of update assignments, probably one per underlying relvar, such that when the view is updated, these assignments are automatically executed by the DBMS to update the underlying DB relvars appropriately. However it would be part of the DB designer's job to provide the update assignments, because these assignments would have to take account of the DESIGNS of the DB (DB Schema) and the designs of the application 'DBs' (DB Subschemas). Hence this is a manual job, not an automated job which consequently can be incorporated into the relational model. So it's not what people ideally want, and not what Chris Date wants.
I'm not sure I fully subscribe to this view of the history, but as to the last para, we are in furious agreement. And that means in effect that there needs to be a 'DBMS sub-language' to specify where, when and how that happens, as mentioned in another post.
Quote from David Livingstone on July 15, 2021, 10:03 pmQuote from dandl on July 14, 2021, 12:38 amThat wasn't my question. I want to know:
- Would a formal logical model include views?
- Do you have such a model you can show us?
- How would that resolve the problem of view updating?
IMO views are a kind of query expession, expressed as a WFF, not part of the underlying model. View updating is relational assignment, in which WFFs are evaluated and assigned to one or more relvars. The task of constructing those WFFs falls to the user/developer and cannot in general be derived mechanically from the logical model. Please confirm or refute.
-----------
I don't think it's as straightforward as "confirm or refute".
I don't think the relational model ON ITS OWN needs views with UPDATING. It would be sufficient to be able to assign algebraic expressions (WFFs) to relvar names. The value of a relvar is then determined by executing its relational expression. There is NO view UPDATING here.
However the original DATABASE MODEL required UPDATING. (This has long since been de facto abandoned by SQL etc). The original purpose of DBs was to be a solution to the problem of multiple applications having a lot of data in common. The intent of the solution was to store ONE copy of ALL the data used by ALL the applications. To achieve that, the DB had to comprise a coherent set of data, a coherent set of relvars if you're using the relational model. The consequence was that for each application, you had to map data back from its DB form to the form it was required by an application; for each application using the DB to store its data. The DBMS had to perform the back-mapping.
So if you want to design your DB, and the 'DB' of each application, in terms of relvars, then the relational model has to ADDITIONALLY incorporate a valid form of back-mapping for the DBMS to implement. The purpose of views was to provide the back-mapping, or view updating as we now call it. The snag is that we do not yet have a universally-agreed form of relational back-mapping/view updating. (Hence it's de facto abandonment by SQL etc).
The only solution that I know of is for each view to ADDITIONALLY have a set of update assignments, probably one per underlying relvar, such that when the view is updated, these assignments are automatically executed by the DBMS to update the underlying DB relvars appropriately. However it would be part of the DB designer's job to provide the update assignments, because these assignments would have to take account of the DESIGNS of the DB (DB Schema) and the designs of the application 'DBs' (DB Subschemas). Hence this is a manual job, not an automated job which consequently can be incorporated into the relational model. So it's not what people ideally want, and not what Chris Date wants.
I'm not sure I fully subscribe to this view of the history, but as to the last para, we are in furious agreement. And that means in effect that there needs to be a 'DBMS sub-language' to specify where, when and how that happens, as mentioned in another post.
Quote from Dave Voorhis on July 16, 2021, 10:59 amQuote from David Livingstone on July 15, 2021, 10:03 pmQuote from dandl on July 14, 2021, 12:38 amThat wasn't my question. I want to know:
- Would a formal logical model include views?
- Do you have such a model you can show us?
- How would that resolve the problem of view updating?
IMO views are a kind of query expession, expressed as a WFF, not part of the underlying model. View updating is relational assignment, in which WFFs are evaluated and assigned to one or more relvars. The task of constructing those WFFs falls to the user/developer and cannot in general be derived mechanically from the logical model. Please confirm or refute.
-----------
I don't think it's as straightforward as "confirm or refute".
I don't think the relational model ON ITS OWN needs views with UPDATING. It would be sufficient to be able to assign algebraic expressions (WFFs) to relvar names. The value of a relvar is then determined by executing its relational expression. There is NO view UPDATING here.
However the original DATABASE MODEL required UPDATING. (This has long since been de facto abandoned by SQL etc). The original purpose of DBs was to be a solution to the problem of multiple applications having a lot of data in common. The intent of the solution was to store ONE copy of ALL the data used by ALL the applications. To achieve that, the DB had to comprise a coherent set of data, a coherent set of relvars if you're using the relational model. The consequence was that for each application, you had to map data back from its DB form to the form it was required by an application; for each application using the DB to store its data. The DBMS had to perform the back-mapping.
So if you want to design your DB, and the 'DB' of each application, in terms of relvars, then the relational model has to ADDITIONALLY incorporate a valid form of back-mapping for the DBMS to implement. The purpose of views was to provide the back-mapping, or view updating as we now call it. The snag is that we do not yet have a universally-agreed form of relational back-mapping/view updating. (Hence it's de facto abandonment by SQL etc).
Having many applications sharing one database is pervasive in the enterprise world.
The original notion was that each application would see the database a different way. Say, Application A would have a table X and Application B would have a table Y, where X is actually a view involving database base tables P, Q and R and Y is a view based on tables Q, R, S and T.
Toward that end (presumably), many modern SQL implementations do support update-through-views, at least to the extent that it can be implemented in straightforward manner. See, for example, the 'Updatable Views' section of https://docs.microsoft.com/en-us/sql/t-sql/statements/create-view-transact-sql?view=sql-server-ver15
But almost nobody does it.
Aside from some negligible differences in (usually) security access which are (usually) handled through security mechanisms, different applications accessing the same database generally don't need or want different views of the shared base tables. A Customer or Invoice or Order or whatever is the same in every application. Update-through-views is almost never seen in the wild, and almost universally deprecated as being unwieldy, messy, and needlessly complex.
The only solution that I know of is for each view to ADDITIONALLY have a set of update assignments, probably one per underlying relvar, such that when the view is updated, these assignments are automatically executed by the DBMS to update the underlying DB relvars appropriately. However it would be part of the DB designer's job to provide the update assignments, because these assignments would have to take account of the DESIGNS of the DB (DB Schema) and the designs of the application 'DBs' (DB Subschemas). Hence this is a manual job, not an automated job which consequently can be incorporated into the relational model. So it's not what people ideally want, and not what Chris Date wants.
INSTEAD OF triggers (or equivalent, depending on SQL DBMS vendor) are the usual SQL mechanism for handling this on the rare occasions that it's needed. It works, and it has the benefit of being explicit -- it does not require memorisation of some byzantine implicit update-through-views rules to appreciate what the code is doing.
Quote from David Livingstone on July 15, 2021, 10:03 pmQuote from dandl on July 14, 2021, 12:38 amThat wasn't my question. I want to know:
- Would a formal logical model include views?
- Do you have such a model you can show us?
- How would that resolve the problem of view updating?
IMO views are a kind of query expession, expressed as a WFF, not part of the underlying model. View updating is relational assignment, in which WFFs are evaluated and assigned to one or more relvars. The task of constructing those WFFs falls to the user/developer and cannot in general be derived mechanically from the logical model. Please confirm or refute.
-----------
I don't think it's as straightforward as "confirm or refute".
I don't think the relational model ON ITS OWN needs views with UPDATING. It would be sufficient to be able to assign algebraic expressions (WFFs) to relvar names. The value of a relvar is then determined by executing its relational expression. There is NO view UPDATING here.
However the original DATABASE MODEL required UPDATING. (This has long since been de facto abandoned by SQL etc). The original purpose of DBs was to be a solution to the problem of multiple applications having a lot of data in common. The intent of the solution was to store ONE copy of ALL the data used by ALL the applications. To achieve that, the DB had to comprise a coherent set of data, a coherent set of relvars if you're using the relational model. The consequence was that for each application, you had to map data back from its DB form to the form it was required by an application; for each application using the DB to store its data. The DBMS had to perform the back-mapping.
So if you want to design your DB, and the 'DB' of each application, in terms of relvars, then the relational model has to ADDITIONALLY incorporate a valid form of back-mapping for the DBMS to implement. The purpose of views was to provide the back-mapping, or view updating as we now call it. The snag is that we do not yet have a universally-agreed form of relational back-mapping/view updating. (Hence it's de facto abandonment by SQL etc).
Having many applications sharing one database is pervasive in the enterprise world.
The original notion was that each application would see the database a different way. Say, Application A would have a table X and Application B would have a table Y, where X is actually a view involving database base tables P, Q and R and Y is a view based on tables Q, R, S and T.
Toward that end (presumably), many modern SQL implementations do support update-through-views, at least to the extent that it can be implemented in straightforward manner. See, for example, the 'Updatable Views' section of https://docs.microsoft.com/en-us/sql/t-sql/statements/create-view-transact-sql?view=sql-server-ver15
But almost nobody does it.
Aside from some negligible differences in (usually) security access which are (usually) handled through security mechanisms, different applications accessing the same database generally don't need or want different views of the shared base tables. A Customer or Invoice or Order or whatever is the same in every application. Update-through-views is almost never seen in the wild, and almost universally deprecated as being unwieldy, messy, and needlessly complex.
The only solution that I know of is for each view to ADDITIONALLY have a set of update assignments, probably one per underlying relvar, such that when the view is updated, these assignments are automatically executed by the DBMS to update the underlying DB relvars appropriately. However it would be part of the DB designer's job to provide the update assignments, because these assignments would have to take account of the DESIGNS of the DB (DB Schema) and the designs of the application 'DBs' (DB Subschemas). Hence this is a manual job, not an automated job which consequently can be incorporated into the relational model. So it's not what people ideally want, and not what Chris Date wants.
INSTEAD OF triggers (or equivalent, depending on SQL DBMS vendor) are the usual SQL mechanism for handling this on the rare occasions that it's needed. It works, and it has the benefit of being explicit -- it does not require memorisation of some byzantine implicit update-through-views rules to appreciate what the code is doing.
Quote from David Livingstone on July 18, 2021, 4:03 pmAnother 2 days have elapsed. Has at least a broad consensus on 'view updating' arrived ?
I wouldn't presume to comment on a 'DBMS sub-language', if only because I would've thought it would depend somewhat on the character of the language underlying it.
INSTEAD OF triggers certainly seem to provide a mechanism to achieve what I had in mind. Again I wouldn't presume to comment on INSTEAD OF triggers (or just triggers) more generally.
Would you say, Dave, that "Having many applications sharing one database is pervasive in the enterprise world" (quote) is due, at least to a reasonable degree, on the availability of INSTEAD OF triggers ? I ask because my impression was that in the (more distant) past there were a lot of 'data silos' where that hadn't happened. Perhaps 'INSTEAD OF triggers' have solved the problem ?
Another 2 days have elapsed. Has at least a broad consensus on 'view updating' arrived ?
I wouldn't presume to comment on a 'DBMS sub-language', if only because I would've thought it would depend somewhat on the character of the language underlying it.
INSTEAD OF triggers certainly seem to provide a mechanism to achieve what I had in mind. Again I wouldn't presume to comment on INSTEAD OF triggers (or just triggers) more generally.
Would you say, Dave, that "Having many applications sharing one database is pervasive in the enterprise world" (quote) is due, at least to a reasonable degree, on the availability of INSTEAD OF triggers ? I ask because my impression was that in the (more distant) past there were a lot of 'data silos' where that hadn't happened. Perhaps 'INSTEAD OF triggers' have solved the problem ?
Quote from Dave Voorhis on July 18, 2021, 9:07 pmQuote from David Livingstone on July 18, 2021, 4:03 pmAnother 2 days have elapsed. Has at least a broad consensus on 'view updating' arrived ?
I wouldn't presume to comment on a 'DBMS sub-language', if only because I would've thought it would depend somewhat on the character of the language underlying it.
INSTEAD OF triggers certainly seem to provide a mechanism to achieve what I had in mind. Again I wouldn't presume to comment on INSTEAD OF triggers (or just triggers) more generally.
Would you say, Dave, that "Having many applications sharing one database is pervasive in the enterprise world" (quote) is due, at least to a reasonable degree, on the availability of INSTEAD OF triggers ?No. I can count on one hand the number of times I've used INSTEAD OF (or some equivalent) to do anything remotely like update-through-views, and I can count on the other hand the number of times I've seen someone else use it.
That's quite different from triggers in general, which in some environments are used extensively for various purposes but nothing like update-through-views.
I doubt any of the uses of INSTEAD OF that I've used (or seen used) really warranted it, and I imagine that had the code been subjected to a group code review it would have been shot down. (Quite rightfully, I might add.)
Having many applications share a database is simply the nature of many enterprises, which inevitably generate numerous applications that all do various somethings with core data.
I ask because my impression was that in the (more distant) past there were a lot of 'data silos' where that hadn't happened. Perhaps 'INSTEAD OF triggers' have solved the problem ?
Often as not, the data is in silos and shuttled from silo to silo and system to system (and thus database to database) by various means. But each silo's database may have dozens -- maybe many dozens -- of applications using it.
There's nothing that would conceptually prevent storing Customers, or Invoices, or Staff, or Policies, or Sales, or Estimates (and/or whatever core entity data the organisation cares about) all in one enterprise database (setting aside technical issues) and sharing that with all the applications. But that doesn't happen for reasons that are rarely technical, conceptual, or even (as one might suspect) political. It's just what organically emerges when big organisations have a lot of people and a lot of projects all trying to get things done at once (i.e., typical big organisation mayhem) plus the external influence of an IT industry that goes through regular periodic phases of centralisation (or at least technical cohesion) being encouraged (avoid silos!) or discouraged (use microservices!)
Quote from David Livingstone on July 18, 2021, 4:03 pmAnother 2 days have elapsed. Has at least a broad consensus on 'view updating' arrived ?
I wouldn't presume to comment on a 'DBMS sub-language', if only because I would've thought it would depend somewhat on the character of the language underlying it.
INSTEAD OF triggers certainly seem to provide a mechanism to achieve what I had in mind. Again I wouldn't presume to comment on INSTEAD OF triggers (or just triggers) more generally.
Would you say, Dave, that "Having many applications sharing one database is pervasive in the enterprise world" (quote) is due, at least to a reasonable degree, on the availability of INSTEAD OF triggers ?
No. I can count on one hand the number of times I've used INSTEAD OF (or some equivalent) to do anything remotely like update-through-views, and I can count on the other hand the number of times I've seen someone else use it.
That's quite different from triggers in general, which in some environments are used extensively for various purposes but nothing like update-through-views.
I doubt any of the uses of INSTEAD OF that I've used (or seen used) really warranted it, and I imagine that had the code been subjected to a group code review it would have been shot down. (Quite rightfully, I might add.)
Having many applications share a database is simply the nature of many enterprises, which inevitably generate numerous applications that all do various somethings with core data.
I ask because my impression was that in the (more distant) past there were a lot of 'data silos' where that hadn't happened. Perhaps 'INSTEAD OF triggers' have solved the problem ?
Often as not, the data is in silos and shuttled from silo to silo and system to system (and thus database to database) by various means. But each silo's database may have dozens -- maybe many dozens -- of applications using it.
There's nothing that would conceptually prevent storing Customers, or Invoices, or Staff, or Policies, or Sales, or Estimates (and/or whatever core entity data the organisation cares about) all in one enterprise database (setting aside technical issues) and sharing that with all the applications. But that doesn't happen for reasons that are rarely technical, conceptual, or even (as one might suspect) political. It's just what organically emerges when big organisations have a lot of people and a lot of projects all trying to get things done at once (i.e., typical big organisation mayhem) plus the external influence of an IT industry that goes through regular periodic phases of centralisation (or at least technical cohesion) being encouraged (avoid silos!) or discouraged (use microservices!)
Quote from p c on July 18, 2021, 10:45 pmQuote from David Livingstone on July 15, 2021, 10:03 pmQuote from dandl on July 14, 2021, 12:38 amThat wasn't my question. I want to know:
- Would a formal logical model include views?
- Do you have such a model you can show us?
- How would that resolve the problem of view updating?
IMO views are a kind of query expession, expressed as a WFF, not part of the underlying model. View updating is relational assignment, in which WFFs are evaluated and assigned to one or more relvars. The task of constructing those WFFs falls to the user/developer and cannot in general be derived mechanically from the logical model. Please confirm or refute.
-----------
I don't think it's as straightforward as "confirm or refute".
I don't think the relational model ON ITS OWN needs views with UPDATING. It would be sufficient to be able to assign algebraic expressions (WFFs) to relvar names. The value of a relvar is then determined by executing its relational expression. There is NO view UPDATING here.
However the original DATABASE MODEL required UPDATING. (This has long since been de facto abandoned by SQL etc). The original purpose of DBs was to be a solution to the problem of multiple applications having a lot of data in common. The intent of the solution was to store ONE copy of ALL the data used by ALL the applications. To achieve that, the DB had to comprise a coherent set of data, a coherent set of relvars if you're using the relational model. The consequence was that for each application, you had to map data back from its DB form to the form it was required by an application; for each application using the DB to store its data. The DBMS had to perform the back-mapping.
So if you want to design your DB, and the 'DB' of each application, in terms of relvars, then the relational model has to ADDITIONALLY incorporate a valid form of back-mapping for the DBMS to implement. The purpose of views was to provide the back-mapping, or view updating as we now call it. The snag is that we do not yet have a universally-agreed form of relational back-mapping/view updating. (Hence it's de facto abandonment by SQL etc).
The only solution that I know of is for each view to ADDITIONALLY have a set of update assignments, probably one per underlying relvar, such that when the view is updated, these assignments are automatically executed by the DBMS to update the underlying DB relvars appropriately. However it would be part of the DB designer's job to provide the update assignments, because these assignments would have to take account of the DESIGNS of the DB (DB Schema) and the designs of the application 'DBs' (DB Subschemas). Hence this is a manual job, not an automated job which consequently can be incorporated into the relational model. So it's not what people ideally want, and not what Chris Date wants.
It says here that there will never be such a broad consensus and also that such a broad consensus is not necessary. For one thing, it's rarely clear just exactly what situation people are talking about when they use the topic "the relational model". They might be talking about a language that implements a finite relational data model or they might be talking about a Turing language, two very different and distinct things. A big problem with such discussions is almost always that they are implementation driven, what is possible is often not decided by what is needed or intended but rather by what techniques different individual people happen to know so implementation drives or dictates purpose rather than the other way around. One way to think about what all the differences are is to revisit Codd's original idea.
Sometime in the later years, I forget exactly where, he wrote that his consistent purpose had been the semantics of information communication. In 1970 The landmark paper was titled "A relational model of data…", not "The relational model of data…". By '71 or '72 he was writing "The relational model". Anybody who didn't read the 1970 paper might conclude that there is only one relational model.
Given his stated purpose, why wasn't the first paper titled "a relational model of information"? Like a number of other inventive thinkers of his generation such as Mcluhan and DeBono he was and still is widely misunderstood. He hardly ever wrote about language except for his tuple calculus language, usually only the semantics of languages. One of the rare times he ever prescribed syntax was in the early completeness paper when he defined union compatibility and relative complement, only the former of which concepts survived in SQL languages and later efforts such as D languages.
Of course none of those can be characterized as a distinct data language. His way of prescribing data language in general was to determine its semantics according to a data model consisting of logical but implementable structures, operations on and within themand constraints on both of the structures and the operations with the understanding that all three elements had both logical and implementable interpretations. This is why implementers usually have trouble with his papers, they often don't understand that he was only using the data model as a way to constrain possible implementations rather than as a way to define those implementations.
That is also why as far as I know he didn't ever try to define an information model. The point was not to define information but rather to constrain an information model according to a data model. Like the other original thinkers above he wasn't trying to define thought or electronic mediums but rather just how they affect human activities. When he mentioned "an applied predicate calculus" he was talking about a limited application, not a general application.
A productive way to approach view updating is to not try to generalize too quickly. It is more productive to start with something that is precisely defined in both logical and language terms such as relative complement and see how many situations it can be applied to. For example it appears that with a suitable definition of absolute, constrained absolute and relative complements, complement negation logically entails relation value insertion. The actualization of this is much simpler than entertaining the ambiguities of disjunctive assertions.
Impressive as Appendix A is, it is a little surprising that it focuses on eliminating operations rather than starting with a minimum number and then synthesizing ones that are specifically useful not only for logical purposes that also imply procedural implementations that are considered desirable, for example when the predicate and propositional entailment property is understood as material implication, ie, the symbolic expression A→B is logically equivalent to ¬(A∧¬B), in other words can be defined in terms only of negation and conjunction. But in isolation it doesn't enforce the relative complement limitation, ie, that overall data model constraint. It is also logically equivalent to (B∨¬A), which is enforceable in isolation provided disjunction is understood as set union. So it would seem that in addition to unions, negations and join relation characterizations, a logically valid kind of relation is also entailment.
The obvious motivation for relative complements was to enforce another requirement of the original data model was finite data, not infinite. Its syntax prescribed that a conjunction operator must always precede a negation operator. This can be seen as guaranteeing that such a relational data model allows instances of the same schema definitions that have been populated with different relation values to be logically complete and consistent with respect to each other. It would seem to be almost a universal requirement of database users that the schema design be applicable to data changes over time.
Appendix A and TTM also give a relation definition that is not directly transferable to classical logic terms because includes headings as well as values. TTM headings are very much part of the original requirement of content addressability. In this approach a database relation always has a heading and it remains a relation even when its value is an empty set. So care must be taken when transferring the semantics of logical expressions to procedural implementations. For example, formal constraint representation needs to include variables or propositions that represent headings and or attributes.
Leaving headings and union compatibility aside for the moment it is interesting to combine entailment relations with relative complement. A general relational data model constraint should probably be this combination of entailment relations:
(A∧B∧C)→¬(A∧B)→(A∧B∧¬C).
Actually, it doesn't really need to be stated except to help the reader's comprehension because it is a tautology. (Actually, understood as operators the two implication symbols could be replaced with equivalent symbols and the expression would still be a tautology. Note that only middle parenthesized expression needs parentheses when the usual operator precedence and associativity conventions are used. The other parentheses are merely for human readability.)
That tautological model constraint has multiple procedural interpretations assuming the names A and B and possibly C represent relation values. It can be read from left to right such that the first parenthesized expression represents a situation, the second represents an intention and the conjunctions of the third represent a logical conclusion and are applied as a sequence of Appendix A joins. It is a way for purpose to drive implementation.
When things are approached this way where a situation and purpose are formally and provably defined, your question about general agreement becomes moot, at least for it the general problem of join deletion. Since such expressions can be mechanically and efficiently proved they have far more general utility then ordinary English textual documentation no matter how formal somebody calls that text. For example being a tautology it has the advantage of making explicit just exactly which classical tautologies apply and one would think every time a database instance is transported to a different machine that instance would include a copy of its catalog which would in turn contained the tautologies it depends on which is one obvious requirement of information communication in general.
It is a mystery why whenever join deletion is brought up somebody is likely to ask well what about all views? Or, don't projections, aggregates etc need to be deletable too plus need to be updateable as well? For example, in what I presume to be an attempt at extreme generality the appendix A "OR" disjunction operator it seems to be often understood as something that a data language must directly support in recordable form. Yet when end users think about disjunction perhaps not by using that word literally it is usually obvious that they are thinking about either hypothetical situations or a manipulation of data that has already been recorded in conjunctive form. Since conjunction always logically implies or entails disjunction but not the reverse, it would seem completely unnecessary for what Codd called noninferential data systems to materialize it in any way. When the central theme of the 1970 paper as expressed by the very first sentence is followed doing so becomes simply a matter of the internal data language interface providing recorded data in the form of relations such as defined by TTM to a distinct component or language that doesn't necessarily have the logical completeness and consistency properties such as a Turing language. At that level disjunctive results based on disjunctions of relations rather than disjunctive relations can be provided to users without any logical complications because what the internal noninferential language interface will accept for recording is only non- disjunctive relations.
(Several times Codd publicly acknowledged a 1988 paper HW Buff which advocated a reformulation of his view updateability rule. It is actually a very short paper and uses an example of disjunction insertion and also cites proofs involving mathematical group theory. With the approach above it seems to be moot. However in his 1990 book Codd does appear to disallow disjunction insertion. It's not quite clear whether he says that simply because he thought his sub language good logically be used to implement such or whether he was just trying to emphasize that it shouldn't be used that way. But otherwise the book was consistent with the earlier assumption that joins are always deletable insertable and updatable.)
It's also not clear why users would ever want to insert extensions of proper projections of relations. One or more relations can logically entail a given projection value can't logically entail every unrecorded projection of a relation unless there are such constraints on the target relation that the values are inferable by existing data and thus don't need to be recorded in the first place. The situation with aggregates seems similar.
If the logical validity of join updates is accepted then it becomes pretty clear that from the data model point of view there is no real reason to distinguish base or real relations and virtual relations or views. They all become the same thing.
In a previous post you mentioned "back-mapping". Codd mentioned an idea he called backtracing. Thinking in terms of logical equivalents in other words conjunctions of entailments including entailments that entail other entailments recursing as far as necessary seems to be just as useful an approach as the above brief description and perhaps potentially more so.
Quote from David Livingstone on July 15, 2021, 10:03 pmQuote from dandl on July 14, 2021, 12:38 amThat wasn't my question. I want to know:
- Would a formal logical model include views?
- Do you have such a model you can show us?
- How would that resolve the problem of view updating?
IMO views are a kind of query expession, expressed as a WFF, not part of the underlying model. View updating is relational assignment, in which WFFs are evaluated and assigned to one or more relvars. The task of constructing those WFFs falls to the user/developer and cannot in general be derived mechanically from the logical model. Please confirm or refute.
-----------
I don't think it's as straightforward as "confirm or refute".
I don't think the relational model ON ITS OWN needs views with UPDATING. It would be sufficient to be able to assign algebraic expressions (WFFs) to relvar names. The value of a relvar is then determined by executing its relational expression. There is NO view UPDATING here.
However the original DATABASE MODEL required UPDATING. (This has long since been de facto abandoned by SQL etc). The original purpose of DBs was to be a solution to the problem of multiple applications having a lot of data in common. The intent of the solution was to store ONE copy of ALL the data used by ALL the applications. To achieve that, the DB had to comprise a coherent set of data, a coherent set of relvars if you're using the relational model. The consequence was that for each application, you had to map data back from its DB form to the form it was required by an application; for each application using the DB to store its data. The DBMS had to perform the back-mapping.
So if you want to design your DB, and the 'DB' of each application, in terms of relvars, then the relational model has to ADDITIONALLY incorporate a valid form of back-mapping for the DBMS to implement. The purpose of views was to provide the back-mapping, or view updating as we now call it. The snag is that we do not yet have a universally-agreed form of relational back-mapping/view updating. (Hence it's de facto abandonment by SQL etc).
The only solution that I know of is for each view to ADDITIONALLY have a set of update assignments, probably one per underlying relvar, such that when the view is updated, these assignments are automatically executed by the DBMS to update the underlying DB relvars appropriately. However it would be part of the DB designer's job to provide the update assignments, because these assignments would have to take account of the DESIGNS of the DB (DB Schema) and the designs of the application 'DBs' (DB Subschemas). Hence this is a manual job, not an automated job which consequently can be incorporated into the relational model. So it's not what people ideally want, and not what Chris Date wants.
It says here that there will never be such a broad consensus and also that such a broad consensus is not necessary. For one thing, it's rarely clear just exactly what situation people are talking about when they use the topic "the relational model". They might be talking about a language that implements a finite relational data model or they might be talking about a Turing language, two very different and distinct things. A big problem with such discussions is almost always that they are implementation driven, what is possible is often not decided by what is needed or intended but rather by what techniques different individual people happen to know so implementation drives or dictates purpose rather than the other way around. One way to think about what all the differences are is to revisit Codd's original idea.
Sometime in the later years, I forget exactly where, he wrote that his consistent purpose had been the semantics of information communication. In 1970 The landmark paper was titled "A relational model of data…", not "The relational model of data…". By '71 or '72 he was writing "The relational model". Anybody who didn't read the 1970 paper might conclude that there is only one relational model.
Given his stated purpose, why wasn't the first paper titled "a relational model of information"? Like a number of other inventive thinkers of his generation such as Mcluhan and DeBono he was and still is widely misunderstood. He hardly ever wrote about language except for his tuple calculus language, usually only the semantics of languages. One of the rare times he ever prescribed syntax was in the early completeness paper when he defined union compatibility and relative complement, only the former of which concepts survived in SQL languages and later efforts such as D languages.
Of course none of those can be characterized as a distinct data language. His way of prescribing data language in general was to determine its semantics according to a data model consisting of logical but implementable structures, operations on and within themand constraints on both of the structures and the operations with the understanding that all three elements had both logical and implementable interpretations. This is why implementers usually have trouble with his papers, they often don't understand that he was only using the data model as a way to constrain possible implementations rather than as a way to define those implementations.
That is also why as far as I know he didn't ever try to define an information model. The point was not to define information but rather to constrain an information model according to a data model. Like the other original thinkers above he wasn't trying to define thought or electronic mediums but rather just how they affect human activities. When he mentioned "an applied predicate calculus" he was talking about a limited application, not a general application.
A productive way to approach view updating is to not try to generalize too quickly. It is more productive to start with something that is precisely defined in both logical and language terms such as relative complement and see how many situations it can be applied to. For example it appears that with a suitable definition of absolute, constrained absolute and relative complements, complement negation logically entails relation value insertion. The actualization of this is much simpler than entertaining the ambiguities of disjunctive assertions.
Impressive as Appendix A is, it is a little surprising that it focuses on eliminating operations rather than starting with a minimum number and then synthesizing ones that are specifically useful not only for logical purposes that also imply procedural implementations that are considered desirable, for example when the predicate and propositional entailment property is understood as material implication, ie, the symbolic expression A→B is logically equivalent to ¬(A∧¬B), in other words can be defined in terms only of negation and conjunction. But in isolation it doesn't enforce the relative complement limitation, ie, that overall data model constraint. It is also logically equivalent to (B∨¬A), which is enforceable in isolation provided disjunction is understood as set union. So it would seem that in addition to unions, negations and join relation characterizations, a logically valid kind of relation is also entailment.
The obvious motivation for relative complements was to enforce another requirement of the original data model was finite data, not infinite. Its syntax prescribed that a conjunction operator must always precede a negation operator. This can be seen as guaranteeing that such a relational data model allows instances of the same schema definitions that have been populated with different relation values to be logically complete and consistent with respect to each other. It would seem to be almost a universal requirement of database users that the schema design be applicable to data changes over time.
Appendix A and TTM also give a relation definition that is not directly transferable to classical logic terms because includes headings as well as values. TTM headings are very much part of the original requirement of content addressability. In this approach a database relation always has a heading and it remains a relation even when its value is an empty set. So care must be taken when transferring the semantics of logical expressions to procedural implementations. For example, formal constraint representation needs to include variables or propositions that represent headings and or attributes.
Leaving headings and union compatibility aside for the moment it is interesting to combine entailment relations with relative complement. A general relational data model constraint should probably be this combination of entailment relations:
(A∧B∧C)→¬(A∧B)→(A∧B∧¬C).
Actually, it doesn't really need to be stated except to help the reader's comprehension because it is a tautology. (Actually, understood as operators the two implication symbols could be replaced with equivalent symbols and the expression would still be a tautology. Note that only middle parenthesized expression needs parentheses when the usual operator precedence and associativity conventions are used. The other parentheses are merely for human readability.)
That tautological model constraint has multiple procedural interpretations assuming the names A and B and possibly C represent relation values. It can be read from left to right such that the first parenthesized expression represents a situation, the second represents an intention and the conjunctions of the third represent a logical conclusion and are applied as a sequence of Appendix A joins. It is a way for purpose to drive implementation.
When things are approached this way where a situation and purpose are formally and provably defined, your question about general agreement becomes moot, at least for it the general problem of join deletion. Since such expressions can be mechanically and efficiently proved they have far more general utility then ordinary English textual documentation no matter how formal somebody calls that text. For example being a tautology it has the advantage of making explicit just exactly which classical tautologies apply and one would think every time a database instance is transported to a different machine that instance would include a copy of its catalog which would in turn contained the tautologies it depends on which is one obvious requirement of information communication in general.
It is a mystery why whenever join deletion is brought up somebody is likely to ask well what about all views? Or, don't projections, aggregates etc need to be deletable too plus need to be updateable as well? For example, in what I presume to be an attempt at extreme generality the appendix A "OR" disjunction operator it seems to be often understood as something that a data language must directly support in recordable form. Yet when end users think about disjunction perhaps not by using that word literally it is usually obvious that they are thinking about either hypothetical situations or a manipulation of data that has already been recorded in conjunctive form. Since conjunction always logically implies or entails disjunction but not the reverse, it would seem completely unnecessary for what Codd called noninferential data systems to materialize it in any way. When the central theme of the 1970 paper as expressed by the very first sentence is followed doing so becomes simply a matter of the internal data language interface providing recorded data in the form of relations such as defined by TTM to a distinct component or language that doesn't necessarily have the logical completeness and consistency properties such as a Turing language. At that level disjunctive results based on disjunctions of relations rather than disjunctive relations can be provided to users without any logical complications because what the internal noninferential language interface will accept for recording is only non- disjunctive relations.
(Several times Codd publicly acknowledged a 1988 paper HW Buff which advocated a reformulation of his view updateability rule. It is actually a very short paper and uses an example of disjunction insertion and also cites proofs involving mathematical group theory. With the approach above it seems to be moot. However in his 1990 book Codd does appear to disallow disjunction insertion. It's not quite clear whether he says that simply because he thought his sub language good logically be used to implement such or whether he was just trying to emphasize that it shouldn't be used that way. But otherwise the book was consistent with the earlier assumption that joins are always deletable insertable and updatable.)
It's also not clear why users would ever want to insert extensions of proper projections of relations. One or more relations can logically entail a given projection value can't logically entail every unrecorded projection of a relation unless there are such constraints on the target relation that the values are inferable by existing data and thus don't need to be recorded in the first place. The situation with aggregates seems similar.
If the logical validity of join updates is accepted then it becomes pretty clear that from the data model point of view there is no real reason to distinguish base or real relations and virtual relations or views. They all become the same thing.
In a previous post you mentioned "back-mapping". Codd mentioned an idea he called backtracing. Thinking in terms of logical equivalents in other words conjunctions of entailments including entailments that entail other entailments recursing as far as necessary seems to be just as useful an approach as the above brief description and perhaps potentially more so.
Quote from p c on July 19, 2021, 12:47 pmQuote from David Livingstone on July 15, 2021, 10:03 pmQuote from dandl on July 14, 2021, 12:38 amThat wasn't my question. I want to know:
- Would a formal logical model include views?
- Do you have such a model you can show us?
- How would that resolve the problem of view updating?
IMO views are a kind of query expession, expressed as a WFF, not part of the underlying model. View updating is relational assignment, in which WFFs are evaluated and assigned to one or more relvars. The task of constructing those WFFs falls to the user/developer and cannot in general be derived mechanically from the logical model. Please confirm or refute.
-----------
"...IMO views are a kind of query expession, expressed as a WFF, not part of the underlying model. …"
According to Codd his relational data model entails the validity of certain WFFs. Two of the most elementary relational queries are ( R ) and ( NOT R ). They are also WFFs.
If the quote is correct it must mean that the predicate "not part of" entails the predicate "does not entail".
So if all that is the case, what does entail the eligibility of those WFFs? Implementation? Or something other than a relational data model?
Quote from David Livingstone on July 15, 2021, 10:03 pmQuote from dandl on July 14, 2021, 12:38 amThat wasn't my question. I want to know:
- Would a formal logical model include views?
- Do you have such a model you can show us?
- How would that resolve the problem of view updating?
IMO views are a kind of query expession, expressed as a WFF, not part of the underlying model. View updating is relational assignment, in which WFFs are evaluated and assigned to one or more relvars. The task of constructing those WFFs falls to the user/developer and cannot in general be derived mechanically from the logical model. Please confirm or refute.
-----------
"...IMO views are a kind of query expession, expressed as a WFF, not part of the underlying model. …"
According to Codd his relational data model entails the validity of certain WFFs. Two of the most elementary relational queries are ( R ) and ( NOT R ). They are also WFFs.
If the quote is correct it must mean that the predicate "not part of" entails the predicate "does not entail".
So if all that is the case, what does entail the eligibility of those WFFs? Implementation? Or something other than a relational data model?
Quote from p c on July 19, 2021, 1:23 pmQuote from David Livingstone on July 18, 2021, 4:03 pmAnother 2 days have elapsed. Has at least a broad consensus on 'view updating' arrived ?
... "I wouldn't presume to comment on a 'DBMS sub-language', if only because I would've thought it would depend somewhat on the character of the language underlying it. ..."
"... I wouldn't presume to comment on a 'DBMS sub-language', if only because I would've thought it would depend somewhat on the character of the language underlying it. …"
In other words the model depends on the language implementation, so implementation determines the model.
But the purpose of "a relational model of data" was to determine (eg., entail) the implementation of (noninferential relational data) languages.
Quote from David Livingstone on July 18, 2021, 4:03 pmAnother 2 days have elapsed. Has at least a broad consensus on 'view updating' arrived ?
... "I wouldn't presume to comment on a 'DBMS sub-language', if only because I would've thought it would depend somewhat on the character of the language underlying it. ..."
"... I wouldn't presume to comment on a 'DBMS sub-language', if only because I would've thought it would depend somewhat on the character of the language underlying it. …"
In other words the model depends on the language implementation, so implementation determines the model.
But the purpose of "a relational model of data" was to determine (eg., entail) the implementation of (noninferential relational data) languages.
Quote from dandl on July 19, 2021, 1:35 pmI have clear in my mind that there are 3 distinct things here:
- The relational model or RM, a way of storing facts, with features including relation, heading, attribute, tuple, predicate, domain.
- The relational algeba or RA (and related calculus), with features including operators and other symbols, such that WFFs can be written to satisfy a wide range of queries on the RM as new relation values.
- A programming language, with features including a type system, variables, control flow and relational assignment, such that programs can be written to manipulate data expressed as an RM.
These distinctions are clear to me and very clear in Codd 1972, but not always so clear in TTM.
IMO the RM has no operators and thus no WFFs and thus no views. To answer your question, it is the algebra that determines which strings of symbols are valid WFFs, but something else is needed to determine whether a particular WFF is a view. I would be interested in your thoughts on what that something might be.
I have clear in my mind that there are 3 distinct things here:
- The relational model or RM, a way of storing facts, with features including relation, heading, attribute, tuple, predicate, domain.
- The relational algeba or RA (and related calculus), with features including operators and other symbols, such that WFFs can be written to satisfy a wide range of queries on the RM as new relation values.
- A programming language, with features including a type system, variables, control flow and relational assignment, such that programs can be written to manipulate data expressed as an RM.
These distinctions are clear to me and very clear in Codd 1972, but not always so clear in TTM.
IMO the RM has no operators and thus no WFFs and thus no views. To answer your question, it is the algebra that determines which strings of symbols are valid WFFs, but something else is needed to determine whether a particular WFF is a view. I would be interested in your thoughts on what that something might be.
Quote from David Livingstone on July 19, 2021, 6:14 pmQuote from p c on July 19, 2021, 1:23 pmIn other words the model depends on the language implementation, so implementation determines the model.
What I had in mind was different language 'styles', which TTM explicitly permits. See https://www.dcs.warwick.ac.uk/~hugh/TTM/ (under Projects) for details. The creators of TTM explicitly did not want to specify that TTM must be implemented as an imperative language / functional language / OO language, ... (add your own preferance in here) ... .
I think of the different language 'styles' representing the same abstract relational model in different ways, with appropriate concrete syntax, according to the creator's preferances. In practice, they may not all implement absolutely the same relational model - we're all human - but I suspect there is greater variation in how much of the relational model is currently implemented.
Quote from p c on July 19, 2021, 1:23 pmIn other words the model depends on the language implementation, so implementation determines the model.
What I had in mind was different language 'styles', which TTM explicitly permits. See https://www.dcs.warwick.ac.uk/~hugh/TTM/ (under Projects) for details. The creators of TTM explicitly did not want to specify that TTM must be implemented as an imperative language / functional language / OO language, ... (add your own preferance in here) ... .
I think of the different language 'styles' representing the same abstract relational model in different ways, with appropriate concrete syntax, according to the creator's preferances. In practice, they may not all implement absolutely the same relational model - we're all human - but I suspect there is greater variation in how much of the relational model is currently implemented.