The Forum for Discussion about The Third Manifesto and Related Matters


Accessing database management systems from application programs.

[This deserves its own thread...]

Quote from dandl on July 6, 2019, 1:35 am
Quote from Dave Voorhis on July 5, 2019, 7:53 am
Quote from Erwin on July 5, 2019, 6:55 am
Quote from Dave Voorhis on July 4, 2019, 9:28 pm

Nothing, other than meeting some ill-defined goal of "logical data independence", i.e., to be able to modify database schemas without modifying the programs that use them, for a vanishingly small number of use cases where it could work and, at best, might make a little sense if there weren't more reasonable alternatives like scrapping the notion of update-through-views and simply modifying the programs that access the database. It's not that difficult.

Indeed, considering how to build automated software engineering tools to track when database schema changes necessitate application program changes -- and reexamining coupling between applications and database management systems in general -- would be a far more productive expenditure of time and mental effort than yet another fruitless round of debating update-through-views.

I'm all for it. Where do we start? What would the "considering" and "reexamining" even look like?

I don't know. Discussion, I guess?

I observe that relatively successful logical data independence is typically achieved with a change in abstraction, such as wrapping database access (i.e., queries against relvars/tables) in a procedural interface (e.g., stored procedures), or wrapping one API in another, conceptually different API.

Achieving logical data independence appears (at least so far) to be generally unworkable when trying to retain the same level of abstraction, such as wrapping base relvars in virtual relvars.

I've suggested before (as have others) that we'd gain more benefit -- compared to update-through-views -- from always wrapping database updates and shared business logic in stored procedures / operators, and providing automated mechanisms to expose those procedures / operators (and only those procedures / operators; no query-level database updates, only read-only ad hoc queries for reporting) to application programs, along with automated mechanisms to identify breaking API changes.

Of course, this means updates must be manually defined (same for PostgreSQL's RULEs or SQL Server's INSTEAD OF triggers, before anyone mentions those) but until there is some sensible general and agreed-upon specification of update-through-views -- and not, as it is currently, an endless parade of exceptions and contradictions -- it's the best we've got...

...unless there is some other shift in abstraction level that we haven't yet considered.

I'm a big fan of designing a procedural API for use by the application programmers, as long as they factor properly. You need queries that take stream arguments and return stream values, which can be combined by RA operators, to give query optimisation a chance to work. Ditto updates.

LINQ to SQL is not perfect, but it gives an idea of what is possible.
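A rough sketch of such stream-in/stream-out operators, composable in the spirit of the relational algebra so an optimiser could in principle rearrange the pipeline. Operator names here (select_, project, pipe) are made up for illustration:

```python
def select_(pred):
    """Restriction: stream in, stream out."""
    def op(rows):
        return (r for r in rows if pred(r))
    return op

def project(attrs):
    """Projection with duplicate elimination: stream in, stream out."""
    def op(rows):
        seen = set()
        for r in rows:
            t = tuple((a, r[a]) for a in attrs)
            if t not in seen:
                seen.add(t)
                yield dict(t)
    return op

def pipe(rows, *ops):
    """Compose operators left to right over a stream of tuples."""
    for op in ops:
        rows = op(rows)
    return rows

emp = [{"name": "Ann", "dept": "IT"}, {"name": "Bob", "dept": "HR"},
       {"name": "Cal", "dept": "IT"}]
result = list(pipe(emp, select_(lambda r: r["dept"] == "IT"),
                   project(["dept"])))
print(result)  # -> [{'dept': 'IT'}]
```

Because each operator is lazy, nothing is materialised until the stream is consumed -- which is what leaves room for an optimiser to reorder or fuse steps.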

A typical "modern" database-driven application architecture uses a set of stored procedures in the database to encapsulate database updates and shared business logic. Those stored procedures are exposed, typically via a RESTful interface or some other remote procedure call (RPC) mechanism. Application programs update the database only by calling the stored procedures, though they may also issue read-only queries (via constructed strings, etc.) for reporting, UI displays, and the like. For these, LINQ to SQL and other facilities for constructing database-language queries on the client side are fine; often better than constructing database-language queries as client-side strings, though I wouldn't want to give up that facility.

This approach generally avoids query language injection and provides some degree of logical data independence.

What is helpful is to have the RESTful interface or other RPC mechanism -- and associated client-side method/procedure stubs -- generated automatically by the DBMS itself or associated tools, with changes tracked by the client-side so that breaking schema and/or stored procedure changes can be easily identified.
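One hypothetical way to identify breaking changes: snapshot the exposed procedure catalogue at each version and diff the signatures. The snapshot format below is invented for illustration; a real tool would read it from the DBMS catalogue:

```python
def breaking_changes(old, new):
    """Compare two {procedure_name: [parameter, ...]} snapshots and
    report changes that would break existing client stubs."""
    problems = []
    for proc, params in old.items():
        if proc not in new:
            problems.append(f"removed: {proc}")
        elif new[proc] != params:
            problems.append(f"signature changed: {proc} {params} -> {new[proc]}")
    return problems  # newly added procedures are non-breaking, so not reported

v1 = {"add_customer": ["cid", "name"], "delete_customer": ["cid"]}
v2 = {"add_customer": ["cid", "name", "email"]}
print(breaking_changes(v1, v2))
# flags the changed add_customer signature and the removed delete_customer
```

Generated client-side stubs would carry the snapshot they were built from, so the client can run exactly this check against the live catalogue at connect time.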

I had a student build a prototype of this for MySQL, supporting C# on the client-side. It showed considerable promise, and certainly saved time and effort over doing the same manually.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on July 6, 2019, 9:34 am

[This deserves its own thread...]


A typical "modern" database-driven application architecture uses a set of stored procedures in the database to encapsulate database updates and shared business logic.

Whenever I hear "procedures ... business logic", my knee jerks. Why are you burying business logic inside a procedure? That's just inviting some other means of updating the database that evades the procedure and thereby evades the business logic.

The information represented in the business's database is an asset. It needs curating. If there are rules/logic associated, they must apply for every attempted update, whatever procedure or back door access method. Quite true that in SQL that's impossible: you are more or less forced to build business logic into stored procedures, triggers, etc. But TTM is aiming higher than that. All business logic should be expressible as constraints. (If not expressible, there's something wrong with the schema: it's breaking the Information Rule.) All updates must observe those constraints whether they come from stored procedures, some CRUD-type utility, or somebody at a CLI direct-keying assignment statements.
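The point can be sketched concretely: if constraints are attached to the relvar and checked on every assignment, then no update path -- stored procedure, CRUD utility, or someone at a CLI -- can evade them. A toy model, not TTM-conformant machinery:

```python
class Relvar:
    def __init__(self, constraints):
        self._body = []
        self._constraints = constraints  # each: proposed body -> bool

    def assign(self, new_body):
        """Every update, from any caller, funnels through here."""
        for c in self._constraints:
            if not c(new_body):
                raise ValueError("constraint violated; assignment rejected")
        self._body = new_body

    def insert(self, row):
        self.assign(self._body + [row])

# Business rule expressed as a constraint: no credit limit may be negative.
customers = Relvar([lambda body: all(r["credit"] >= 0 for r in body)])
customers.insert({"cid": 1, "credit": 100})      # accepted
try:
    customers.insert({"cid": 2, "credit": -5})   # rejected, whoever calls
except ValueError as e:
    print(e)                                     # prints the rejection message
```

The rule lives in one place, with the data, rather than being duplicated (and drifting) across every procedure that happens to touch the relvar.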

Those stored procedures are exposed, typically via a RESTful interface or some other remote procedure call (RPC) mechanism. Application programs update the database only by calling the stored procedures, though they may also issue read-only queries (via constructed strings, etc.) for reporting, UI displays, etc. For these, LINQ for SQL and other facilities for constructing database language queries on the client-side are fine; often better than constructing database language queries as client-side strings, though I wouldn't want to give up that facility.

This approach generally avoids query language injection and provides some degree of logical data independence.

I've no objection to building an easy-to-use front end for typical database transactions. It would typically marshal data to and fro, with commitment control around a 'logical unit'. It should do no more than marshalling, and perhaps some lookups for pre-validating against Foreign Key constraints. I do object if it duplicates business logic expressed in database constraints (because of the risk of getting out of kilter). I strongly object if it applies business logic not expressed in database constraints.

What is helpful is to have the RESTful interface or other RPC mechanism -- and associated client-side method/procedure stubs -- generated automatically by the DBMS itself or associated tools, with changes tracked by the client-side so that breaking schema and/or stored procedure changes can be easily identified.

Business logic changes are easily detected as changes to constraints -- i.e. as a change log to the catalogue relvar(s) that hold constraints. Codd Rule 4 "The data base description is represented at the logical level in the same way as ordinary data, so that authorized users can apply the same relational language to its interrogation as they apply to the regular data."/RM Pre 25. Don't need no "DBMS itself or associated tools .. changes tracked"; just need ordinary data auditing tools as for any relvar. Why do durned programmers/implementers always want 'procedural' when declarative is so much cleaner? Rant, rant ...

I had a student build a prototype of this for MySQL, supporting C# on the client-side. It showed considerable promise, and certainly saved time and effort over doing the same manually.

I'd expect this to be dead easy: changelog the catalogue; write a script to analyse the log.
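A minimal sketch of "changelog the catalogue; write a script to analyse the log" -- diff two snapshots of the constraint catalogue, treated as ordinary data per Codd Rule 4 / RM Pre 25. The snapshot shape is hypothetical:

```python
def catalogue_diff(before, after):
    """Diff two {constraint_name: expression} snapshots of the catalogue
    relvar that holds constraints."""
    added = {n: e for n, e in after.items() if n not in before}
    dropped = {n: e for n, e in before.items() if n not in after}
    changed = {n: (before[n], after[n])
               for n in before.keys() & after.keys() if before[n] != after[n]}
    return added, dropped, changed

before = {"C1": "credit >= 0", "C2": "cid is key"}
after = {"C1": "credit >= 100", "C2": "cid is key", "C3": "name <> ''"}
added, dropped, changed = catalogue_diff(before, after)
print(sorted(added), sorted(dropped), sorted(changed))  # -> ['C3'] [] ['C1']
```

Since the catalogue is just another relvar, the same ordinary auditing machinery works on business rules as on any other data.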

Quote from AntC on July 6, 2019, 10:26 am
Quote from Dave Voorhis on July 6, 2019, 9:34 am


A typical "modern" database-driven application architecture uses a set of stored procedures in the database to encapsulate database updates and shared business logic.

Whenever I hear "procedures ... business logic", my knee jerks. Why are you burying business logic inside a procedure? That's just inviting some other means of updating the database that evades the procedure and thereby evades the business logic.

This thread should probably come with a glossary, because we're in a territory known for overloaded terminology. "Business rules" that can be represented by constraints should, of course, be represented by constraints. I was thinking of procedures like "generate paycheque" or "generate timetable" etc., which involve algorithmic processes that cannot be represented solely by constraints.
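A toy version of such an algorithmic operator -- summarise a timesheet relation into paycheque tuples. The process itself cannot be expressed as a constraint, though constraints can still police the result (names and shapes here are invented for illustration):

```python
def generate_paycheques(timesheet, rate_per_hour):
    """Algorithmic 'stored operator': aggregate hours per employee,
    then emit one paycheque tuple each."""
    by_emp = {}
    for row in timesheet:
        by_emp[row["emp"]] = by_emp.get(row["emp"], 0) + row["hours"]
    cheques = [{"emp": e, "amount": h * rate_per_hour}
               for e, h in sorted(by_emp.items())]
    # A constraint can still check the result, e.g. no negative amounts:
    assert all(c["amount"] >= 0 for c in cheques)
    return cheques

timesheet = [{"emp": "ann", "hours": 8}, {"emp": "bob", "hours": 6},
             {"emp": "ann", "hours": 4}]
print(generate_paycheques(timesheet, 10))
# -> [{'emp': 'ann', 'amount': 120}, {'emp': 'bob', 'amount': 60}]
```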

 

Quote from AntC on July 6, 2019, 10:26 am
Quote from Dave Voorhis on July 6, 2019, 9:34 am


Business logic changes are easily detected as changes to constraints -- i.e. as a change log to the catalogue relvar(s) that hold constraints. Codd Rule 4 "The data base description is represented at the logical level in the same way as ordinary data, so that authorized users can apply the same relational language to its interrogation as they apply to the regular data."/RM Pre 25. Don't need no "DBMS itself or associated tools .. changes tracked"; just need ordinary data auditing tools as for any relvar. Why do durned programmers/implementers always want 'procedural' when declarative is so much cleaner? Rant, rant ...

Imagine you've created a UI form in HTML5/JavaScript, or Java, or C# -- i.e., typical client-side application development languages -- titled "Add New Customer" with a Submit button at the bottom. What should happen, code-wise, when that button gets pushed to insert the information in the form fields into a table/relvar in a SQL/D database?

That's what this is about. Not about procedural vs declarative in general, but about the practical realities of interfacing typical real-world front-end environments with middleware or Web services and/or back-end DBMSs.
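As a concrete sketch of the Submit-button scenario: the button handler calls a client stub (which in this architecture would be generated from the catalogue), and the stub invokes a stored procedure over RPC. The transport here is faked in-process and every name is hypothetical:

```python
import json

def server_dispatch(request):
    """Stands in for the RESTful endpoint in front of the DBMS."""
    procs = {"add_new_customer": lambda p: {"ok": True, "cid": 42}}
    req = json.loads(request)
    return json.dumps(procs[req["proc"]](req["params"]))

class CustomerStub:
    """Client-side stub; in practice, generated from the procedure catalogue."""
    def add_new_customer(self, name, email):
        request = json.dumps({"proc": "add_new_customer",
                              "params": {"name": name, "email": email}})
        return json.loads(server_dispatch(request))

def on_submit_clicked(form):
    """The Submit button handler: no SQL, no strings, just a procedure call."""
    stub = CustomerStub()
    return stub.add_new_customer(form["name"], form["email"])

print(on_submit_clicked({"name": "Alice", "email": "a@example.org"}))
# -> {'ok': True, 'cid': 42}
```

The application never sees the schema, only the procedure signature -- which is exactly the surface the breaking-change tooling discussed earlier would track.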

Quote from Dave Voorhis on July 6, 2019, 10:35 am

This thread should probably come with a glossary, because we're in a territory known for overloaded terminology. "Business rules" that can be represented by constraints should, of course, be represented by constraints. I was thinking of procedures like "generate paycheque" or "generate timetable" etc., which involve algorithmic processes that cannot be represented solely by constraints.

For a simpler example of what people who categorize business rules call "derivations", consider this: given the size of a screen and its aspect ratio, find the smallest possible shipping box that can hold it.  This is not exclusively mathematical, as shipping boxes come in fixed sizes, which need to be kept in a relation.  But unless exponentiation is provided as part of the language (it is not in Tutorial D), this will require the use of an algorithm, as there is no closed-form solution based solely on + - * /.
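A worked version of that derivation, assuming a hypothetical relation of fixed box sizes: the screen's width and height follow from the diagonal and aspect ratio via a square root (the exponentiation the text mentions), and the smallest adequate box is then selected from the relation:

```python
import math

def screen_dims(diagonal, aspect_w, aspect_h):
    # diagonal^2 = (k*aspect_w)^2 + (k*aspect_h)^2, solve for the scale k
    k = diagonal / math.sqrt(aspect_w ** 2 + aspect_h ** 2)
    return k * aspect_w, k * aspect_h

# The "relation" of available box sizes: (width, height), same units.
BOXES = [(20, 15), (40, 25), (60, 40), (80, 50)]

def smallest_box(diagonal, aspect_w, aspect_h):
    w, h = screen_dims(diagonal, aspect_w, aspect_h)
    fits = [b for b in BOXES if b[0] >= w and b[1] >= h]
    return min(fits, key=lambda b: b[0] * b[1]) if fits else None

print(smallest_box(50, 16, 9))  # 50" 16:9 screen is about 43.6 x 24.5 -> (60, 40)
```

The rule mixes a relation lookup with an algorithmic step (the square root), which is exactly why it cannot be stated as a closed form over + - * / alone.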

Now of course any constraint can trivially have a closed form by expanding what counts as an elementary function.  Indeed, exponentiation, logarithms, and the trig functions (on real numbers, at least) are usually taken as elementary in both mathematician-speak and programming, and they have well-known approximations.  But even the more common special functions like gamma, erf, and Bessel functions of both kinds (and the close variants of these) are usually excluded from the title of "elementary", though they too can be approximated.

The other two kinds of business rules in the standard categorization are definitions (typically non-contextual) and facts (which of course are what's in the database in the first place, unless indeed they are definitions by another name, like "a customer can buy a product", effectively a definition of "customer").