The problem of company cars

#11 · March 2, 2020, 6:56 am

Quote from AntC on March 2, 2020, 6:56 am

Quote from dandl on March 2, 2020, 4:29 am

Quote from johnwcowan on March 2, 2020, 2:41 am

Quote from AntC on March 2, 2020, 12:42 am

A relvar name (or attribute name) in an attribute quacks like a pointer to me. It is not business-meaningful, although you might try to name your relvars/attributes using words in a natural language. Good luck with that over thousands of relvars and tens of thousands of attribute names. The relvar name points to a db-internal structure just as much as does an ObjectId or a memory address.

This is precisely the argument people working and thinking in first-order languages like Java and C# make against first-class procedures. Eventually they discover that the first-order viewpoint is too restrictive and engage in one or more of various kludges to work around the lack of them.

Yes, I'm with you on this one. I remember people praising Fortran, and Cobol, and Algol similar terms. None of them provide indirection, so why need it?

How did you two jump from 'first-class' to 'first-order'? Why are you not getting that I'm talking about database content/structure, not "procedures" or behaviour or functions.

A database schema is not a programming language. Furthermore the predicate logic to underpin the RM must be first-order (TCLOSE possibly excepted).

Again: I am not excluding variables in the D programming language from holding references to relvars or attributes. I'm not excluding holding references to procedures/functions/lambdas/whatever. Just not in the database.

And still you're not coming up with any real-life cases. This is David's usual high-flown hot air.

The solution I've outlined needs one extra form of constraint wot SQL does not 'ave. No code. But while we're at it, let's support constraints being arbitrary Boolean conditions using relational operators.

Well, now that I know what exclusion constraints are, I will investigate them. But in general I favor the strongest possible meta-constraints on the form of constraints.

Not sure what you mean by "meta-constraints". Constraints under TTM are (first-order) Boolean expressions (if you want to call that a 'constraint' on constraints). Constraints under SQL (excluding triggers) are Uniqueness (keys) and Foreign Key constraints; that's so limited as to be unable to express realistic business requirements.

A decent strong(?) meta-constraint is: don't put pointers in the database [OO Pre 2.]; don't put references to relvars or to attributes, because they're tantamount to pointers.

Quote from dandl on March 2, 2020, 4:29 am

Quote from johnwcowan on March 2, 2020, 2:41 am

Quote from AntC on March 2, 2020, 12:42 am

A relvar name (or attribute name) in an attribute quacks like a pointer to me. It is not business-meaningful, although you might try to name your relvars/attributes using words in a natural language. Good luck with that over thousands of relvars and tens of thousands of attribute names. The relvar name points to a db-internal structure just as much as does an ObjectId or a memory address.

This is precisely the argument people working and thinking in first-order languages like Java and C# make against first-class procedures. Eventually they discover that the first-order viewpoint is too restrictive and engage in one or more of various kludges to work around the lack of them.

Yes, I'm with you on this one. I remember people praising Fortran, and Cobol, and Algol similar terms. None of them provide indirection, so why need it?

How did you two jump from 'first-class' to 'first-order'? Why are you not getting that I'm talking about database content/structure, not "procedures" or behaviour or functions.

A database schema is not a programming language. Furthermore the predicate logic to underpin the RM must be first-order (TCLOSE possibly excepted).

Again: I am not excluding variables in the D programming language from holding references to relvars or attributes. I'm not excluding holding references to procedures/functions/lambdas/whatever. Just not in the database.

And still you're not coming up with any real-life cases. This is David's usual high-flown hot air.

The solution I've outlined needs one extra form of constraint wot SQL does not 'ave. No code. But while we're at it, let's support constraints being arbitrary Boolean conditions using relational operators.

Well, now that I know what exclusion constraints are, I will investigate them. But in general I favor the strongest possible meta-constraints on the form of constraints.

Not sure what you mean by "meta-constraints". Constraints under TTM are (first-order) Boolean expressions (if you want to call that a 'constraint' on constraints). Constraints under SQL (excluding triggers) are Uniqueness (keys) and Foreign Key constraints; that's so limited as to be unable to express realistic business requirements.

A decent strong(?) meta-constraint is: don't put pointers in the database [OO Pre 2.]; don't put references to relvars or to attributes, because they're tantamount to pointers.

#12 · March 2, 2020, 10:57 am

Quote from Erwin on March 2, 2020, 10:57 am

Quote from johnwcowan on March 1, 2020, 5:42 am

This is a followup to the "Programmer as Navigator" thread, talking about a problem that is hard to solve in a model like TTM where named entities are second-class.

A company's database has the usual sort of employee and department relvars, the details being unimportant. There is also a relvar representing the cars owned by the company. At any time, a car may be assigned to an employee for their use, a department for use by any of its members, or not assigned. How is this to be best represented?

If there were only employee cars, it would suffice to have a relation E with attributes EMP# and CAR# with a uniqueness constraint on the CAR#, so that no car is linked to more than one employee. Ditto for department cars only, using an anologous relation D. But since both cases are needed, if we create both E and D, then there is the risk that a car might wind up assigned to both a unique employee and a unique department. This does not meet the conditions of the problem.

Alternatively, one can unify E and D into a single relation A with an EMP# and an ASSIGNEE#. The linkage from an ASSIGNEE# to an EMP# would be relation E', and to a DEPT# would be relation D'. But then we have re-created the problem: no guarantee that the same ASSIGNEE# does not map to both an EMP# and a DEPT#! A practical but not theoretically sound resolution is to see to it that EMP# and DEPT# have disjoint values, and then the two secondary linkage relations would be trivial and need not be expressed as real relations.

Finally, this problem, like all problems, can be solved by a general database constraint.

What on earth makes you say "hard to solve in a model like TTM where [...]." ? A "model like TTM" has general database constraints. So use them ...

Just design relvars CAR_ASSGN_EMP {CAR# CAR#, EMP# EMP#} and CAR_ASSGN_DEPT {CAR# CAR#, DEPT# DEPT#} with the usual foreign keys to relvars CARS, EMP and DEPT, and then declare the additional db constraint

IS_EMPTY (CAR_ASSGN_EMP JOIN CAR_ASSGN_DEPT).

What is "hard" about that ? In a TTM system that's so trivial/mundane/... as to not even being worth a post here ...

All the traditional SQL tricks to deal with this kind of scenarion are doable to, but are typically more laborious as well as more obfuscating of the constraint itself.

Quote from johnwcowan on March 1, 2020, 5:42 am

This is a followup to the "Programmer as Navigator" thread, talking about a problem that is hard to solve in a model like TTM where named entities are second-class.

A company's database has the usual sort of employee and department relvars, the details being unimportant. There is also a relvar representing the cars owned by the company. At any time, a car may be assigned to an employee for their use, a department for use by any of its members, or not assigned. How is this to be best represented?

If there were only employee cars, it would suffice to have a relation E with attributes EMP# and CAR# with a uniqueness constraint on the CAR#, so that no car is linked to more than one employee. Ditto for department cars only, using an anologous relation D. But since both cases are needed, if we create both E and D, then there is the risk that a car might wind up assigned to both a unique employee and a unique department. This does not meet the conditions of the problem.

Alternatively, one can unify E and D into a single relation A with an EMP# and an ASSIGNEE#. The linkage from an ASSIGNEE# to an EMP# would be relation E', and to a DEPT# would be relation D'. But then we have re-created the problem: no guarantee that the same ASSIGNEE# does not map to both an EMP# and a DEPT#! A practical but not theoretically sound resolution is to see to it that EMP# and DEPT# have disjoint values, and then the two secondary linkage relations would be trivial and need not be expressed as real relations.

Finally, this problem, like all problems, can be solved by a general database constraint.

What on earth makes you say "hard to solve in a model like TTM where [...]." ? A "model like TTM" has general database constraints. So use them ...

Just design relvars CAR_ASSGN_EMP {CAR# CAR#, EMP# EMP#} and CAR_ASSGN_DEPT {CAR# CAR#, DEPT# DEPT#} with the usual foreign keys to relvars CARS, EMP and DEPT, and then declare the additional db constraint

IS_EMPTY (CAR_ASSGN_EMP JOIN CAR_ASSGN_DEPT).

What is "hard" about that ? In a TTM system that's so trivial/mundane/... as to not even being worth a post here ...

All the traditional SQL tricks to deal with this kind of scenarion are doable to, but are typically more laborious as well as more obfuscating of the constraint itself.

Author of SIRA_PRISE

#13 · March 2, 2020, 12:30 pm

Quote from dandl on March 2, 2020, 4:29 am

Quote from johnwcowan on March 2, 2020, 2:41 am

Quote from AntC on March 2, 2020, 12:42 am

A relvar name (or attribute name) in an attribute quacks like a pointer to me. It is not business-meaningful, although you might try to name your relvars/attributes using words in a natural language. Good luck with that over thousands of relvars and tens of thousands of attribute names. The relvar name points to a db-internal structure just as much as does an ObjectId or a memory address.

This is precisely the argument people working and thinking in first-order languages like Java and C# make against first-class procedures. Eventually they discover that the first-order viewpoint is too restrictive and engage in one or more of various kludges to work around the lack of them.

Yes, I'm with you on this one. I remember people praising Fortran, and Cobol, and Algol similar terms. None of them provide indirection, so why need it?

But be careful lumping C# in with Java. C# supports first-class procedures in most scenarios, and you can usually write elegant FP-like code and avoid destructive assignments. You can pass the function to the data just as well as the other way around. It's not all roses, but it's getting quite a nice smell to it.

Java (at least as at v8) has streams and lambdas, but they don't play nicely with the rest of the language, and you get to write a lot of stuff the compiler could work out for itself.

Nonsense.

There are occasions where you may have to explicitly specify a type, but in the thousands of lines of working Java Streams code I've written, I can count the number of times I have had to do that on one hand. And it was almost invariably because I'd written bad code to begin with.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#14 · March 2, 2020, 6:30 pm

The disjoint relations can be expressed in terms of projections of a join.Although this would be unfamiliar to many users it can also be expressed in terms of a join update.

Assume C represents cars, D represents departments and E represents employees.

A fairy easy characteristic equivalence is:

(-(C& D & E ) = -( C &E ) & (C & D)) = (C & D).

Because it is logically valid any of the three terms can be deduced from the others.

The deletion of the of the join C D E ( C join D join E) can be effected in multiple equivalent ways, for example

CDE minus CDE{C,E}} joim CDE{C,D} minus CDE.

A dbms that can recognize suitable declarations for the two projections cann deduce their values from the overall equivalence, likewise can give their visual representation in terms of fragments of a join.

#15 · March 2, 2020, 8:06 pm

Quote from p c on March 2, 2020, 6:30 pm

The disjoint relations can be expressed in terms of projections of a join.Although this would be unfamiliar to many users it can also be expressed in terms of a join update.

Assume C represents cars, D represents departments and E represents employees.

A fairy easy characteristic equivalence is:

(-(C& D & E ) = -( C &E ) & (C & D)) = (C & D).

Because it is logically valid any of the three terms can be deduced from the others.

The deletion of the of the join C D E ( C join D join E) can be effected in multiple equivalent ways, for example

CDE minus CDE{C,E}} joim CDE{C,D} minus CDE.

A dbms that can recognize suitable declarations for the two projections cann deduce their values from the overall equivalence, likewise can give their visual representation in terms of subset fragments of a join.

#16 · March 2, 2020, 11:18 pm

Quote from AntC on March 2, 2020, 6:56 am

Not sure what you mean by "meta-constraints".

I mean constraints on the form of constraints. Obviously all constraints can be converted to Boolean expressions which constrain a database as a whole, and these in turn can be converted to a single constraint per database. Nothing else is logically necessary.

However, even Tutorial D also provides domain constraints (in the form of types attached to attributes) and key constraints through explicit syntax. I am interested in the questions of what other forms of constraints are sufficiently useful to justify their own syntax, and whether some basis set of such forms eliminates the practical (not theoretical) need for general constraints. I don't claim to know the answers.

Quote from Erwin on March 2, 2020, 10:57 am

What on earth makes you say "hard to solve in a model like TTM where [...]." ? A "model like TTM" has general database constraints.

By "a model like TTM" I mean "a model similar to TTM but distinct from it". This construction is inherently ambiguous in English.

#17 · March 2, 2020, 11:20 pm

I note that PostgreSQL has something called exclusion constraints which I do not yet understand.

#18 · March 3, 2020, 12:11 am

Quote from dandl on March 3, 2020, 12:11 am

Quote from AntC on March 2, 2020, 6:56 am

Quote from dandl on March 2, 2020, 4:29 am

Quote from johnwcowan on March 2, 2020, 2:41 am

Quote from AntC on March 2, 2020, 12:42 am

A relvar name (or attribute name) in an attribute quacks like a pointer to me. It is not business-meaningful, although you might try to name your relvars/attributes using words in a natural language. Good luck with that over thousands of relvars and tens of thousands of attribute names. The relvar name points to a db-internal structure just as much as does an ObjectId or a memory address.

This is precisely the argument people working and thinking in first-order languages like Java and C# make against first-class procedures. Eventually they discover that the first-order viewpoint is too restrictive and engage in one or more of various kludges to work around the lack of them.

Yes, I'm with you on this one. I remember people praising Fortran, and Cobol, and Algol similar terms. None of them provide indirection, so why need it?

How did you two jump from 'first-class' to 'first-order'? Why are you not getting that I'm talking about database content/structure, not "procedures" or behaviour or functions.

A database schema is not a programming language. Furthermore the predicate logic to underpin the RM must be first-order (TCLOSE possibly excepted).

A TTM database makes no such distinction. The relvar attributes are typed values, and those types are defined in a D, using arbitrarily complex programming logic. Ditto for constraints, which are arbitrarily complex boolean expressions (or equivalent). You can write your predicates in English ("S: Supplier [S#] is named [SNAME] and has a status of [STATUS] and is located in city [CITY]") but TTM places no such limits on the operating database.

Again: I am not excluding variables in the D programming language from holding references to relvars or attributes. I'm not excluding holding references to procedures/functions/lambdas/whatever. Just not in the database.

If those things are in the D language then they can be used to define persistent types, operators on those types and constraints, all of which form part of the database schema.

And still you're not coming up with any real-life cases. This is David's usual high-flown hot air.

You have a real life case that's good enough for this purpose, you just don't see the value inherent in indirection. Exemplars serve to focus attention and explain ideas, but it's a mistake to argue for or against a principle based on the quality of the examples produced. You need to produce a cogent argument why indirection would be a bad thing.

A decent strong(?) meta-constraint is: don't put pointers in the database [OO Pre 2.]; don't put references to relvars or to attributes, because they're tantamount to pointers.

No, they really aren't. The arguments against pointers in DTATRM are all about pointers to tuples, or the equivalent object references. The intention is to encourage joins rather than following pointers from one tuple to another. It's an argument against network and object databases.

But even DTATRM makes a clear distinction between pointers and names. A pointer is not a name. And DTATRM p224 explicitly mentions (and permits) the idea:

As far as data in the database is concerned, therefore, the only kind of dereferencing D
might possibly need is an operation that would allow a database relvar to be specified, not
directly by means of its name, but indirectly by means of a variable whose value is that
name.

My proposal for indirection is to allow this kind of indirection by name for relvars, but also for attributes, constraints and operators. Not a pointer in sight. Just names.

Quote from AntC on March 2, 2020, 6:56 am

Quote from dandl on March 2, 2020, 4:29 am

Quote from johnwcowan on March 2, 2020, 2:41 am

Quote from AntC on March 2, 2020, 12:42 am

A relvar name (or attribute name) in an attribute quacks like a pointer to me. It is not business-meaningful, although you might try to name your relvars/attributes using words in a natural language. Good luck with that over thousands of relvars and tens of thousands of attribute names. The relvar name points to a db-internal structure just as much as does an ObjectId or a memory address.

This is precisely the argument people working and thinking in first-order languages like Java and C# make against first-class procedures. Eventually they discover that the first-order viewpoint is too restrictive and engage in one or more of various kludges to work around the lack of them.

Yes, I'm with you on this one. I remember people praising Fortran, and Cobol, and Algol similar terms. None of them provide indirection, so why need it?

How did you two jump from 'first-class' to 'first-order'? Why are you not getting that I'm talking about database content/structure, not "procedures" or behaviour or functions.

A database schema is not a programming language. Furthermore the predicate logic to underpin the RM must be first-order (TCLOSE possibly excepted).

A TTM database makes no such distinction. The relvar attributes are typed values, and those types are defined in a D, using arbitrarily complex programming logic. Ditto for constraints, which are arbitrarily complex boolean expressions (or equivalent). You can write your predicates in English ("S: Supplier [S#] is named [SNAME] and has a status of [STATUS] and is located in city [CITY]") but TTM places no such limits on the operating database.

Again: I am not excluding variables in the D programming language from holding references to relvars or attributes. I'm not excluding holding references to procedures/functions/lambdas/whatever. Just not in the database.

If those things are in the D language then they can be used to define persistent types, operators on those types and constraints, all of which form part of the database schema.

And still you're not coming up with any real-life cases. This is David's usual high-flown hot air.

You have a real life case that's good enough for this purpose, you just don't see the value inherent in indirection. Exemplars serve to focus attention and explain ideas, but it's a mistake to argue for or against a principle based on the quality of the examples produced. You need to produce a cogent argument why indirection would be a bad thing.

A decent strong(?) meta-constraint is: don't put pointers in the database [OO Pre 2.]; don't put references to relvars or to attributes, because they're tantamount to pointers.

No, they really aren't. The arguments against pointers in DTATRM are all about pointers to tuples, or the equivalent object references. The intention is to encourage joins rather than following pointers from one tuple to another. It's an argument against network and object databases.

But even DTATRM makes a clear distinction between pointers and names. A pointer is not a name. And DTATRM p224 explicitly mentions (and permits) the idea:

As far as data in the database is concerned, therefore, the only kind of dereferencing D
might possibly need is an operation that would allow a database relvar to be specified, not
directly by means of its name, but indirectly by means of a variable whose value is that
name.

My proposal for indirection is to allow this kind of indirection by name for relvars, but also for attributes, constraints and operators. Not a pointer in sight. Just names.

Andl - A New Database Language - andl.org

#19 · March 3, 2020, 12:23 am

Quote from AntC on March 3, 2020, 12:23 am

Quote from johnwcowan on March 2, 2020, 11:18 pm

Quote from AntC on March 2, 2020, 6:56 am

Not sure what you mean by "meta-constraints".

I mean constraints on the form of constraints. Obviously all constraints can be converted to Boolean expressions ...

Constraints/Dependencies are predicate-based restrictions (searching for other words). I really really can't see what's wrong with Boolean expressions. It's not as if the average DB programmer won't understand them: there's AND, OR, NOT, comparisons in SQL already. By "the form of constraints" you can only mean the SQL form. If you go to 'Theory of Constraints' or 'Constraint-based programming/solving' they use Boolean expressions, because it's blimmin' obvious those are both succinct and expressively complete.

You mean some sort of shorthand for a Boolean expression? I don't see what could be more succinct than the expression Erwin and I gave. You mean something attached to the declaration for one relvar, although it refers to others? There's in DBE Ch 13 EXAMPLE 9 that I already pointed to several possible forms.

I agree your company cars example is quite common: a foreign key constraint 'distributed' amongst at most one of several options. So in the style of DBE CH 13 AT MOST ONE OF? Except I think EXAMPLE 9 has got it backwards: the foreign key is from Employee link or Department link REFERENCES Car, so you can't declare it against Car. Perhaps Car ... FOREIGN KEY {CAR#} REFERENCED BY AT MOST ONE OF {Emp_CarLink CAR#, Dep_CarLink CAR#}? That's your shorthand? That seems counter-intuitive when foreign key constraints are usually declared against the referring-from relvar. (For the very good reason that when you CREATE TABLE Car those other relvars don't exist.) REFERENCED BY puts me in mind of INTERCAL's COME FROM statement.

which constrain a database as a whole, and these in turn can be converted to a single constraint per database. Nothing else is logically necessary.

However, even Tutorial D also provides domain constraints (in the form of types attached to attributes) and key constraints through explicit syntax. I am interested in the questions of what other forms of constraints are sufficiently useful to justify their own syntax, and whether some basis set of such forms eliminates the practical (not theoretical) need for general constraints. I don't claim to know the answers.

Please explain why you want to eliminate 'general constraints'. And why you think that's a "practical need". FFS there's already two implementations that support constraints as free-standing Boolean expressions. It ain't hard.

I note that PostgreSQL has something called exclusion constraints which I do not yet understand.

I can't (on a quick search) find Exclusion Dependency defined in D&D works. I'm sure it's in one of them. Try this or this (from Entity-Relationship modelling; I'll go wash out my mouth with soap).

Quote from johnwcowan on March 2, 2020, 11:18 pm

Quote from AntC on March 2, 2020, 6:56 am

Not sure what you mean by "meta-constraints".

I mean constraints on the form of constraints. Obviously all constraints can be converted to Boolean expressions ...

Constraints/Dependencies are predicate-based restrictions (searching for other words). I really really can't see what's wrong with Boolean expressions. It's not as if the average DB programmer won't understand them: there's AND, OR, NOT, comparisons in SQL already. By "the form of constraints" you can only mean the SQL form. If you go to 'Theory of Constraints' or 'Constraint-based programming/solving' they use Boolean expressions, because it's blimmin' obvious those are both succinct and expressively complete.

You mean some sort of shorthand for a Boolean expression? I don't see what could be more succinct than the expression Erwin and I gave. You mean something attached to the declaration for one relvar, although it refers to others? There's in DBE Ch 13 EXAMPLE 9 that I already pointed to several possible forms.

I agree your company cars example is quite common: a foreign key constraint 'distributed' amongst at most one of several options. So in the style of DBE CH 13 AT MOST ONE OF? Except I think EXAMPLE 9 has got it backwards: the foreign key is from Employee link or Department link REFERENCES Car, so you can't declare it against Car. Perhaps Car ... FOREIGN KEY {CAR#} REFERENCED BY AT MOST ONE OF {Emp_CarLink CAR#, Dep_CarLink CAR#}? That's your shorthand? That seems counter-intuitive when foreign key constraints are usually declared against the referring-from relvar. (For the very good reason that when you CREATE TABLE Car those other relvars don't exist.) REFERENCED BY puts me in mind of INTERCAL's COME FROM statement.

which constrain a database as a whole, and these in turn can be converted to a single constraint per database. Nothing else is logically necessary.

However, even Tutorial D also provides domain constraints (in the form of types attached to attributes) and key constraints through explicit syntax. I am interested in the questions of what other forms of constraints are sufficiently useful to justify their own syntax, and whether some basis set of such forms eliminates the practical (not theoretical) need for general constraints. I don't claim to know the answers.

Please explain why you want to eliminate 'general constraints'. And why you think that's a "practical need". FFS there's already two implementations that support constraints as free-standing Boolean expressions. It ain't hard.

I note that PostgreSQL has something called exclusion constraints which I do not yet understand.

I can't (on a quick search) find Exclusion Dependency defined in D&D works. I'm sure it's in one of them. Try this or this (from Entity-Relationship modelling; I'll go wash out my mouth with soap).

#20 · March 3, 2020, 12:27 am

IS_EMPTY (CAR_ASSGN_EMP JOIN CAR_ASSGN_DEPT).

I agree, this is thematic for TTM. But please note, this is an expression written in a programming language, a D, and it's part of the database schema. See RM Pre 23.

Andl - A New Database Language - andl.org

The Forum for Discussion about The Third Manifesto and Related Matters