ANNOUNCE: a relational algebra with only one primitive operator (tentative) (please check/critique)
Quote from AntC on May 14, 2021, 10:50 pmQuote from Vadim on May 14, 2021, 4:49 pmQuote from AntC on May 14, 2021, 12:36 amQuote from Vadim on May 13, 2021, 4:27 pmThe other [rather artificial] example is the database with all the empty relations.
No that's not a valid lattice structure to model relations: we must always have
DEE
, no matter how many empty relations there are. We can always say the sub-lattice fromDUM
down to lattice bottom must be a Boolean algebra (as the powerset of all possible attribute-type pairs, formed into headings). That is, if we can defineDUM
.In the database of all the empty relations DUM=DEE, otherwise you'll break some RL axioms.
Yes, it is counterintuitive, because the TTM stipulates DEE containing one tuple, and there is no way to satify this condition with empty relations. (I vaguely remember correspondence with Ervin who pointed out holes in this set based definition.)
In general, a reduction of the operation set in Universal Algebra to some basic set is understood as representing each derived operation via composition of basic operatons.
Please give a citation for that claim. I think you just made it up. Appendix A (tries to) proceed in that way. But actually fails: for example it does not reduce
RENAME
to other basic operations -- the other operations still needRENAME
or some further mechanism (not specified), as Philip Kelly pointed out many moons ago -- he proposed a mechanism 'equi-relations'.Page 11 here: https://books.google.com/books?id=siNlAn8Bm30C&pg=PA1&source=gbs_toc_r&cad=3#v=onepage&q&f=false
That's a book on Category Theory. It appears not to be written in English. I'm not claiming to be producing a Category-theoretical approach.
wrt the point at hand, I'll cite your own definition for
NotMatching
(generalised difference, you used\
operator). That has two equations, neither of which have bare(x \ y)
on one side of the equation:
(x ^ y) v (x \ y) = x
.(x ^ y) ^ (x \ y) = (x ^ y) ^ DUM
.In other words, you are not allowed to use the First Order Logic sentenses and specify a constraint on the derived operation this way; you have to exhibit an equality with the derived operation alone on the left hand side.
Then, it is terminology confusion, and you need to clarify what do you mean by "primitive operator". Analogies will get you only so far.
I'm demonstrating what I'm doing as I go along. So far as I can tell, the only person actually looking at the equations is me. I take
NatJoin
as primitive in the sense I stipulate its properties (Commutative, Idempotent, Associative), but I don't specify anything further about its behaviour or result.
Quote from Vadim on May 14, 2021, 4:49 pmQuote from AntC on May 14, 2021, 12:36 amQuote from Vadim on May 13, 2021, 4:27 pmThe other [rather artificial] example is the database with all the empty relations.
No that's not a valid lattice structure to model relations: we must always have
DEE
, no matter how many empty relations there are. We can always say the sub-lattice fromDUM
down to lattice bottom must be a Boolean algebra (as the powerset of all possible attribute-type pairs, formed into headings). That is, if we can defineDUM
.In the database of all the empty relations DUM=DEE, otherwise you'll break some RL axioms.
Yes, it is counterintuitive, because the TTM stipulates DEE containing one tuple, and there is no way to satify this condition with empty relations. (I vaguely remember correspondence with Ervin who pointed out holes in this set based definition.)
In general, a reduction of the operation set in Universal Algebra to some basic set is understood as representing each derived operation via composition of basic operatons.
Please give a citation for that claim. I think you just made it up. Appendix A (tries to) proceed in that way. But actually fails: for example it does not reduce
RENAME
to other basic operations -- the other operations still needRENAME
or some further mechanism (not specified), as Philip Kelly pointed out many moons ago -- he proposed a mechanism 'equi-relations'.Page 11 here: https://books.google.com/books?id=siNlAn8Bm30C&pg=PA1&source=gbs_toc_r&cad=3#v=onepage&q&f=false
That's a book on Category Theory. It appears not to be written in English. I'm not claiming to be producing a Category-theoretical approach.
wrt the point at hand, I'll cite your own definition for NotMatching
(generalised difference, you used \
operator). That has two equations, neither of which have bare (x \ y)
on one side of the equation:
(x ^ y) v (x \ y) = x
.(x ^ y) ^ (x \ y) = (x ^ y) ^ DUM
.
In other words, you are not allowed to use the First Order Logic sentenses and specify a constraint on the derived operation this way; you have to exhibit an equality with the derived operation alone on the left hand side.
Then, it is terminology confusion, and you need to clarify what do you mean by "primitive operator". Analogies will get you only so far.
I'm demonstrating what I'm doing as I go along. So far as I can tell, the only person actually looking at the equations is me. I take NatJoin
as primitive in the sense I stipulate its properties (Commutative, Idempotent, Associative), but I don't specify anything further about its behaviour or result.
Quote from AntC on May 14, 2021, 11:20 pmQuote from Vadim on May 14, 2021, 6:38 pmQuote from Erwin on May 14, 2021, 5:36 pmDUM === {} , DEE === {{}}
DUM is the empty set of empty sets, DEE is the set that has the empty set [empty set of tuples, that is] for its member.
Requiring the empty set to be equal to a nonempty set is the most perfect example of mathematical Harakiri I can imagine.
A relation is a pair, not a set. Therefore, you must have meant
[latex] DUM \leftrightarrow \langle \{\}, \{\}\rangle [/latex]
[latex] DEE \leftrightarrow \langle \{\}, \{ \langle\rangle \}\rangle [/latex]
Then, even if you use Kuratovski pair for binary tuple, you still have to define what is a zilliary tuple. (BTW, Kuratovski pair has virtually no utility in mathematics). Also, in contemporary mathematics you never compare the two sets verbatim. Most often they are considered to be the "same" up to some equivalence. This idea is manifested in Anthony's presentation, where DEE and DUM are considered as couple of objects "fixed" inside an algebraic structure. Anthony and myself, however, disagree how successful those axiomatic definitions are in "fixing" the DUM, and even whether "fixing" an object is really a big deal.
It's a big deal because the algebra needs to know the result/manipulable properties of an expression mentioning
DUM
. ForDEE
, it's properties are sufficiently captured by saying it's identity forNaturalJoin
.We don't need to fix
DUM
if we have an adequately manipulable definition forNotMatching
(I don't). Or we could introduce a primitive operationemptify( )
; but again what are (all) the properties of expressions mentioning it? Things liker1 NatJoin emptify(r2) = emptify(r1 NatJoin r2); emptify(r1) InnerUnion emptify(r2) = emptify(r1 InnerUnion r2); r1 NotMatching Dee = emptify(r1); ...
.Regarding your critique of Anthony's motivation, you shall not dismiss curiosity factor. I'd suggest that the database theory is still far from being cast in stone. Witness novel query languages appearing:
My o.p. started with 'Motivation'. The A algebra is not adequate to reason about the equivalence of queries. If any compiler/database engine is using A, I guess they've added all sorts of rules of thumb on top of the FORMAL DEFINITIONS. How can an industrial user of a query language be sure their queries produce the right results; or indeed how can a vendor of a DBMS? (For SQL it's almost impossible: there's no formal definition for all the corners of the semantics of SQL, and all its baroque bits of syntax, and how they interact inside a complex query.)
Quote from Vadim on May 14, 2021, 6:38 pmQuote from Erwin on May 14, 2021, 5:36 pmDUM === {} , DEE === {{}}
DUM is the empty set of empty sets, DEE is the set that has the empty set [empty set of tuples, that is] for its member.
Requiring the empty set to be equal to a nonempty set is the most perfect example of mathematical Harakiri I can imagine.
A relation is a pair, not a set. Therefore, you must have meant
DUM \leftrightarrow \langle \{\}, \{\}\rangle
DEE \leftrightarrow \langle \{\}, \{ \langle\rangle \}\rangle
Then, even if you use Kuratovski pair for binary tuple, you still have to define what is a zilliary tuple. (BTW, Kuratovski pair has virtually no utility in mathematics). Also, in contemporary mathematics you never compare the two sets verbatim. Most often they are considered to be the "same" up to some equivalence. This idea is manifested in Anthony's presentation, where DEE and DUM are considered as couple of objects "fixed" inside an algebraic structure. Anthony and myself, however, disagree how successful those axiomatic definitions are in "fixing" the DUM, and even whether "fixing" an object is really a big deal.
It's a big deal because the algebra needs to know the result/manipulable properties of an expression mentioning DUM
. For DEE
, it's properties are sufficiently captured by saying it's identity for NaturalJoin
.
We don't need to fix DUM
if we have an adequately manipulable definition for NotMatching
(I don't). Or we could introduce a primitive operation emptify( )
; but again what are (all) the properties of expressions mentioning it? Things like r1 NatJoin emptify(r2) = emptify(r1 NatJoin r2); emptify(r1) InnerUnion emptify(r2) = emptify(r1 InnerUnion r2); r1 NotMatching Dee = emptify(r1); ...
.
Regarding your critique of Anthony's motivation, you shall not dismiss curiosity factor. I'd suggest that the database theory is still far from being cast in stone. Witness novel query languages appearing:
My o.p. started with 'Motivation'. The A algebra is not adequate to reason about the equivalence of queries. If any compiler/database engine is using A, I guess they've added all sorts of rules of thumb on top of the FORMAL DEFINITIONS. How can an industrial user of a query language be sure their queries produce the right results; or indeed how can a vendor of a DBMS? (For SQL it's almost impossible: there's no formal definition for all the corners of the semantics of SQL, and all its baroque bits of syntax, and how they interact inside a complex query.)
Quote from Vadim on May 14, 2021, 11:59 pmQuote from AntC on May 14, 2021, 10:50 pmwrt the point at hand, I'll cite your own definition for
NotMatching
(generalised difference, you used\
operator). That has two equations, neither of which have bare(x \ y)
on one side of the equation:
(x ^ y) v (x \ y) = x
.(x ^ y) ^ (x \ y) = (x ^ y) ^ DUM
.Yes, because the generalized difference is not a derived operation in the Relational Lattice with "standard" signature [latex]\{\wedge,\vee,R_{00}\}[/latex]. It is the derived operation in the algebra with signature [latex]\{\wedge,\vee,\lhd NOT\rhd\}[/latex]:
x <NotMatching> y = <NOT>(<NOT>(x) v (y ^ x)).(Tested in QBQL, not in Prover9/Mace4)
Quote from AntC on May 14, 2021, 10:50 pmwrt the point at hand, I'll cite your own definition for
NotMatching
(generalised difference, you used\
operator). That has two equations, neither of which have bare(x \ y)
on one side of the equation:
(x ^ y) v (x \ y) = x
.(x ^ y) ^ (x \ y) = (x ^ y) ^ DUM
.
Yes, because the generalized difference is not a derived operation in the Relational Lattice with "standard" signature \{\wedge,\vee,R_{00}\}. It is the derived operation in the algebra with signature \{\wedge,\vee,\lhd NOT\rhd\}:
x <NotMatching> y = <NOT>(<NOT>(x) v (y ^ x)).
(Tested in QBQL, not in Prover9/Mace4)
Quote from Vadim on May 15, 2021, 12:24 amQuote from AntC on May 14, 2021, 10:50 pmI'm demonstrating what I'm doing as I go along. So far as I can tell, the only person actually looking at the equations is me. I take
NatJoin
as primitive in the sense I stipulate its properties (Commutative, Idempotent, Associative), but I don't specify anything further about its behaviour or result.Well, it doesn't really matter which operations are considered primitive, and which aren't. Your entire thought process can be abstracted away, and eventually, you arrive to a system of algebraic identities. If you can succinctly list them, then I voluneer to check it. However, if you encountered some obstacle, then it is harder to help.
Quote from AntC on May 14, 2021, 10:50 pmI'm demonstrating what I'm doing as I go along. So far as I can tell, the only person actually looking at the equations is me. I take
NatJoin
as primitive in the sense I stipulate its properties (Commutative, Idempotent, Associative), but I don't specify anything further about its behaviour or result.
Well, it doesn't really matter which operations are considered primitive, and which aren't. Your entire thought process can be abstracted away, and eventually, you arrive to a system of algebraic identities. If you can succinctly list them, then I voluneer to check it. However, if you encountered some obstacle, then it is harder to help.
Quote from AntC on May 15, 2021, 2:49 amQuote from Erwin on May 14, 2021, 4:42 pmQuote from AntC on May 14, 2021, 3:42 amwrt
REMOVE
, Appendix A is hand-wavey about how to get from a projection in Tutorial D to an equivalentALL BUT
toREMOVE
. It is also hand-wavey about how to get fromCOMPOSE
to the neededREMOVE
: how does the algebra calculate the attributes in common between two operands beingCOMPOSE
d? (IOW what is a "macro" operator in this sense? That goes back to your extended discussion about macros on another thread, and at least needs some sort of program reflection to run after the compiler has inferred the heading of the expressions that are operands toCOMPOSE
.)The compiler knows about the headings of the operands, and perhaps holds that information in the form of a relation with heading {attributename:attributenamedomain, typename:typenamedomain} and applies, for example, semijoin or semiminus, on the two relation values at hand, to reliably and deterministically compute the results you say they are "hand-wavey" about.
(IOW a "macro" operator in this sense is some operator on some data types that happens to be used by the compiler for doing the job it is supposed to do. There is no implication that within the set of all operators used by either the compiler or any of the application programs, "being a macro operator" (thus giving rise to assessable predicates like "X is a 'macro' operator and Y is too") is a characteristic that creates equivalence classes within that set. Creates >1 distinct equivalence classes, that is.)
That all seems so blatantly obvious and trivial that I'm wondering what the hell the problem is.
As an example of what A doesn't do, consider the properties of
COMPOSE
:
(r1 NatJoin r2) COMPOSE (s1 NatJoin s2) ≡ (((r1 NatJoin r2) ^ (s1 NatJoin s2)) REMOVE ?)
- I can reshuffle the lhs of
REMOVE
(and in general the lhs expression might be bigger/hairier. Can I then push-REMOVE
-through stuff?COMPOSE
is claimed to be commutative (unlike the mathematicians' version). I can see theNatJoin
lhs ofREMOVE
is commutative, but is the rhs?
??
is alleged to be a set-intersection of something, but there's no visibility of what.COMPOSE
is not in general associative (unlike the mathematician's version), but someCOMPOSE
s are associative -- for example(r1 COMPOSE s1) COMPOSE s2
providings1, s2
have no attributes in common.If SIRA_PRISE has
COMPOSE
, does it attempt to optimise expressions using it? How do your users know you didn't mess up some of the side-conditions for big/hairy expressions?
Quote from Erwin on May 14, 2021, 4:42 pmQuote from AntC on May 14, 2021, 3:42 amwrt
REMOVE
, Appendix A is hand-wavey about how to get from a projection in Tutorial D to an equivalentALL BUT
toREMOVE
. It is also hand-wavey about how to get fromCOMPOSE
to the neededREMOVE
: how does the algebra calculate the attributes in common between two operands beingCOMPOSE
d? (IOW what is a "macro" operator in this sense? That goes back to your extended discussion about macros on another thread, and at least needs some sort of program reflection to run after the compiler has inferred the heading of the expressions that are operands toCOMPOSE
.)The compiler knows about the headings of the operands, and perhaps holds that information in the form of a relation with heading {attributename:attributenamedomain, typename:typenamedomain} and applies, for example, semijoin or semiminus, on the two relation values at hand, to reliably and deterministically compute the results you say they are "hand-wavey" about.
(IOW a "macro" operator in this sense is some operator on some data types that happens to be used by the compiler for doing the job it is supposed to do. There is no implication that within the set of all operators used by either the compiler or any of the application programs, "being a macro operator" (thus giving rise to assessable predicates like "X is a 'macro' operator and Y is too") is a characteristic that creates equivalence classes within that set. Creates >1 distinct equivalence classes, that is.)
That all seems so blatantly obvious and trivial that I'm wondering what the hell the problem is.
As an example of what A doesn't do, consider the properties of COMPOSE
:
(r1 NatJoin r2) COMPOSE (s1 NatJoin s2) ≡ (((r1 NatJoin r2) ^ (s1 NatJoin s2)) REMOVE ?)
- I can reshuffle the lhs of
REMOVE
(and in general the lhs expression might be bigger/hairier. Can I then push-REMOVE
-through stuff? COMPOSE
is claimed to be commutative (unlike the mathematicians' version). I can see theNatJoin
lhs ofREMOVE
is commutative, but is the rhs?
??
is alleged to be a set-intersection of something, but there's no visibility of what.COMPOSE
is not in general associative (unlike the mathematician's version), but someCOMPOSE
s are associative -- for example(r1 COMPOSE s1) COMPOSE s2
providings1, s2
have no attributes in common.
If SIRA_PRISE has COMPOSE
, does it attempt to optimise expressions using it? How do your users know you didn't mess up some of the side-conditions for big/hairy expressions?
Quote from AntC on May 15, 2021, 3:09 amQuote from Vadim on May 14, 2021, 6:38 pmQuote from Erwin on May 14, 2021, 5:36 pmDUM === {} , DEE === {{}}
DUM is the empty set of empty sets, DEE is the set that has the empty set [empty set of tuples, that is] for its member.
Requiring the empty set to be equal to a nonempty set is the most perfect example of mathematical Harakiri I can imagine.
A relation is a pair, not a set. Therefore, you must have meant
[latex] DUM \leftrightarrow \langle \{\}, \{\}\rangle [/latex]
[latex] DEE \leftrightarrow \langle \{\}, \{ \langle\rangle \}\rangle [/latex]
Then, even if you use Kuratovski pair for binary tuple, you still have to define what is a zilliary tuple. (BTW, Kuratovski pair has virtually no utility in mathematics).
Just false. Practical mathematicians/algebraists rely on Kuratovski equivalence all the time -- even if they don't explicitly mention it. Relationalists particularly -- you'll see correspondence with D.L. Childs about how it applies.
Kuratovski is a proof of capability. We can happily carry on using our familiar ordered pairs, and know they can be represented purely as sets, so we aren't needing any further mechanisms. In TTM it's particularly easy: the only tuples we have are:
- relation values as pairs of (heading, body) -- the two of which are easily distinguishable because a heading is a set of pairs whereas a body is a set of sets of pairs. (The only ambiguous case is
DUM
, being a pair of empty sets, but then the components of the pair are the same, so no confusion). Headings are of different 'sort' vs bodies, and are (usually) simpler, so the obvious candidate to make a Kuratovski{H, {H, B}}
.- In a heading's
<A, T>
pairs, the elements are of different 'sort's, and theA
is atomic, so the obvious candidate to make a Kuratovski{A, {A, T}}
.- In a body's tuple's
<A, v>
pairs, the elements are again of different 'sort's, and theA
is atomic, so the obvious candidate to make a Kuratovski{A, {A, v}}
.
Quote from Vadim on May 14, 2021, 6:38 pmQuote from Erwin on May 14, 2021, 5:36 pmDUM === {} , DEE === {{}}
DUM is the empty set of empty sets, DEE is the set that has the empty set [empty set of tuples, that is] for its member.
Requiring the empty set to be equal to a nonempty set is the most perfect example of mathematical Harakiri I can imagine.
A relation is a pair, not a set. Therefore, you must have meant
DUM \leftrightarrow \langle \{\}, \{\}\rangle
DEE \leftrightarrow \langle \{\}, \{ \langle\rangle \}\rangle
Then, even if you use Kuratovski pair for binary tuple, you still have to define what is a zilliary tuple. (BTW, Kuratovski pair has virtually no utility in mathematics).
Just false. Practical mathematicians/algebraists rely on Kuratovski equivalence all the time -- even if they don't explicitly mention it. Relationalists particularly -- you'll see correspondence with D.L. Childs about how it applies.
Kuratovski is a proof of capability. We can happily carry on using our familiar ordered pairs, and know they can be represented purely as sets, so we aren't needing any further mechanisms. In TTM it's particularly easy: the only tuples we have are:
- relation values as pairs of (heading, body) -- the two of which are easily distinguishable because a heading is a set of pairs whereas a body is a set of sets of pairs. (The only ambiguous case is
DUM
, being a pair of empty sets, but then the components of the pair are the same, so no confusion). Headings are of different 'sort' vs bodies, and are (usually) simpler, so the obvious candidate to make a Kuratovski{H, {H, B}}
. - In a heading's
<A, T>
pairs, the elements are of different 'sort's, and theA
is atomic, so the obvious candidate to make a Kuratovski{A, {A, T}}
. - In a body's tuple's
<A, v>
pairs, the elements are again of different 'sort's, and theA
is atomic, so the obvious candidate to make a Kuratovski{A, {A, v}}
.
Quote from AntC on May 15, 2021, 6:19 amQuote from Vadim on May 14, 2021, 11:59 pmQuote from AntC on May 14, 2021, 10:50 pmwrt the point at hand, I'll cite your own definition for
NotMatching
(generalised difference, you used\
operator). That has two equations, neither of which have bare(x \ y)
on one side of the equation:
(x ^ y) v (x \ y) = x
.(x ^ y) ^ (x \ y) = (x ^ y) ^ DUM
.Yes, because the generalized difference is not a derived operation in the Relational Lattice with "standard" signature [latex]\{\wedge,\vee,R_{00}\}[/latex].
If that's the only place
DUM
appears (or rather(... NatJoin DUM)
acting as equivalent to myemptify( )
, then we do at least get some nice properties, because of the associativity/etc ofNatJoin
.We also get some nice equalities around emptifying relations (proven by Prover9):
x \ x = x NatJoin DUM; x \ DEE = x NatJoin DUM; DUM \ x = DUM; x \ DUM = x; ...
But (and this is why I abandoned takingDUM
as primitive,
- I can't prove
DEE ≠ DUM
, and Mace4 is happy to produce a model in which they're equal.- Even if I give an axiom to say they're unequal, I can't prove
DUM
is adjacent toDEE
(that is, interval one below lattice top).- I can't get Mace4 to even produce a model such that any other element
x
adjacent toDEE
hasx ^ DUM
adjacent toDUM
.- So every pair of (reasonably complicated) query expressions that I thought should be equal, Mace4 kept producing counter-models that couldn't possibly be models of a structure of relations -- even allowing for degenerate structures like all relations empty other than
DEE
, or all attribute types having a single possible value.- Trying to tell it that there should be a Boolean Algebra from
x
tox ^ DUM
, or fromDUM
to lattice bottom (yourR10
) didn't stop it producing nonsensical models. (I say "didn't stop" meaning often Mace4 just ran for half an hour, producing nothing.)It is the derived operation in the algebra with signature [latex]\{\wedge,\vee,\lhd NOT\rhd\}[/latex]:
x <NotMatching> y = <NOT>(<NOT>(x) v (y ^ x)).x <NotMatching> y = <NOT>(<NOT>(x) v (y ^ x)).x <NotMatching> y = <NOT>(<NOT>(x) v (y ^ x)).(Tested in QBQL, not in Prover9/Mace4)
I think that's going to suffer the same difficulties, due to the insufficient definedness of
<NOT>
. "Tested in QBQL" means exercised in some tool with impenetrable semantics, coded by someone who already knows what answer they want, and full of ad-hockery and rule-of-thumb black arts looking specifically for<NOT>
. If<NOT>
is a valid approach, prove it in a theorem prover where we can get a verifiable proof.
Quote from Vadim on May 14, 2021, 11:59 pmQuote from AntC on May 14, 2021, 10:50 pmwrt the point at hand, I'll cite your own definition for
NotMatching
(generalised difference, you used\
operator). That has two equations, neither of which have bare(x \ y)
on one side of the equation:
(x ^ y) v (x \ y) = x
.(x ^ y) ^ (x \ y) = (x ^ y) ^ DUM
.Yes, because the generalized difference is not a derived operation in the Relational Lattice with "standard" signature \{\wedge,\vee,R_{00}\}.
If that's the only place DUM
appears (or rather (... NatJoin DUM)
acting as equivalent to my emptify( )
, then we do at least get some nice properties, because of the associativity/etc of NatJoin
.
We also get some nice equalities around emptifying relations (proven by Prover9): x \ x = x NatJoin DUM; x \ DEE = x NatJoin DUM; DUM \ x = DUM; x \ DUM = x; ...
But (and this is why I abandoned taking DUM
as primitive,
- I can't prove
DEE ≠ DUM
, and Mace4 is happy to produce a model in which they're equal. - Even if I give an axiom to say they're unequal, I can't prove
DUM
is adjacent toDEE
(that is, interval one below lattice top). - I can't get Mace4 to even produce a model such that any other element
x
adjacent toDEE
hasx ^ DUM
adjacent toDUM
. - So every pair of (reasonably complicated) query expressions that I thought should be equal, Mace4 kept producing counter-models that couldn't possibly be models of a structure of relations -- even allowing for degenerate structures like all relations empty other than
DEE
, or all attribute types having a single possible value. - Trying to tell it that there should be a Boolean Algebra from
x
tox ^ DUM
, or fromDUM
to lattice bottom (yourR10
) didn't stop it producing nonsensical models. (I say "didn't stop" meaning often Mace4 just ran for half an hour, producing nothing.)
It is the derived operation in the algebra with signature \{\wedge,\vee,\lhd NOT\rhd\}:
x <NotMatching> y = <NOT>(<NOT>(x) v (y ^ x)).x <NotMatching> y = <NOT>(<NOT>(x) v (y ^ x)).x <NotMatching> y = <NOT>(<NOT>(x) v (y ^ x)).(Tested in QBQL, not in Prover9/Mace4)
I think that's going to suffer the same difficulties, due to the insufficient definedness of <NOT>
. "Tested in QBQL" means exercised in some tool with impenetrable semantics, coded by someone who already knows what answer they want, and full of ad-hockery and rule-of-thumb black arts looking specifically for <NOT>
. If <NOT>
is a valid approach, prove it in a theorem prover where we can get a verifiable proof.
Quote from Erwin on May 15, 2021, 6:43 amQuote from AntC on May 15, 2021, 2:49 amIf SIRA_PRISE has
COMPOSE
, does it attempt to optimise expressions using it? How do your users know you didn't mess up some of the side-conditions for big/hairy expressions?How does any software user know his software is absolutely flawless ? How does any software user know the 'vendor' (quotes because I want to include the non-commercial software providers as well) didn't mess up ? How does any user of any non-software product (a car, a dishwasher, a heating installation, ...) has been proven to be fault-free ?
I've personally seen the IBM z/OS COBOL compiler (and in COBOL land, that's THE ultimate reference) generate infinite loops on a simple MOVE. Where are the 'proofs' that DB2 didn't mess up ? Where are the 'proofs' that Access didn't mess up ? Where are the 'proofs' that Excel didn't mess up ? Where are the 'proofs' that SAS didn't mess up ? Did you ever get to see any such proof ? I didn't. And don't expect to either. If any such proofs exist, why are those vendors still spending money to test their software ? If any such proofs exist, why are those vendors still spending money on bug reporting facilities for their users ?
"Proving" that software "didn't mess up" is a hopeless endeavour. And the users will have to learn to live with that. "Know you didn't mess up" comes from using it and not running into errors all the time. And users can get to 60% confidence, or 90%, or 99%, or 99.99%, but never 100%. Even Tony Hoare once reflected that "the world doesn't seem to suffer that much from the problem we were trying to solve", that "problem" being "proving program correctness".
"Proving" that software "didn't mess up" is often also a futile endeavour. You could "prove" that an arithmetic operation yields the "correct" result, assuming integer arithmetic modulo 65536 or some such. Meaning it is "correct" to have 32767 + 1 = -32768. At which point the end user just says it's wrong/irrelevant because that's not his arithmetic and possibly declares you nuts.
As for SIRA_PRISE : no rewrites at the source level, so no errors introduced by flawed rewrite mechanisms. Also, it's just a data server (any SQL DBMS sans stored procedures and the like), so there's only DML to process and the closest thing to 'executable code' produced by the "compiler" is the particular chaining of computation (RA computation) and data access strategies associated with each operator invocation that is part of a particular given expression. 'Optimisation' is targeted at minimising physical file page accesses and so neither "memory consumption" nor "CPU time" are an optimisation target (other than that the strategies used by the engine are of course themselves as minimalistic as they can in those respects).
If you frown, you might appreciate the fact that my primary goal was and has always been to sort out the CREATE ASSERTION thing. Which I managed to achieve and for which I *did* do the maths. D&D have seen those maths, and so did Adrian Hudnott, but I don't know what became of him. I hope he managed to achieve his life goals too and is now happy.
Quote from AntC on May 15, 2021, 2:49 amIf SIRA_PRISE has
COMPOSE
, does it attempt to optimise expressions using it? How do your users know you didn't mess up some of the side-conditions for big/hairy expressions?
How does any software user know his software is absolutely flawless ? How does any software user know the 'vendor' (quotes because I want to include the non-commercial software providers as well) didn't mess up ? How does any user of any non-software product (a car, a dishwasher, a heating installation, ...) has been proven to be fault-free ?
I've personally seen the IBM z/OS COBOL compiler (and in COBOL land, that's THE ultimate reference) generate infinite loops on a simple MOVE. Where are the 'proofs' that DB2 didn't mess up ? Where are the 'proofs' that Access didn't mess up ? Where are the 'proofs' that Excel didn't mess up ? Where are the 'proofs' that SAS didn't mess up ? Did you ever get to see any such proof ? I didn't. And don't expect to either. If any such proofs exist, why are those vendors still spending money to test their software ? If any such proofs exist, why are those vendors still spending money on bug reporting facilities for their users ?
"Proving" that software "didn't mess up" is a hopeless endeavour. And the users will have to learn to live with that. "Know you didn't mess up" comes from using it and not running into errors all the time. And users can get to 60% confidence, or 90%, or 99%, or 99.99%, but never 100%. Even Tony Hoare once reflected that "the world doesn't seem to suffer that much from the problem we were trying to solve", that "problem" being "proving program correctness".
"Proving" that software "didn't mess up" is often also a futile endeavour. You could "prove" that an arithmetic operation yields the "correct" result, assuming integer arithmetic modulo 65536 or some such. Meaning it is "correct" to have 32767 + 1 = -32768. At which point the end user just says it's wrong/irrelevant because that's not his arithmetic and possibly declares you nuts.
As for SIRA_PRISE : no rewrites at the source level, so no errors introduced by flawed rewrite mechanisms. Also, it's just a data server (any SQL DBMS sans stored procedures and the like), so there's only DML to process and the closest thing to 'executable code' produced by the "compiler" is the particular chaining of computation (RA computation) and data access strategies associated with each operator invocation that is part of a particular given expression. 'Optimisation' is targeted at minimising physical file page accesses and so neither "memory consumption" nor "CPU time" are an optimisation target (other than that the strategies used by the engine are of course themselves as minimalistic as they can in those respects).
If you frown, you might appreciate the fact that my primary goal was and has always been to sort out the CREATE ASSERTION thing. Which I managed to achieve and for which I *did* do the maths. D&D have seen those maths, and so did Adrian Hudnott, but I don't know what became of him. I hope he managed to achieve his life goals too and is now happy.
Quote from AntC on May 15, 2021, 10:17 amQuote from Erwin on May 15, 2021, 6:43 amQuote from AntC on May 15, 2021, 2:49 amIf SIRA_PRISE has
COMPOSE
, does it attempt to optimise expressions using it? How do your users know you didn't mess up some of the side-conditions for big/hairy expressions?"Proving" that software "didn't mess up" is a hopeless endeavour.
I think you might be missing a memo . We use as many tools as possible to get assurance. An automated proof system is relatively cheap to adopt for something that's mission-critical. But yes cost must be commensurate with risk. I'm not expecting "absolutely flawless". I am expecting a vendor/software supplier understands the risks I'm taking on in using their software. I guess there's no risk I'm going to use your software; or even breath a word of it to my clients. I'll be warning them off using it.
How does any software user know his software is absolutely flawless ? How does any software user know the 'vendor' (quotes because I want to include the non-commercial software providers as well) didn't mess up ? How does any user of any non-software product (a car, a dishwasher, a heating installation, ...) has been proven to be fault-free ?
You drive conservatively; you don't put your grandma's heirloom willow pattern dinner service into the dishwasher, but only the cheap china; you don't leave the logburner unattended for long periods. With a DBMS it's less easy to be that sort of conservative.
If you frown, you might appreciate the fact that my primary goal was and has always been to sort out the CREATE ASSERTION thing.
ASSERTIONs are statements in logic. I would have thought a logic tool abundantly helpful.
Which I managed to achieve and for which I *did* do the maths. D&D have seen those maths, ...
D&D are not theorem-provers. You are not a theorem-prover. Your doing the maths -- even with asterisks around -- is more fallible than a theorem prover.
Quote from Erwin on May 15, 2021, 6:43 amQuote from AntC on May 15, 2021, 2:49 amIf SIRA_PRISE has
COMPOSE
, does it attempt to optimise expressions using it? How do your users know you didn't mess up some of the side-conditions for big/hairy expressions?"Proving" that software "didn't mess up" is a hopeless endeavour.
I think you might be missing a memo . We use as many tools as possible to get assurance. An automated proof system is relatively cheap to adopt for something that's mission-critical. But yes cost must be commensurate with risk. I'm not expecting "absolutely flawless". I am expecting a vendor/software supplier understands the risks I'm taking on in using their software. I guess there's no risk I'm going to use your software; or even breath a word of it to my clients. I'll be warning them off using it.
How does any software user know his software is absolutely flawless ? How does any software user know the 'vendor' (quotes because I want to include the non-commercial software providers as well) didn't mess up ? How does any user of any non-software product (a car, a dishwasher, a heating installation, ...) has been proven to be fault-free ?
You drive conservatively; you don't put your grandma's heirloom willow pattern dinner service into the dishwasher, but only the cheap china; you don't leave the logburner unattended for long periods. With a DBMS it's less easy to be that sort of conservative.
If you frown, you might appreciate the fact that my primary goal was and has always been to sort out the CREATE ASSERTION thing.
ASSERTIONs are statements in logic. I would have thought a logic tool abundantly helpful.
Which I managed to achieve and for which I *did* do the maths. D&D have seen those maths, ...
D&D are not theorem-provers. You are not a theorem-prover. Your doing the maths -- even with asterisks around -- is more fallible than a theorem prover.
Quote from AntC on May 15, 2021, 10:27 amQuote from AntC on May 15, 2021, 6:19 amQuote from Vadim on May 14, 2021, 11:59 pmQuote from AntC on May 14, 2021, 10:50 pmwrt the point at hand, I'll cite your own definition for
NotMatching
(generalised difference, you used\
operator). That has two equations, neither of which have bare(x \ y)
on one side of the equation:
(x ^ y) v (x \ y) = x
.(x ^ y) ^ (x \ y) = (x ^ y) ^ DUM
.Yes, because the generalized difference is not a derived operation in the Relational Lattice with "standard" signature [latex]\{\wedge,\vee,R_{00}\}[/latex].
If that's the only place
DUM
appears (or rather(... NatJoin DUM)
acting as equivalent to myemptify( )
, then we do at least get some nice properties, because of the associativity/etc ofNatJoin
.BTW, replacing
... ^ DUM
in that definition with... \ DEE
yields just the same equalities (below, mutatis mutandis, oh plus you have to say that\
is associative). So I can use... \ DEE
to act asemptify( )
. That's a bit of a surprise because\
appears on both sides of the definition.We also get some nice equalities around emptifying relations (proven by Prover9):
x \ x = x NatJoin DUM; x \ DEE = x NatJoin DUM; DUM \ x = DUM; x \ DUM = x; ...
But (and this is why I abandoned takingDUM
as primitive,
- I can't prove
DEE ≠ DUM
, and Mace4 is happy to produce a model in which they're equal.- Even if I give an axiom to say they're unequal, I can't prove
DUM
is adjacent toDEE
(that is, interval one below lattice top).- I can't get Mace4 to even produce a model such that any other element
x
adjacent toDEE
hasx ^ DUM
adjacent toDUM
.- So every pair of (reasonably complicated) query expressions that I thought should be equal, Mace4 kept producing counter-models that couldn't possibly be models of a structure of relations -- even allowing for degenerate structures like all relations empty other than
DEE
, or all attribute types having a single possible value.- Trying to tell it that there should be a Boolean Algebra from
x
tox ^ DUM
, or fromDUM
to lattice bottom (yourR10
) didn't stop it producing nonsensical models. (I say "didn't stop" meaning often Mace4 just ran for half an hour, producing nothing.)And
... \ DEE
fails to prove all those expected inequalities, so it's no worse. Calling your thingR00
seems to me false advertising: it's behaviour doesn't correspond to a relation value with empty (0
) heading or empty (0
) body.
Quote from AntC on May 15, 2021, 6:19 amQuote from Vadim on May 14, 2021, 11:59 pmQuote from AntC on May 14, 2021, 10:50 pmwrt the point at hand, I'll cite your own definition for
NotMatching
(generalised difference, you used\
operator). That has two equations, neither of which have bare(x \ y)
on one side of the equation:
(x ^ y) v (x \ y) = x
.(x ^ y) ^ (x \ y) = (x ^ y) ^ DUM
.Yes, because the generalized difference is not a derived operation in the Relational Lattice with "standard" signature \{\wedge,\vee,R_{00}\}.
If that's the only place
DUM
appears (or rather(... NatJoin DUM)
acting as equivalent to myemptify( )
, then we do at least get some nice properties, because of the associativity/etc ofNatJoin
.
BTW, replacing ... ^ DUM
in that definition with ... \ DEE
yields just the same equalities (below, mutatis mutandis, oh plus you have to say that \
is associative). So I can use ... \ DEE
to act as emptify( )
. That's a bit of a surprise because \
appears on both sides of the definition.
We also get some nice equalities around emptifying relations (proven by Prover9):
x \ x = x NatJoin DUM; x \ DEE = x NatJoin DUM; DUM \ x = DUM; x \ DUM = x; ...
But (and this is why I abandoned takingDUM
as primitive,
- I can't prove
DEE ≠ DUM
, and Mace4 is happy to produce a model in which they're equal.- Even if I give an axiom to say they're unequal, I can't prove
DUM
is adjacent toDEE
(that is, interval one below lattice top).- I can't get Mace4 to even produce a model such that any other element
x
adjacent toDEE
hasx ^ DUM
adjacent toDUM
.- So every pair of (reasonably complicated) query expressions that I thought should be equal, Mace4 kept producing counter-models that couldn't possibly be models of a structure of relations -- even allowing for degenerate structures like all relations empty other than
DEE
, or all attribute types having a single possible value.- Trying to tell it that there should be a Boolean Algebra from
x
tox ^ DUM
, or fromDUM
to lattice bottom (yourR10
) didn't stop it producing nonsensical models. (I say "didn't stop" meaning often Mace4 just ran for half an hour, producing nothing.)
And ... \ DEE
fails to prove all those expected inequalities, so it's no worse. Calling your thing R00
seems to me false advertising: it's behaviour doesn't correspond to a relation value with empty (0
) heading or empty (0
) body.