The Forum for Discussion about The Third Manifesto and Related Matters


ANNOUNCE: Rel version 3.013 now available!

Quote from Dave Voorhis on December 16, 2018, 10:37 am
Quote from Brian S on December 16, 2018, 5:27 am

Are you saying that Rel doesn't support relation-valued attributes?

The logic of missing information is actually quite clear: (1) there is an n-ary predicate; (2) it has been asserted that there is an instantiation of that n-ary predicate for a particular key value; (3) it is not clear precisely which instantiation satisfies that n-ary predicate for that particular key value.

There is an existentially-quantified sentence that is true for each and every dependent component: there is one and only one x such that P(k, x).  This is the case regardless of whether it is known what that dependent component actually is.

Rel supports relation-valued attributes. They should not be used to re-invent SQL NULL.

I reject the notion that "missing information" needs to be recorded in a relational database. What does need to be recorded, in some cases, is the entirely non-missing fact that a particular attribute value is not within its usual domain.

For example, we may need to record the fact that a temperature reading did not find a certain value in degrees Celsius, and instead found a broken thermometer or defective sensor. What we actually record in a relational database are readings about the real (or a fictional, but notionally veridical) world, the domains of which can be wholly described by appropriate types.

A truly "missing" reading is not recorded, because by definition, it is missing. If we can record it, then by definition it is not missing.

Thus, "missing" can only be the absence of any record (a notion covered under the Closed World Assumption). Any present record is not missing.

Indeed, a record with "missing" information is a contradiction.

I disagree with your characterization of tuples as readings.  A tuple in an item master relation isn't a reading: it identifies a class of real-world objects which may not yet actually exist.  Until the first article has been produced, the item in question has no substance and exists only conceptually in the mind, and yet it can still appear in the database.

I disagree also with your characterization of the Closed World Assumption:  You can record what you know and you can record what you don't know, but under the Closed World Assumption, you can't record that you don't know what you don't know.

I think it's important to be able to record that there should be a value--even if it is not clear which one it is.  The presence of a tuple with a ZOO component asserts the fact that there should be a value, and a non-empty ZOO component declares which one it is.

The logic is in fact the logic of projection: "there is an x such that P(k, x)."  P(k, a) implies "there is an x such that P(k, x)," but it doesn't change the fact that P(k, a) and "there is an x such that P(k, x)" are distinct facts, and that "there is an x such that P(k, x)" can hold even if it is not clear whether P(k, a) holds, or P(k, b) holds, or P(k, c) holds.  This is the essence of missing information.  Whether information is in fact missing is distinct from the reason it is missing.  Your example above conflates the two concepts.

Quote from Brian S on December 22, 2018, 3:59 am

Identifying a class of real-world objects which may not yet actually exist is just as much capturing data -- record keeping, or taking readings -- about the real (or a fictional) world as any other. If you can write it down on paper -- which you can certainly do with a class of real-world objects which may not yet actually exist -- then it is a reading, even if it is a reading of mind, so to speak.

If you can record what you don't know, then it isn't missing information. It's a record of what you do know, such as the fact that the thermometer is broken rather than emitting Celsius temperature values. At that point, it's notionally equivalent whether you use a ZOO attribute to record the attempted temperature reading -- and you presumably throw away the fact that the thermometer is broken, perhaps because you only care about Celsius temperature values -- or you use a tagged union to record the specific fact that the thermometer is broken. Which approach you choose is a matter of programming language ergonomics and meeting requirements. Conceptually, the same information is recorded (or not recorded) in either approach. There is no logical distinction.

Regarding your last paragraph, if you can record "missing information" then it is, by definition, not missing. Describing it as "missing" is a mistake. I am not conflating non-missing information with missing information because the latter is an oxymoron, a self-contradiction. There is either recorded information or there is no record at all. Recording what you don't know is never "missing information"; it's a record of what you do know.
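
To make the two representations discussed above concrete, here is a minimal sketch in Python rather than Tutorial D. All names (`Celsius`, `SensorFault`, `to_zoo`, the sensor ids) are hypothetical and chosen only for illustration; relations are modelled as sets of tuples and a ZOO component as a frozenset holding at most one value.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical names throughout; nothing here is Rel or Tutorial D syntax.

@dataclass(frozen=True)
class Celsius:
    """A successful reading, in degrees Celsius."""
    degrees: float

@dataclass(frozen=True)
class SensorFault:
    """A non-missing fact: the attempt found a faulty sensor, not a temperature."""
    reason: str

Reading = Union[Celsius, SensorFault]

# Tagged-union design: one tuple per attempted reading, keyed by sensor id.
tagged = {
    ("s1", Celsius(21.5)),
    ("s2", SensorFault("broken thermometer")),
}

def to_zoo(relation):
    """Re-express tagged-union tuples with a ZOO (zero-or-one) component.

    The ZOO component is a frozenset holding at most one Celsius value;
    the fault reason is deliberately discarded.
    """
    return {
        (key, frozenset({r.degrees}) if isinstance(r, Celsius) else frozenset())
        for key, r in relation
    }

zoo = to_zoo(tagged)
```

In both forms a tuple for `s2` is present, so the attempted reading is on record either way. The only difference in this sketch is whether the reason for the absent Celsius value is kept.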

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on December 22, 2018, 11:02 am

I don't see it that way: what is being recorded is that there is a particular value from a specific domain that satisfies the predicate for a given key value.  It's just not clear which particular value satisfies the predicate for that given key value when the ZOO component is empty.  It is known, however, by the user inserting the tuple that there is one.

I think it is a mistake to have multiple relvars that employ the same predicate symbol.  Why?  Because the scope of application of the closed world assumption is each relvar, and as a result, the presence of a tuple in another relvar with the predicate, "there is an x such that P(k, x)...," is a clear contradiction of what can be deduced from the absence of a tuple in a relvar with the predicate P(k, x) under the closed world assumption.  It also negatively affects the semantics of projection:  The projection over k on the relvar with predicate P(k, x) no longer has the predicate "there is an x such that P(k, x)" under the closed world assumption.  The absence of a tuple in that projection for a given k no longer implies that there isn't an x such that P(k, x).  The user has to also look elsewhere in the database to find that out.

It is better, therefore, to house instances of P(k, x) and "there is an x such that P(k, x)" in the same relvar.
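
The projection argument above can be illustrated with a small sketch, again in Python with relations as sets of tuples. The relvar names (`P`, `P_known`, `P_pending`), the helper `project_k`, and the data are all assumptions made up for this example.

```python
# Hypothetical data; relations are modelled as Python sets of tuples.

# Single-relvar design: each tuple is (k, zoo), where zoo is a frozenset
# holding zero or one x values.
P = {
    ("k1", frozenset({"a"})),  # P(k1, a) is recorded
    ("k2", frozenset()),       # some x satisfies P(k2, x); which one is not recorded
}

def project_k(relation):
    """Project a relation of (k, ...) tuples onto its first attribute."""
    return {t[0] for t in relation}

# Split design: known pairs in one relvar, keys whose value exists but is
# unrecorded in another.
P_known = {("k1", "a")}
P_pending = {("k2",)}

single = project_k(P)
split_known_only = project_k(P_known)
split_combined = project_k(P_known) | project_k(P_pending)
```

With the single relvar, the projection over k already answers "is there an x such that P(k, x)?" for every key. In the split design the projection of `P_known` alone omits `k2`, and the same answer is only recovered by also consulting `P_pending`.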

Quote from Brian S on December 28, 2018, 12:10 pm

If you are facing a design situation where "known-ness of x" is a relevant/distinguishing property, then your predicates must express that concept of "known-ness" somewhere, somehow.  Yours don't.

Quote from Erwin on December 29, 2018, 1:28 pm
Quote from Brian S on December 28, 2018, 12:10 pm

 

P(k, a) and "there is an x such that P(k, x)" are distinct facts

I think it is a mistake to have multiple relvars that employ the same predicate symbol.

It is better, therefore, to house instances of P(k, x) and "there is an x such that P(k, x)" in the same relvar.

 

Besides, I think it is a mistake to use the same predicate symbol for distinct predicates.

Quote from Erwin on December 29, 2018, 1:28 pm

I disagree.  There is no need for an "it is known to be the case that ..." modality.  The instantiation of the predicate having just the components of a key as parameters implies that there is exactly one instantiation of each predicate having a proper superset of the components of a key--even if it isn't clear which values actually instantiate the predicate.  It would therefore be redundant to state "it is known to be the case that Pka(k, a)" in addition to stating "Pka(k, a)."   The sentence,

∀k [Pk(k) → ∃1a Pka(k, a)]

is a logical consequence of the structure of a relvar having attributes k and a such that k → a.  The existence of predicates Pk and Pka is also a logical consequence of the functional dependency.  This is the case regardless of whether a value has been supplied for each a.  The modality you're suggesting is therefore superfluous.

The database represents a collection of definite facts.  Indefinite information is not directly represented by values in the database, but is instead implied by the instantiation of predicates like Pk.  For example, given the FD EMP# → SSAN and a tuple containing EMP#(1234), we can deduce that the employee having EMP#(1234) has exactly one social security number, even if no SSAN value has been supplied yet.  The presence of an explicit SSAN value for a given employee doesn't alter the fact that the employee in question has exactly one social security number.  Moreover, the fact that the employee in question has exactly one social security number combines with the fact that the employee has a specified social security number to deny all other instantiations of the predicate "employee EMP# has social security number SSAN" for that employee.  It is not just the closed world assumption at work here, but rather the implication that is a logical consequence of the functional dependency.

Given P(a), and given that there is one and only one x such that P(x), one can conclude that ∀x [P(x) → x = a].

Brian
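
For readers who want the final inference in the post above spelled out, the following is one way to derive it. It assumes the usual unfolding of "there is one and only one x such that P(x)" as an existential claim plus a uniqueness clause; the witness name c is introduced only for the derivation.

```latex
\begin{align*}
&1.\ P(a) && \text{premise} \\
&2.\ \exists x\,[\,P(x) \land \forall y\,(P(y) \rightarrow y = x)\,] && \text{premise, unfolding } \exists!\,x\,P(x) \\
&3.\ P(c) \land \forall y\,(P(y) \rightarrow y = c) && \text{from 2, for some witness } c \\
&4.\ P(a) \rightarrow a = c && \text{from 3, instantiating } y := a \\
&5.\ a = c && \text{from 1 and 4} \\
&6.\ \forall y\,(P(y) \rightarrow y = a) && \text{from 3 and 5, replacing } c \text{ by } a
\end{align*}
```

Step 6 is the conclusion stated in the post: any value satisfying P must equal a.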