The Forum for Discussion about The Third Manifesto and Related Matters

Looking for a reference/piece of terminology: 'generative' dependencies(?)

Sorry this is a bit vague, I'm looking for a reference/explanation to help me be less vague.

EQDs, MVDs and JDs have the property that if the content of the relvars in the schema doesn't conform to some such dependency, you can create tuples to make it conform. I think the 'chase' does that. (I say "create" because deleting some tuples might work instead, but that will probably cause some other violation.) An IND (aka Foreign Key Constraint) also requires key values to exist in the referenced relvar.

In contrast, FDs don't give any way to create tuples; if two tuples don't conform, an FD won't tell you which is right; adding a tuple won't help.

I think this tuple-creating property of EQDs, MVDs, JDs is called 'generative'.
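To make the contrast concrete, here is a minimal Python sketch, under assumptions not in the post: a relation over hypothetical attributes (A, B, C) is a set of tuples, the MVD is A ->> B, and the FD is A -> B. It illustrates a single chase-style step, not the full chase procedure.

```python
from itertools import product

# Hypothetical relation over attributes (A, B, C); the data is illustrative only.
r = {("a1", "b1", "c1"), ("a1", "b2", "c2")}

def mvd_implied(rel):
    """Tuples implied by the MVD A ->> B but absent from rel: for each pair
    of tuples agreeing on A, take B from the first and C from the second."""
    implied = {(a1, b1, c2)
               for (a1, b1, _), (a2, _, c2) in product(rel, repeat=2)
               if a1 == a2}
    return implied - rel

def fd_violations(rel):
    """Pairs of tuples violating the FD A -> B (same A, different B)."""
    return {(t, u) for t, u in product(rel, repeat=2)
            if t[0] == u[0] and t[1] < u[1]}

missing = mvd_implied(r)   # the tuples the MVD says must also be present
repaired = r | missing     # a relation value that satisfies the MVD
```

In this sketch `mvd_implied(repaired)` is empty, whereas every pair in `fd_violations(r)` is still a violation in `repaired`: a superset of a relation that violates an FD still violates it.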

There's a hint in the Darwen, Date, Fagin PJNF paper: Section 1.1 Related work "a minimal subset of the tuples of r that generates r via the chase."

What am I half-remembering; and where am I remembering it from?

Quote from AntC on April 26, 2020, 11:46 am

Sorry that I can't make much sense of your question. There's no such thing as "creating a tuple", any more than there is such a thing as creating a number.

JDs give a generalization of MVDs, which give a generalization of FDs.  So if we just think of JDs we can be sure of covering that field.  A JD in relvar R is eliminated by decomposition to give at least two relvars, the union of whose headings is the heading of R.  The relations to be assigned to those relvars are the obvious projections on R.  Those projections derive sets of tuples from an existing set of tuples.  Perhaps you think of storing those tuples in the database as "creating tuples", but I would advise against use of such terminology as possibly creating confusion.
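As a rough illustration of the decomposition described above, the following Python sketch checks whether a relation value equals the join of its projections. The representation (tuples as frozensets of attribute/value pairs), the attribute names A, B, C and the helper names are all assumptions made for the example.

```python
from functools import reduce

def tup(**attrs):
    """A tuple as a frozenset of (attribute, value) pairs (hypothetical helper)."""
    return frozenset(attrs.items())

def project(rel, attrs):
    """Projection of rel on the given attribute names."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}

def join(r1, r2):
    """Natural join: merge every pair of tuples that agree on common attributes."""
    out = set()
    for t in r1:
        for u in r2:
            merged = dict(t)
            if all(merged.get(a, v) == v for a, v in u):
                out.add(frozenset({**merged, **dict(u)}.items()))
    return out

def rejoin(rel, components):
    """Join of the projections of rel on each component of a JD."""
    return reduce(join, (project(rel, c) for c in components))

def satisfies_jd(rel, components):
    return rejoin(rel, components) == rel

jd = [{"A", "B"}, {"A", "C"}]
r_ok = {tup(A=1, B=1, C=1), tup(A=1, B=2, C=1)}
r_bad = {tup(A=1, B=1, C=1), tup(A=1, B=2, C=2)}
```

Here `r_ok` satisfies the JD, while `rejoin(r_bad, jd) - r_bad` yields exactly the tuples that the JD implies but `r_bad` lacks.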

Your comment about FDs "requiring" keys to exist is not exactly incorrect but it is a strange statement to make, considering that it's a mathematical certainty that every relvar has at least one key.  Perhaps you are thinking of SQL, which violates the rule that the entire heading necessarily constitutes a superkey.  By definition a superkey is a superset of a key, so at the very least the heading itself satisfies the uniqueness property of a key (and is a subset of itself, of course).  And if no proper subset of the heading is a key, then the heading is the only key.

I haven't properly studied any theory surrounding EQDs, but it's clear to me that some can be dealt with by the opposite of decomposition.  Suppose, for example, that we decompose CJD's S relvar (suppliers with numbers, names, statuses, and cities) into 6NF (so at least three relvars, each keyed on {SNO}).   If it is the case in the real world that every supplier who has a name also has a status, then we need a constraint requiring the projections of two of those relvars on {SNO} always to be equal.  If we combine {SNO, SNAME} and {SNO, STATUS} into a single relvar, using JOIN to obtain the relation to be assigned, then we have eliminated that EQD.

I hope this helps.

Hugh

Coauthor of The Third Manifesto and related books.
Quote from Hugh on April 27, 2020, 12:16 pm
I've just realised that my last paragraph, on EQDs, contains a mistake. The sentence "If it is the case in the real world that every supplier who has a name also has a status, then we need a constraint requiring the projections of two of those relvars on {SNO} always to be equal. " should have been "If it is the case in the real world that at all times every supplier who has a name also has a status and every supplier who has a status also has a name, then we need a constraint requiring the projections of two of those relvars on {SNO} always to be equal. "

Hugh
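The corrected EQD can be sketched in Python as follows. This is only an illustration: relations are sets of tuples with SNO first, the helper names are hypothetical, and the supplier values are sample data rather than anything stated in the thread.

```python
# Two of the 6NF relvars, each keyed on SNO (illustrative values).
s_name = {("S1", "Smith"), ("S2", "Jones")}
s_status = {("S1", 20), ("S2", 10)}

def snos(rel):
    """Projection of a relation on {SNO} (SNO is the first component)."""
    return {t[0] for t in rel}

def eqd_holds(r1, r2):
    """The EQD: the projections of the two relations on {SNO} are equal."""
    return snos(r1) == snos(r2)

def combine(names, statuses):
    """JOIN of the two relations on SNO; relies on SNO being a key of each."""
    status_of = dict(statuses)
    return {(sno, name, status_of[sno])
            for sno, name in names if sno in status_of}

combined = combine(s_name, s_status)

# A supplier with a name but no status violates the EQD, and cannot be
# represented at all in the combined relvar.
s_name_plus = s_name | {("S3", "Blake")}
```

With `s_name_plus`, `eqd_holds` is false and `combine` returns the same relation as before, so S3 simply has no place in the combined design.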

Thank you Hugh, and apologies I've been slow in replying. I've been scouring my resources to jog my memory, but no fresh illumination so far.

I agree "creating a tuple" is sloppy language. So I think the term is 'generate'; I'm not sure you'll think that less sloppy. Wikipedia on JDs says "A table T is subject to a join dependency if T can always be recreated by joining multiple tables ..."; on the Chase says "the projection of a relation schema ... can be recovered by rejoining the projections." The Tableau method seems to be all about recreating (or recovering) tuples.

Then if we're 'recreating' or 'recovering' a relation value, we must be (re)creating the tuple content of that value; which some normalisation/vertical partitioning/projection has cast asunder.

JDs give a generalization of MVDs, which give a generalization of FDs.  So if we just think of JDs we can be sure of covering that field.  A JD in relvar R is eliminated by decomposition to give at least two relvars, the union of whose headings is the heading of R.  The relations to be assigned to those relvars are the obvious projections on R.  Those projections derive sets of tuples from an existing set of tuples.  Perhaps you think of storing those tuples in the database as "creating tuples", but I would advise against use of such terminology as possibly creating confusion.

No I'm not thinking of concrete databases nor storing anything in them. This is about the abstract properties of schemata and dependencies.

Your comment about FDs "requiring" keys to exist ...

No, I didn't say that of FDs; I said it of INDs. If some S# appears in relvar SP, that requires the S# also to appear in S. But it doesn't tell us anything about the SNAME, STATUS, CITY to appear in the tuple with that S#. What I'm half-remembering said that, whereas JDs generate (recover) whole tuples, INDs 'generate' requirements for values (but not whole tuple content) to appear in some other relvar. More fully: the appearance of values in tuples in the referencing relvar generates such a requirement for the referenced relvar.
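A small Python sketch of that point, with assumed headings S(S#, SNAME, STATUS, CITY) and SP(S#, P#, QTY) and illustrative data: the IND pins down which S# values must appear in S, but leaves the rest of each tuple open.

```python
# Relations as sets of tuples, S# first (illustrative values only).
s = {("S1", "Smith", 20, "London")}
sp = {("S1", "P1", 300), ("S2", "P1", 300)}

def ind_missing(referencing, referenced):
    """S# values present in the referencing relation but absent from the
    referenced one, i.e. the requirements the IND 'generates'."""
    return {t[0] for t in referencing} - {t[0] for t in referenced}

required = ind_missing(sp, s)

# Any tuple with the required S# satisfies the IND; the other attribute
# values are not determined by it. Two arbitrary, equally valid choices:
s_fix_1 = s | {("S2", "Jones", 10, "Paris")}
s_fix_2 = s | {("S2", "Anyone", 0, "Anywhere")}
```

Both `s_fix_1` and `s_fix_2` satisfy the IND, which is the sense in which the IND specifies a value requirement rather than whole tuple content.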

is not exactly incorrect but it is a strange statement to make, considering that it's a mathematical certainty that every relvar has at least one key.  Perhaps you are thinking of SQL, which violates the rule that the entire heading necessarily constitutes a superkey.  By definition a superkey is a superset of a key, so at the very least the heading itself satisfies the uniqueness property of a key (and is a subset of itself, of course).  And if no proper subset of the heading is a key, then the heading is the only key.

I haven't properly studied any theory surrounding EQDs, but it's clear to me that some can be dealt with by the opposite of decomposition.  Suppose, for example, that we decompose CJD's S relvar (suppliers with numbers, names, statuses, and cities) into 6NF (so at least three relvars, each keyed on {SNO}).   If it is the case in the real world that every supplier who has a name also has a status, then we need a constraint requiring the projections of two of those relvars on {SNO} always to be equal.  If we combine {SNO, SNAME} and {SNO, STATUS} into a single relvar, using JOIN to obtain the relation to be assigned, then we have eliminated that EQD.

Yes and vice versa, normalisation eliminates JDs. The context of the discussion is unnormalised relvars/relation values. Once you've normalised the schema, of course you no longer have the 'wide' candidate relvars you started with, so you can no longer talk about a JD.

The pragmatic reason for going to 6NF with a physical design is precisely that that sort of EQD doesn't hold "at all times" -- to pick up on your emendation. We want to record SNO's SNAME even though we don't yet know their STATUS.

Quote from AntC on May 5, 2020, 12:20 am
I've received an email from a ghost, who says

Tuple generating vs equality generating dependencies. These are addressed in a lot of textbooks. There is a chapter interrelating a lot of kinds of dependencies including those in Alice. [Chapter 10]

Thank you ghost.

"The fundamental property of all of the dependencies introduced so far is that they essentially say, “The presence of some tuples in the instance implies the presence of certain other tuples in the instance, or implies that certain tuple components are equal.” In the case of jd’s and mvd’s, the new tuples can be completely specified in terms of the old tuples, but for ind’s this is not the case." [start of Section 10.1]

Talk of 'new tuples' and 'old tuples' seems to me just as sloppy as 'creating tuples'. Maybe: 'given tuples' vs 'implied tuples'/attribute values?

"Tuple generating versus equality generating: A tuple-generating dependency (tgd) is a dependency in which ..."

I won't try to summarise that definition, it needs too much material/you can read that Chapter for yourselves.
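For orientation only, and not as a substitute for the chapter's definition, the usual first-order shape of these dependencies can be sketched in LaTeX as below. The relation R(A, B, C) and the variable names are assumptions chosen for the example.

```latex
% General shape: a conjunction of relation atoms implies a conclusion
\forall \bar{x} \; \bigl( \varphi(\bar{x}) \rightarrow \exists \bar{z} \; \psi(\bar{x}, \bar{z}) \bigr)

% Equality-generating: the conclusion is an equality.
% Example: the FD A -> B on R(A, B, C)
\forall a, b, c, b', c' \; \bigl( R(a,b,c) \land R(a,b',c') \rightarrow b = b' \bigr)

% Tuple-generating: the conclusion is a conjunction of relation atoms.
% Example: the MVD A ->> B on R(A, B, C)
\forall a, b, c, b', c' \; \bigl( R(a,b,c) \land R(a,b',c') \rightarrow R(a,b,c') \bigr)
```

In the FD case the premise tuples force two components to be equal; in the MVD case they force a further tuple to be present, built entirely from values already in the premise.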

 

The pragmatic reason for going to 6NF with a physical design is precisely that that sort of EQD doesn't hold "at all times" -- to pick up on your emendation. We want to record SNO's SNAME even though we don't yet know their STATUS.

The ghost reveals their identity:

No, normalization is not for that, that is a misconception. If you needed a relvar with more tuples than a projection of a relvar you are normalizing then the latter relvar is not an adequate design. People just might notice that they have that problem while normalizing. But any projection of the latter could have that problem regardless of projections that come to notice while normalizing.

 

Sorry that I somehow missed your reply on April 26th, AntC. Anyway, it seems that somebody has invented some new terminology for concepts that we already had (more than?) enough terminology for. It looks to me as though "tgd" is equivalent to JD.

Hugh

Quote from Hugh on May 6, 2020, 11:42 am

Anyway, it seems that somebody has invented some new terminology for concepts that we already had (more than?) enough terminology for.  It looks to me that "tgd" is equivalent to JD.

 

Not equivalent: all JDs are 'tgd's, but not vice versa. Alice Chapter 10 is taking "A Larger Perspective" [the Chapter's title] over dependencies in general; and providing "A Unifying Framework" [Section 1's title] of expressing all dependencies in first-order logic sentences of a canonical form.

Then there are sentences you could express in that form, and that count as 'tgd's, but aren't JDs. OTOH whether they'd be useful/intuitive/realistically applicable/etc is left as an exercise for the reader. It all seems rather moot when you're lucky to get more in an average DBMS than Domain Dependencies, Tuple Dependencies, Keys and Foreign Key constraints (not other forms of INDs, not Exclusion Dependencies, for example).
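As a hedged illustration of "all JDs are tgds, but not vice versa", here are two sentences of that form in LaTeX. The headings R(A, B, C), S(S#, SNAME, STATUS, CITY) and SP(S#, P#, QTY) are assumed for the example.

```latex
% A JD on R(A, B, C) with components {A,B}, {B,C}, {A,C}, written as a tgd:
\forall a, b, c, a', b', c' \;
  \bigl( R(a,b,c') \land R(a',b,c) \land R(a,b',c) \rightarrow R(a,b,c) \bigr)

% An IND, SP[S\#] \subseteq S[S\#], is also a tgd but not a JD:
% it relates two relation names and needs existential quantifiers
\forall s, p, q \; \bigl( SP(s,p,q) \rightarrow \exists n, t, c \; S(s,n,t,c) \bigr)
```

The existentially quantified variables in the second sentence correspond to the attribute values that the IND leaves unspecified.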