Proposed SAFEUNGROUP operator
Quote from johnwcowan on June 29, 2019, 1:28 am
The UNGROUP operator is not a safe operator; it cannot be applied to just any RVA. In particular, if the attribute names inside the RVA overlap at all with the attribute names of the relation holding the RVA, then the UNGROUP cannot be done until the conflicting names in the outer relation have been renamed away (since there is, AFAICT, no way to rename the inner ones). A second rename will then give all the attributes the desired final names. But this RENAME-UNGROUP-RENAME is both sufficiently annoying and overly general.
Therefore I am suggesting a slightly different operator called SAFEUNGROUP. It is not entirely safe either, but it's a lot better. The idea here is that if the RVA is named foo and its inner attributes are {bar, baz, zam}, then they are ungrouped using the names {foo_bar, foo_baz, foo_zam}. In particular, if there are two RVAs in the same relation that have the same type, then ungrouping both of them with UNGROUP is definitely going to produce a problem, but with SAFEUNGROUP it will work.
Of course if by chance the outer relation already has attributes with these names, SAFEUNGROUP will fail. If another character such as $ were reserved for the purpose and never used for user-chosen attribute names, that source of conflict would be eliminated.
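The renaming rule proposed above can be sketched in a few lines. This is only a toy Python model, not Tutorial D: a relation is assumed to be a list of dicts, an RVA value a nested list of dicts, and `safe_ungroup`, `sep` and the second RVA name `qux` are made up for the illustration.

```python
# Toy model (an assumption for illustration): a relation is a list of dicts,
# and an RVA value is a nested list of dicts. Not Tutorial D.

def safe_ungroup(relation, rva_name, sep="_"):
    """Ungroup rva_name, renaming each inner attribute to <rva_name><sep><attr>."""
    result = []
    for outer in relation:
        rest = {k: v for k, v in outer.items() if k != rva_name}
        for inner in outer[rva_name]:
            prefixed = {f"{rva_name}{sep}{k}": v for k, v in inner.items()}
            clash = sorted(set(prefixed) & set(rest))
            if clash:
                # Still partial: a generated name can collide with an outer one.
                raise ValueError(f"attribute name conflict: {clash}")
            row = {**rest, **prefixed}
            if row not in result:  # a relation holds no duplicate tuples
                result.append(row)
    return result


# Two RVAs of the same type, foo and qux (qux is a made-up name).
r = [{"id": 1,
      "foo": [{"bar": 1, "baz": 2, "zam": 3}],
      "qux": [{"bar": 7, "baz": 8, "zam": 9}]}]

flat = safe_ungroup(safe_ungroup(r, "foo"), "qux")
print(flat)
```

Ungrouping both RVAs succeeds here because the generated names differ. The same call raises an error if the outer relation already has an attribute such as foo_bar, which is the remaining failure case the post describes. Note that this toy version only notices a clash when it meets a tuple; a real implementation would check the headings.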
Quote from Darren Duncan on June 29, 2019, 2:15 am
I came up with what I think is a better alternative years ago. I made group/ungroup into more fundamental basic operators that are intended to be combined with other operators like project and extend to do everything Tutorial D's group/ungroup do. My group/ungroup are each unary operators that take a single binary relation with a specific heading and result in a binary relation with a specific heading. The input to group has 2 TVAs and outputs 1 TVA + 1 RVA, while ungroup does the opposite. The attribute that remains a TVA is the one being grouped-by, and the one that changes to/from an RVA is the one being grouped/ungrouped. With this design, group/ungroup are guaranteed to be safe, and any possible collisions are preventable by higher-level code that uses them.
Quote from johnwcowan on June 29, 2019, 2:52 am
Quote from Darren Duncan on June 29, 2019, 2:15 am
My group/ungroup are each unary operators that take a single binary relation with a specific heading and result in a binary relation with a specific heading.
Makes sense, but what specific heading do you mean? Are the attribute names of the result specified by the user, as in extend or summarize?
Quote from Darren Duncan on June 29, 2019, 4:36 am
Quote from johnwcowan on June 29, 2019, 2:52 am
Quote from Darren Duncan on June 29, 2019, 2:15 am
My group/ungroup are each unary operators that take a single binary relation with a specific heading and result in a binary relation with a specific heading.
Makes sense, but what specific heading do you mean? Are the attribute names of the result specified by the user, as in extend or summarize?
The heading is system-defined, that is, defined by the group/ungroup operators themselves, and the user has no choice. The headings are { group: TVA, member: TVA } and { group: TVA, members: RVA } respectively. See also http://muldis.com/Muldis_Data_Language.html for several semi-outdated examples of my group/ungroup operators in use.
Also, in my terminology, user-specified attribute names or the like in extend/summarize are additional arguments, and so when I say unary operator I mean that the single input relation is the only input to the operators, full stop.
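A rough sketch of operators with those fixed headings, in a toy Python model rather than Muldis Data Language: a relation is assumed to be a list of dicts, a tuple-valued attribute a dict, and the sample attribute names `dept` and `emp` are invented. Because the input and output headings are fixed, no name collision can arise inside the operators themselves.

```python
# Toy model (assumption): relation = list of dicts, TVA value = dict,
# RVA value = list of dicts. Headings are fixed, as described in the post:
#   group:   {group, member}  -> {group, members}
#   ungroup: {group, members} -> {group, member}

def group(relation):
    """Collect the member tuples that share the same group tuple."""
    result = []
    for t in relation:
        for g in result:
            if g["group"] == t["group"]:
                if t["member"] not in g["members"]:
                    g["members"].append(t["member"])
                break
        else:
            result.append({"group": t["group"], "members": [t["member"]]})
    return result


def ungroup(relation):
    """Emit one {group, member} tuple per member of each members RVA."""
    result = []
    for t in relation:
        for m in t["members"]:
            row = {"group": t["group"], "member": m}
            if row not in result:  # keep the result duplicate-free
                result.append(row)
    return result


# Hypothetical data: the caller has already wrapped attributes into tuples.
pairs = [
    {"group": {"dept": "D1"}, "member": {"emp": "Ann"}},
    {"group": {"dept": "D1"}, "member": {"emp": "Bob"}},
    {"group": {"dept": "D2"}, "member": {"emp": "Cy"}},
]
grouped = group(pairs)
print(grouped)
print(ungroup(grouped) == pairs)
```

Wrapping the attributes into the two tuples beforehand, and unwrapping them afterwards, is the job of the surrounding project/extend code, which is where any renaming would have to happen.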
Quote from dandl on June 29, 2019, 6:47 am
Quote from johnwcowan on June 29, 2019, 1:28 am
The UNGROUP operator is not a safe operator; it cannot be applied to just any RVA. In particular, if the attribute names inside the RVA overlap at all with the attribute names of the relation holding the RVA, then the UNGROUP cannot be done until the conflicting names in the outer relation have been renamed away (since there is, AFAICT, no way to rename the inner ones). A second rename will then give all the attributes the desired final names. But this RENAME-UNGROUP-RENAME is both sufficiently annoying and overly general.
Therefore I am suggesting a slightly different operator called SAFEUNGROUP. It is not entirely safe either, but it's a lot better. The idea here is that if the RVA is named foo and its inner attributes are {bar, baz, zam}, then they are ungrouped using the names {foo_bar, foo_baz, foo_zam}. In particular, if there are two RVAs in the same relation that have the same type, then ungrouping both of them with UNGROUP is definitely going to produce a problem, but with SAFEUNGROUP it will work.
Of course if by chance the outer relation already has attributes with these names, SAFEUNGROUP will fail. If another character such as $ were reserved for the purpose and never used for user-chosen attribute names, that source of conflict would be eliminated.
Do not like. Name conflicts can always be detected at compile time. I would prefer a compile-time mechanism to resolve such conflicts by explicitly renaming inner attributes either automatically (bar1) or syntactically (rename bar as new_bar). See the problem, fix the problem, move on.
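The compile-time detection mentioned above needs only the headings, not the data. A minimal sketch, assuming a heading is a Python dict from attribute name to type name, with a nested dict standing for a relation-valued attribute; `ungroup_heading` and the type names are hypothetical.

```python
# Toy model (assumption): a heading maps attribute names to type names;
# a relation-valued attribute maps to its own nested heading.

def ungroup_heading(outer, rva_name):
    """Compute the heading UNGROUP would produce, or report a name conflict."""
    inner = outer[rva_name]
    if not isinstance(inner, dict):
        raise TypeError(f"{rva_name!r} is not relation-valued")
    rest = {n: t for n, t in outer.items() if n != rva_name}
    clash = sorted(set(rest) & set(inner))
    if clash:
        raise ValueError(f"UNGROUP {rva_name}: conflicting attribute names {clash}")
    return {**rest, **inner}


heading = {"id": "INTEGER", "bar": "CHAR",
           "foo": {"bar": "CHAR", "baz": "INTEGER"}}
try:
    ungroup_heading(heading, "foo")
except ValueError as err:
    print(err)  # the conflict on bar is found without touching any tuples

# After the inner bar has been renamed to new_bar, the check passes.
fixed = {"id": "INTEGER", "bar": "CHAR",
         "foo": {"new_bar": "CHAR", "baz": "INTEGER"}}
print(ungroup_heading(fixed, "foo"))
```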
Quote from AntC on June 29, 2019, 10:26 am
Thumbs down: the point about GROUP/UNGROUP is that GROUP is information-preserving, and UNGROUP is its inverse. (That is, if you provide the inverse parameters to UNGROUP.) Yes you can mess up the UNGROUPing, so it isn't a total function. But that can be detected statically at compile time, as @david-bennett-2 points out.
Not sure what you mean by "(un)safe". Don't you just mean (not) total? Since the proposed SAFEUNGROUP is also potentially non-total, I'd prefer the status quo UNGROUP with static checking.
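The inverse relationship can be illustrated with a toy Python model (a relation as a list of dicts, an RVA as a nested list of dicts). The function signatures and the sample attribute names `s`, `p` and `parts` are assumptions made for the sketch, not Tutorial D syntax.

```python
# Toy model (assumption): relation = list of dicts, RVA value = list of dicts.

def group(relation, attrs, rva_name):
    """Gather the named attributes into an RVA, one tuple per remaining value."""
    result = []
    for t in relation:
        key = {k: v for k, v in t.items() if k not in attrs}
        inner = {k: t[k] for k in attrs}
        for g in result:
            if all(g[k] == v for k, v in key.items()):
                if inner not in g[rva_name]:
                    g[rva_name].append(inner)
                break
        else:
            result.append({**key, rva_name: [inner]})
    return result


def ungroup(relation, rva_name):
    """Plain UNGROUP: inner attributes keep their names, so it can fail."""
    result = []
    for t in relation:
        rest = {k: v for k, v in t.items() if k != rva_name}
        for inner in t[rva_name]:
            clash = sorted(set(rest) & set(inner))
            if clash:
                raise ValueError(f"attribute name conflict: {clash}")
            row = {**rest, **inner}
            if row not in result:
                result.append(row)
    return result


r = [{"s": "S1", "p": "P1"}, {"s": "S1", "p": "P2"}, {"s": "S2", "p": "P1"}]
g = group(r, ["p"], "parts")
print(g)
print(ungroup(g, "parts") == r)  # grouping then ungrouping restores r
```

Because plain UNGROUP leaves the inner names unchanged, it restores the original heading directly; an ungroup that prefixes the names would need a further rename to get back to r.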
Quote from Hugh on June 29, 2019, 1:19 pm
Quote from johnwcowan on June 29, 2019, 1:28 am
The UNGROUP operator is not a safe operator; it cannot be applied to just any RVA. In particular, if the attribute names inside the RVA overlap at all with the attribute names of the relation holding the RVA, then the UNGROUP cannot be done until the conflicting names in the outer relation have been renamed away (since there is, AFAICT, no way to rename the inner ones). A second rename will then give all the attributes the desired final names. But this RENAME-UNGROUP-RENAME is both sufficiently annoying and overly general.
Therefore I am suggesting a slightly different operator called SAFEUNGROUP. It is not entirely safe either, but it's a lot better. The idea here is that if the RVA is named foo and its inner attributes are {bar, baz, zam}, then they are ungrouped using the names {foo_bar, foo_baz, foo_zam}. In particular, if there are two RVAs in the same relation that have the same type, then ungrouping both of them with UNGROUP is definitely going to produce a problem, but with SAFEUNGROUP it will work.
Of course if by chance the outer relation already has attributes with these names, SAFEUNGROUP will fail. If another character such as $ were reserved for the purpose and never used for user-chosen attribute names, that source of conflict would be eliminated.
Sympathise with the problem but not sure it's worth fixing, especially if the fix can't guarantee to be "safe" (I'm uncomfortable with the use of that word, which has other connotations in the database field). John's example doesn't work if an attribute named foo_bar already exists. I'd use RENAME with the PREFIX option, specifying a prefix that I "know" will be "safe". We introduced the PREFIX and SUFFIX options in an attempt to address the annoying-ness John mentions.
Hugh
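As I understand the PREFIX option, it renames every attribute whose name begins with one string so that it begins with another. The sketch below models that reading in Python (relation as a list of dicts); `rename_prefix`, the prefix `outer_ba` and the sample data are assumptions for illustration, and the exact Tutorial D semantics should be checked against the language definition.

```python
# Toy model (assumption): relation = list of dicts, RVA value = list of dicts.

def rename_prefix(relation, old, new):
    """Rename each attribute starting with `old` so that it starts with `new`."""
    result = []
    for t in relation:
        renamed = {}
        for name, value in t.items():
            target = new + name[len(old):] if name.startswith(old) else name
            if target in renamed:
                raise ValueError(f"rename produces duplicate attribute {target!r}")
            renamed[target] = value
        result.append(renamed)
    return result


def ungroup(relation, rva_name):
    """Plain UNGROUP: fails if inner and outer attribute names overlap."""
    result = []
    for t in relation:
        rest = {k: v for k, v in t.items() if k != rva_name}
        for inner in t[rva_name]:
            clash = sorted(set(rest) & set(inner))
            if clash:
                raise ValueError(f"attribute name conflict: {clash}")
            row = {**rest, **inner}
            if row not in result:
                result.append(row)
    return result


# The outer bar conflicts with the inner bar, so rename the outer one first.
r = [{"bar": 1, "foo": [{"bar": 10, "baz": 20}]}]
step1 = rename_prefix(r, "ba", "outer_ba")  # bar -> outer_bar, foo untouched
flat = ungroup(step1, "foo")
print(flat)
```

The prefix is chosen by the person writing the query, so its safety rests on their knowledge of the headings rather than on a fixed naming rule.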
Quote from johnwcowan on June 29, 2019, 3:13 pm
Quote from Hugh on June 29, 2019, 1:19 pm
Sympathise with the problem but not sure it's worth fixing, especially if the fix can't guarantee to be "safe" (I'm uncomfortable with the use of that word, which has other connotations in the database field).
Yes, I should have said "partial", as AntC says.
John's example doesn't work if an attribute named foo_bar already exists. I'd use RENAME with the PREFIX option, specifying a prefix that I "know" will be "safe". We introduced the PREFIX and SUFFIX options in an attempt to address the annoying-ness John mentions.
Ah, I didn't know about those. Come to think of it, you can use EXTEND to effectively rename the inside attributes.
A more solid gap that I saw mentioned somewhere in DE is that there is no operator that takes a relation of one tuple and returns the tuple.
Quote from Hugh on June 29, 2019, 4:48 pm
Quote from johnwcowan on June 29, 2019, 3:13 pm
Quote from Hugh on June 29, 2019, 1:19 pm
Sympathise with the problem but not sure it's worth fixing, especially if the fix can't guarantee to be "safe" (I'm uncomfortable with the use of that word, which has other connotations in the database field).
Yes, I should have said "partial", as AntC says.
John's example doesn't work if an attribute named foo_bar already exists. I'd use RENAME with the PREFIX option, specifying a prefix that I "know" will be "safe". We introduced the PREFIX and SUFFIX options in an attempt to address the annoying-ness John mentions.
Ah, I didn't know about those. Come to think of it, you can use EXTEND to effectively rename the inside attributes.
A more solid gap that I saw mentioned somewhere in DE is that there is no operator that takes a relation of one tuple and returns the tuple.
Yes, there is! TUPLE FROM r.
Hugh
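What an operator of that kind does can be sketched in the same toy Python model (relation as a list of dicts). `tuple_from` is a made-up name standing in for TUPLE FROM, and raising an error for any cardinality other than one is an assumption of this sketch.

```python
# Toy model (assumption): relation = list of dicts, tuple = dict.

def tuple_from(relation):
    """Return the sole tuple of a relation that contains exactly one tuple."""
    if len(relation) != 1:
        raise ValueError(f"expected exactly one tuple, got {len(relation)}")
    return relation[0]


only = tuple_from([{"id": 1, "name": "Ann"}])
print(only)
```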
Quote from Erwin on June 29, 2019, 7:23 pm
Quote from johnwcowan on June 29, 2019, 1:28 am
... the UNGROUP cannot be done until the conflicting names in the outer relation have been renamed away (since there is, AFAICT, no way to rename the inner ones). A second rename will then give all the attributes the desired final names. But this RENAME-UNGROUP-RENAME is both sufficiently annoying and overly general. ...
Not that it will make a substantial difference to the point you're making, but:
TRANSFORM (RVA_HOLDER , (ATTRS_TO_KEEP ... , REPLACING_RVA ( RENAME ( RVA_HELD, (RVA_INNER_ATTR_OLD_NAME , RVA_INNER_ATTR_NEW_NAME) ) ) ) )
The idea is to "replace" the RVA as it occurs with the "conflicting" names, with an RVA that is a rename of the one which has the conflicts. Tutorial D does not really have anything like my TRANSFORM shorthand, but it's achievable with EXTEND (with the RVA that is the rename of the existing one)/PROJECT (the existing one away).
Ah ... You thought of that in a later post.
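The EXTEND/PROJECT route for renaming inner attributes can be sketched as three steps in a toy Python model (relation as a list of dicts, RVA as a nested list of dicts). All function names, the temporary RVA name `foo2` and the new inner name `foo_bar` are hypothetical; this is neither Tutorial D nor the TRANSFORM shorthand.

```python
# Toy model (assumption): relation = list of dicts, RVA value = list of dicts.

def extend_with_renamed_rva(relation, rva_name, new_rva_name, old, new):
    """EXTEND step: add an RVA that is a rename of the existing one."""
    return [
        {**t, new_rva_name: [{(new if k == old else k): v for k, v in inner.items()}
                             for inner in t[rva_name]]}
        for t in relation
    ]


def project_away(relation, name):
    """PROJECT step: drop one attribute, removing any duplicate tuples."""
    result = []
    for t in relation:
        row = {k: v for k, v in t.items() if k != name}
        if row not in result:
            result.append(row)
    return result


def ungroup(relation, rva_name):
    """Plain UNGROUP: fails if inner and outer attribute names overlap."""
    result = []
    for t in relation:
        rest = {k: v for k, v in t.items() if k != rva_name}
        for inner in t[rva_name]:
            clash = sorted(set(rest) & set(inner))
            if clash:
                raise ValueError(f"attribute name conflict: {clash}")
            row = {**rest, **inner}
            if row not in result:
                result.append(row)
    return result


r = [{"bar": 1, "foo": [{"bar": 10, "baz": 20}]}]
step1 = extend_with_renamed_rva(r, "foo", "foo2", "bar", "foo_bar")
step2 = project_away(step1, "foo")
flat = ungroup(step2, "foo2")
print(flat)
```

Here the outer attribute keeps its name and only the conflicting inner one changes, which avoids the second rename of the RENAME-UNGROUP-RENAME sequence.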