The Forum for Discussion about The Third Manifesto and Related Matters


Proposed SAFEUNGROUP operator


The UNGROUP operator is not a safe operator; it cannot be applied to just any RVA.  In particular, if the attribute names inside the RVA overlap at all with the attribute names of the relation holding the RVA, then the UNGROUP cannot be done until the conflicting names in the outer relation have been renamed away (since there is, AFAICT, no way to rename the inner ones).  A second rename will then give all the attributes the desired final names.  But this RENAME-UNGROUP-RENAME is both sufficiently annoying and overly general.

Therefore I am suggesting a slightly different operator called SAFEUNGROUP.  It is not entirely safe either, but it's a lot better.  The idea here is that if the RVA is named foo and its inner attributes are {bar, baz, zam}, then they are ungrouped using the names {foo_bar, foo_baz, foo_zam}.  In particular, if there are two RVAs in the same relation that have the same type, then ungrouping both of them with UNGROUP is definitely going to produce a problem, but with SAFEUNGROUP it will work.

Of course if by chance the outer relation already has attributes with these names, SAFEUNGROUP will fail.  If another character such as $ were reserved for the purpose and never used for user-chosen attribute names, that source of conflict would be eliminated.
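For readers who want to see the naming rule concretely, here is a minimal Python sketch of the proposal. It is not Tutorial D and not anyone's actual implementation: relations are modeled as lists of dicts, an RVA is a nested list of dicts, and the function name safe_ungroup and the sep parameter are made up for illustration.

```python
def safe_ungroup(relation, rva, sep="_"):
    """Ungroup the relation-valued attribute `rva`, prefixing each inner
    attribute name with the RVA's own name (foo + bar -> foo_bar)."""
    result = []
    for outer in relation:
        # Everything in the outer tuple except the RVA itself.
        rest = {k: v for k, v in outer.items() if k != rva}
        for inner in outer[rva]:
            prefixed = {f"{rva}{sep}{k}": v for k, v in inner.items()}
            clash = rest.keys() & prefixed.keys()
            if clash:
                # The residual failure case: the outer relation already
                # has an attribute with one of the generated names.
                raise ValueError(f"attribute name conflict: {sorted(clash)}")
            row = {**rest, **prefixed}
            if row not in result:  # a relation holds no duplicate tuples
                result.append(row)
    return result


# Two RVAs of the same type in one relation.
r = [
    {"id": 1,
     "foo": [{"bar": 10, "baz": 20}],
     "qux": [{"bar": 30, "baz": 40}]},
]
step1 = safe_ungroup(r, "foo")
step2 = safe_ungroup(step1, "qux")
```

Ungrouping both RVAs works here because the generated names differ (foo_bar versus qux_bar). If the outer relation already had an attribute called foo_bar, the call would raise, which is the remaining partiality acknowledged above.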

I came up with what I think is a better alternative years ago.  I made group/ungroup into more fundamental basic operators that are intended to be combined with other operators like project and extend to do everything Tutorial D's group/ungroup do.  My group/ungroup are each unary operators that take a single binary relation with a specific heading and result in a binary relation with a specific heading.  The input to group has 2 TVAs and outputs 1 TVA + 1 RVA, while ungroup does the opposite.  The attribute that remains a TVA is the one being grouped-by, and the one that changes to/from an RVA is the one being grouped/ungrouped.  With this design, group/ungroup are guaranteed to be safe, and any possible collisions are preventable by higher-level code that uses them.

Quote from Darren Duncan on June 29, 2019, 2:15 am

My group/ungroup are each unary operators that take a single binary relation with a specific heading and result in a binary relation with a specific heading.

Makes sense, but what specific heading do you mean?  Are the attribute names of the result specified by the user, as in extend or summarize?

The heading is system-defined, that is, defined by the group/ungroup operators themselves, and the user has no choice.  The headings are { group: TVA, member: TVA } and { group: TVA, members: RVA } respectively.  See also http://muldis.com/Muldis_Data_Language.html for several semi-outdated examples of my group/ungroup operators in use.

Also, in my terminology, user-specified attribute names or the like in extend/summarize are additional arguments, and so when I say unary operator I mean that the single input relation is the only input to the operators, full stop.
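A rough Python sketch of how such fixed-heading operators might behave, based only on the description in this thread and not on the actual Muldis code. Tuples are modeled as dicts and relations as lists of dicts; the data and the function bodies are illustrative assumptions.

```python
def group(relation):
    """{group, member} -> {group, members}.
    Collects the member tuples that share the same group tuple."""
    out = []
    for t in relation:
        for g in out:
            if g["group"] == t["group"]:
                if t["member"] not in g["members"]:
                    g["members"].append(t["member"])
                break
        else:
            # First time this group value is seen.
            out.append({"group": t["group"], "members": [t["member"]]})
    return out


def ungroup(relation):
    """{group, members} -> {group, member}; the inverse direction."""
    out = []
    for t in relation:
        for m in t["members"]:
            row = {"group": t["group"], "member": m}
            if row not in out:
                out.append(row)
    return out


pairs = [
    {"group": {"dept": "A"}, "member": {"emp": 1}},
    {"group": {"dept": "A"}, "member": {"emp": 2}},
    {"group": {"dept": "B"}, "member": {"emp": 3}},
]
grouped = group(pairs)
roundtrip = ungroup(grouped)
```

Because the inner attribute names (dept, emp) stay wrapped inside tuple values, they are never merged into the outer heading, so the operators themselves cannot hit a name collision. Any unwrapping and renaming is left to the surrounding code.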

Quote from johnwcowan on June 29, 2019, 1:28 am

The UNGROUP operator is not a safe operator; it cannot be applied to just any RVA.  In particular, if the attribute names inside the RVA overlap at all with the attribute names of the relation holding the RVA, then the UNGROUP cannot be done until the conflicting names in the outer relation have been renamed away (since there is, AFAICT, no way to rename the inner ones).  A second rename will then give all the attributes the desired final names.  But this RENAME-UNGROUP-RENAME is both sufficiently annoying and overly general.

Therefore I am suggesting a slightly different operator called SAFEUNGROUP.  It is not entirely safe either, but it's a lot better.  The idea here is that if the RVA is named foo and its inner attributes are {bar, baz, zam}, then they are ungrouped using the names {foo_bar, foo_baz, foo_zam}.  In particular, if there are two RVAs in the same relation that have the same type, then ungrouping both of them with UNGROUP is definitely going to produce a problem, but with SAFEUNGROUP it will work.

Of course if by chance the outer relation already has attributes with these names, SAFEUNGROUP will fail.  If another character such as $ were reserved for the purpose and never used for user-chosen attribute names, that source of conflict would be eliminated.

Do not like. Name conflicts can always be detected at compile time. I would prefer a compile-time mechanism to resolve such conflicts by explicitly renaming inner attributes either automatically (bar1) or syntactically (rename bar as new_bar). See the problem, fix the problem, move on.

Andl - A New Database Language - andl.org
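The compile-time point can be illustrated with a small Python sketch that works on headings alone, never on tuples. The function name ungroup_heading and the renames parameter are hypothetical; this is not Andl's or Tutorial D's mechanism, just one way such a check could look.

```python
def ungroup_heading(outer, rva, inner, renames=None):
    """Compute the result heading of an UNGROUP from the headings alone.

    outer   -- attribute names of the relation holding the RVA
    rva     -- name of the relation-valued attribute
    inner   -- attribute names inside the RVA
    renames -- optional mapping of inner names to new names
    Raises ValueError if the result heading would contain a duplicate.
    """
    renames = renames or {}
    renamed_inner = {renames.get(a, a) for a in inner}
    if len(renamed_inner) != len(set(inner)):
        raise ValueError("renaming maps two inner attributes to one name")
    rest = set(outer) - {rva}
    clash = rest & renamed_inner
    if clash:
        raise ValueError(f"attribute name conflict: {sorted(clash)}")
    return rest | renamed_inner


# Conflict resolved by explicitly renaming the inner attribute.
fixed = ungroup_heading({"id", "bar", "foo"}, "foo", {"bar", "baz"},
                        renames={"bar": "new_bar"})
```

Since only attribute names are inspected, the same check can run before any data exists, which is what makes static detection possible.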

Thumbs down: the point about GROUP/UNGROUP is that GROUP is information-preserving, and UNGROUP is its inverse. (That is, if you provide the inverse parameters to UNGROUP.) Yes, you can mess up the UNGROUPing, so it isn't a total function. But that can be detected statically at compile time, as @david-bennett-2 points out.

Not sure what you mean by "(un)safe". Don't you just mean (not) total? Since the proposed SAFEUNGROUP is also potentially non-total, I'd prefer the status quo UNGROUP with static checking.


Sympathise with the problem but not sure it's worth fixing, especially if the fix can't guarantee to be "safe" (I'm uncomfortable with the use of that word, which has other connotations in the database field).  John's example  doesn't work if an attribute named foo_bar already exists.  I'd use RENAME with the PREFIX option, specifying a prefix that I "know" will be "safe".  We introduced the PREFIX and SUFFIX options in an attempt to address the annoying-ness John mentions.

Hugh

 

Coauthor of The Third Manifesto and related books.
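The idea of renaming by prefix can be sketched in Python as follows. This is a loose illustration of the concept only: the helper rename_prefix is hypothetical, and its behavior is an assumption rather than a statement of Tutorial D's exact PREFIX syntax or semantics.

```python
def rename_prefix(relation, old, new):
    """Rename every attribute whose name starts with `old` so that it
    starts with `new` instead. An empty `old` matches every attribute."""
    out = []
    for t in relation:
        row = {}
        for k, v in t.items():
            nk = new + k[len(old):] if k.startswith(old) else k
            if nk in row:
                # Two attributes would end up with the same name.
                raise ValueError(f"attribute name conflict: {nk}")
            row[nk] = v
        out.append(row)
    return out


some = rename_prefix([{"bar": 1, "baz": 2, "id": 3}], "ba", "x_ba")
every = rename_prefix([{"bar": 1}], "", "o_")
```

Choosing the prefix is still the caller's responsibility; the helper only reports a clash when the chosen prefix turns out not to be unique.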
Quote from Hugh on June 29, 2019, 1:19 pm

Sympathise with the problem but not sure it's worth fixing, especially if the fix can't guarantee to be "safe" (I'm uncomfortable with the use of that word, which has other connotations in the database field).

Yes, I should have said "partial", as AntC says.

John's example  doesn't work if an attribute named foo_bar already exists.  I'd use RENAME with the PREFIX option, specifying a prefix that I "know" will be "safe".  We introduced the PREFIX and SUFFIX options in an attempt to address the annoying-ness John mentions.

Ah, I didn't know about those.  Come to think of it, you can use EXTEND to effectively rename the inside attributes.

A more solid gap that I saw mentioned somewhere in DE is that there is no operator that takes a relation of one tuple and returns the tuple.

 

Yes, there is!  TUPLE FROM r.

Hugh

Coauthor of The Third Manifesto and related books.
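In the list-of-dicts modeling used for illustration here, an operator in the spirit of TUPLE FROM r could be sketched as below. The function name tuple_from is made up; the one-tuple requirement is taken from the description of the operator in this thread.

```python
def tuple_from(relation):
    """Return the single tuple of a relation of cardinality one."""
    if len(relation) != 1:
        raise ValueError(
            f"expected exactly one tuple, found {len(relation)}")
    return relation[0]


t = tuple_from([{"id": 1, "name": "Smith"}])
```

The operator is partial in the same sense discussed earlier in the thread: it is undefined for relations with zero tuples or more than one.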
Quote from johnwcowan on June 29, 2019, 1:28 am

... the UNGROUP cannot be done until the conflicting names in the outer relation have been renamed away (since there is, AFAICT, no way to rename the inner ones).  A second rename will then give all the attributes the desired final names.  But this RENAME-UNGROUP-RENAME is both sufficiently annoying and overly general. ...

Not that it will make a substantial difference to the point you're making, but:

TRANSFORM (RVA_HOLDER , (ATTRS_TO_KEEP ... , REPLACING_RVA ( RENAME ( RVA_HELD, (RVA_INNER_ATTR_OLD_NAME , RVA_INNER_ATTR_NEW_NAME) ) ) ) )

The idea is to "replace" the RVA that has the "conflicting" names with an RVA that is a rename of it.  Tutorial D does not really have anything like my TRANSFORM shorthand, but the same effect is achievable with EXTEND (adding an RVA that is the rename of the existing one) followed by PROJECT (projecting the existing one away).

Ah ...  You thought of that in a later post.
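The EXTEND/PROJECT route described above can be sketched in Python, again with relations as lists of dicts and RVAs as nested lists. All function names here are hypothetical, and new_rva is assumed to differ from rva.

```python
def rename(relation, mapping):
    """Plain attribute rename on a relation."""
    return [{mapping.get(k, k): v for k, v in t.items()} for t in relation]


def rename_inner(relation, rva, new_rva, mapping):
    """Replace `rva` with `new_rva`, a renamed copy of the nested relation."""
    out = []
    for t in relation:
        extended = {**t, new_rva: rename(t[rva], mapping)}  # EXTEND step
        del extended[rva]                                   # PROJECT away
        out.append(extended)
    return out


def ungroup(relation, rva):
    """Ordinary ungroup: fails if inner and outer names overlap."""
    out = []
    for t in relation:
        rest = {k: v for k, v in t.items() if k != rva}
        for inner in t[rva]:
            clash = rest.keys() & inner.keys()
            if clash:
                raise ValueError(f"attribute name conflict: {sorted(clash)}")
            row = {**rest, **inner}
            if row not in out:
                out.append(row)
    return out


r = [{"bar": 1, "foo": [{"bar": 10, "baz": 20}]}]
fixed = rename_inner(r, "foo", "foo2", {"bar": "new_bar"})
flat = ungroup(fixed, "foo2")
```

Ungrouping r directly would raise because bar appears both inside and outside the RVA; after the inner rename, a single ungroup suffices and no second outer rename is needed.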
