The Forum for Discussion about The Third Manifesto and Related Matters

VSS 8, SQL arrays, and SQL multisets

This is probably obvious to everyone but me, but it seems to me that when emulating SQL on a D database, there is no real difficulty with storing arrays or multisets: represent an array as a 2-attribute RVA with columns INDEX and VALUE, and a multiset likewise as a 2-attribute RVA with columns SYNTHETIC_KEY and VALUE.  Both of these are end runs around 1NF, and that's what RVAs are too.
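As a rough illustration of the encoding proposed above, here is a minimal Python sketch rather than Tutorial D. It models a relation as a frozenset of pairs; the function names are hypothetical, and the attribute names INDEX, SYNTHETIC_KEY and VALUE exist only in the comments.

```python
from collections import Counter

def array_to_relation(items):
    """Encode a list as a set of (INDEX, VALUE) tuples; INDEX is the key.
    Values must be hashable for this sketch to work."""
    return frozenset(enumerate(items))

def relation_to_array(rel):
    """Decode by ordering the tuples on INDEX and projecting VALUE."""
    return [value for _, value in sorted(rel, key=lambda t: t[0])]

def multiset_to_relation(items):
    """Encode a bag as (SYNTHETIC_KEY, VALUE) tuples.
    The key carries no meaning; it only keeps duplicate values distinct."""
    return frozenset(enumerate(items))

def relation_to_multiset(rel):
    """Decode by discarding SYNTHETIC_KEY and counting the values."""
    return Counter(value for _, value in rel)

arr_rel = array_to_relation(["a", "b", "a"])
bag_rel = multiset_to_relation(["x", "y", "x"])
```

Both encodings are ordinary sets of tuples, so duplicates and ordering survive only because of the extra attribute.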

That is one approach, agreed, but there are issues. It involves taking a single value, wrapping it in a tuple, inventing names for the index and value, and then reversing that process to retrieve the required value. You get none of the benefits of the relational form, because you will probably never want to join, project, restrict, etc. on such a relation. And the code tends to be clunky for various common use cases. But it may be the path of least resistance if the language provides little help, as in Tutorial D.

Another way is to treat an array or hashset as a tuple, where the attribute name is the index. This is very similar to the approach used in languages such as JavaScript, Python and Ruby. It does require some language extensions to make it work nicely.
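A small Python sketch of the tuple-style view, in which a dict stands in for a tuple and its keys for attribute names. The helper names are hypothetical, and the remark about run-time attribute names is an assumption about what a language extension would have to cover, not something the post spells out.

```python
def array_as_tuple(items):
    """Build one tuple (modelled as a dict) whose attribute names are 0..n-1."""
    return dict(enumerate(items))

def heading(tup):
    """The tuple's heading, reduced here to its set of attribute names."""
    return frozenset(tup)

colours = array_as_tuple(["red", "green", "blue"])
ages = {"alice": 31, "bob": 27}   # a hash: the attribute names are the keys

i = 1
second = colours[i]               # attribute name computed at run time
```

In this model, arrays of different lengths have different headings and so would be different tuple types, which may be one reason the approach needs help from the language.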

But the easiest way is to treat an array or hashset as a function (operator) of one (or more) arguments, thus placing it in the type system rather than in its own relvar. The implementation of the operator can decide where and how to persist the data values.
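To make the operator view concrete, here is a hedged Python sketch. The names `make_array_operator`, `colour` and `square` are invented for illustration; the point is only that callers see a function from index to value, while the storage choice stays inside the implementation.

```python
from typing import Callable, Dict

def make_array_operator(store: Dict[int, str]) -> Callable[[int], str]:
    """Return an operator mapping index -> value.
    `store` could equally be a file or a table; callers never see it."""
    def element(index: int) -> str:
        return store[index]
    return element

# An 'array' whose values happen to be persisted in a dict.
colour = make_array_operator({0: "red", 1: "green", 2: "blue"})

def square(index: int) -> int:
    """An 'array' with no storage at all: each value is computed on demand."""
    return index * index
```

From the caller's side `colour(1)` and `square(4)` look the same, even though one is stored and the other is computed.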

Yes, you can treat multisets as if they were arrays with a dummy key, but that's neither the only way nor even the best way most of the time. It's usually best to represent them via an iterator, which again means they're best placed in the type system, not directly in their own relvar.
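A minimal sketch of the iterator idea in Python, assuming the multiset is held internally as value/count pairs. The function name `iter_multiset` is hypothetical, and the internal representation is just one option among many.

```python
from collections import Counter
from typing import Iterator

def iter_multiset(counts: Counter) -> Iterator[str]:
    """Yield each value as many times as it occurs.
    No key or index is exposed, and the order is not part of the contract."""
    for value, n in counts.items():
        for _ in range(n):
            yield value

bag = Counter({"x": 2, "y": 1})
items = list(iter_multiset(bag))
```

Consumers only iterate, so no dummy key has to be invented or named.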

Andl - A New Database Language - andl.org

I'm discussing arrays and multisets only in connection with VSS 8, which is about providing a SQL overlay on a D database.  The commentary on VSS8 says about them:

ARRAY Types

The Third Manifesto does not prohibit D from including support for “collection” type generators, so D could legitimately support an ARRAY type generator, and we would certainly expect such support to include everything needed in connection with SQL’s array types. (Of course, if such an ARRAY type generator were indeed supported, it would presumably replace or subsume the limited array support described in Chapter 5. Note, however, that it would still be the case that array variables would not be allowed in the database, thanks to The Information Principle. See RM Prescription 16.)

MULTISET Types

D could support a MULTISET type generator as a direct counterpart to SQL:2003’s new MULTISET type constructor (though with the proviso that multiset variables would not be allowed in the database).

I'm arguing that RVAs provide everything you need to allow ARRAY and MULTISET SQL types in the database as well; no new D types are required.

Quote from johnwcowan on October 15, 2019, 12:45 pm

I'm arguing that RVAs provide everything you need to allow ARRAY and MULTISET SQL types in the database as well; no new D types are required.

An RVA is of a relation type and its values are therefore relations.  You can devise a relational representation for multisets, pairing each tuple with the number of times it occurs (or, less pleasingly, with your suggested system-generated key value), but they won't then correspond to SQL multiset types and you won't have an implementation of VSS8.  A similar comment applies to array types, mutatis mutandis.
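The two relational representations mentioned here can be compared in a short Python sketch. The function names are hypothetical, and reading "less pleasingly" as a point about canonical form is an interpretation rather than something the post states.

```python
from collections import Counter

def counted_relation(items):
    """Encode a bag as a set of (VALUE, COUNT) tuples; VALUE is the key."""
    return frozenset(Counter(items).items())

def keyed_relation(items):
    """Encode a bag as (SYNTHETIC_KEY, VALUE) tuples, keys in arrival order."""
    return frozenset(enumerate(items))

def to_bag(counted):
    """Recover the bag from its (VALUE, COUNT) encoding."""
    return Counter(dict(counted))

first = ["x", "y", "x"]
second = ["x", "x", "y"]   # the same bag, listed in a different order
```

With counts, equal bags always produce equal relations. With generated keys, the same bag can produce different relations depending on how the keys were handed out.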

Hugh

Coauthor of The Third Manifesto and related books.

Quote from johnwcowan on October 15, 2019, 12:45 pm

I'm arguing that RVAs provide everything you need to allow ARRAY and MULTISET SQL types in the database as well; no new D types are required.

And I'm arguing that if you are absolutely desperate and your D of choice has nothing better, you can kind of get by in most situations by faking it using relations and a naming convention.

But given the choice, no one would ever want to. Genuine arrays, hashes (aka dictionaries or content-addressable arrays) and multisets are far better than these weak substitutes.

Andl - A New Database Language - andl.org

If you have primitive arrays, you can create all other data structures on top of them with some ingenuity.  If you have RVAs, you don't actually need primitive arrays.