Tuples FTW

#1 · April 28, 2021, 6:47 am

On the subject of type system for a language capable of hosting a D (and also for Tailspin, of course), we have observed that Tuples must be structurally typed, i.e. the attributes they contain define them as the product type of those attributes.

As a counterpoint to a previous thread here, I propose that Tuples be THE way to create product types.

The latest insight (or train-wreck) I had is that we should let attributes define types: instead of saying that an attribute has a type, we say that an attribute is a type. I think this fits very nicely with natural join and with the position that things with the same name are the same kind of thing. It also fits the good practice of creating specific types for specific things, even if in Java it is a bit of a pain to create, e.g., a SupplierName class that simply wraps a String.

So we would declare a type called PNAME with base type string and a type called SNAME with base type string, and then simply use them as attributes in Tuples; the type and the attribute have the same name.

Obviously you cannot assign an SNAME value to a PNAME attribute without casting it. But you could e.g. have a COMPANY_NAME and have SNAME be of the type COMPANY_NAME, which would enable assigning between the two.
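
To make the idea concrete, here is a rough sketch in Haskell rather than Tailspin or Tutorial D. The names SName, PName, SupplierPart and asPName are made up for illustration, and the explicit conversion function only stands in for whatever "casting" would turn out to mean.

```haskell
-- Illustrative sketch only: these names are assumptions, not Tailspin or Tutorial D syntax.
newtype SName = SName String deriving (Eq, Show)
newtype PName = PName String deriving (Eq, Show)

-- A tuple whose components are identified by their attribute types alone.
type SupplierPart = (SName, PName)

smithNut :: SupplierPart
smithNut = (SName "Smith", PName "Nut")

-- Rejected by the type checker if uncommented: a PName is not an SName.
-- wrong :: SupplierPart
-- wrong = (PName "Nut", PName "Nut")

-- An explicit conversion stands in for the "cast" from SNAME to PNAME.
asPName :: SName -> PName
asPName (SName s) = PName s

main :: IO ()
main = print (asPName (fst smithNut))  -- prints: PName "Smith"
```

The COMPANY_NAME case (SNAME being a kind of COMPANY_NAME) is not covered by this sketch, since it needs some notion of subtyping or aliasing.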

So, comments? Good idea? Insane idea?

#2 · April 28, 2021, 8:36 am

Quote from Erwin on April 28, 2021, 8:36 am

Quote from tobega on April 28, 2021, 6:47 am

On the subject of type system for a language capable of hosting a D (and also for Tailspin, of course), we have observed that Tuples must be structurally typed, i.e. the attributes they contain define them as the product type of those attributes.

As a counterpoint to a previous thread here, I propose that Tuples be THE way to create product types.

The latest insight (or train-wreck) that I had, is that we should let attributes define types, i.e. instead of saying that an attribute has a type, we say that an attribute is a type. I think this fits very nicely with the natural join and that we take the position that things with the same name are the same kind of things. It also fits in with a good practice to create specific types for specific things, even if in Java it is a bit of a pain to e.g. create a SupplierName class that simply wraps a String.

So we would declare that there is a type called PNAME of the base type string, and the type called SNAME of the base type string, and you just use them as attributes in the Tuples, the type and the attribute have the same name.

Obviously you cannot assign an SNAME value to a PNAME attribute without casting it. But you could e.g. have a COMPANY_NAME and have SNAME be of the type COMPANY_NAME, which would enable assigning between the two.

So, comments? Good idea? Insane idea?

Types are what they are and are used in TTM how they are because of how they create the connection with logic: a type is the domain from which the free variables in the corresponding predicate draw their values to "generate" the corresponding proposition. An attribute declaration inside a nonscalar type definition has TWO parts: the attribute name (Codd called these "role names"), which establishes user-friendly addressability of the attribute value, and the type name, which establishes the logic domain.

So an attribute declaration is an (attrnm, typenm) pair. What does it look like in your proposal? I cannot tell from the example, because you use the same words for both attribute name and type name. Is it simpler? That seems questionable, because "simpler" must mean ditching one of those two names, which must mean you lose either user-friendly addressability or the very link with the value set (logic domain) itself. Or is it merely notionally equivalent?

An attribute reference is just a mention of the attribute name. Can your proposal make referencing attributes any simpler?

I do note you seem to mention "two types of types": your "new-style" types and "base types". Is that going to make things simpler? I doubt it.

Author of SIRA_PRISE

#3 · April 28, 2021, 8:51 am

Quote from tobega on April 28, 2021, 6:47 am

we should let attributes define types, i.e. instead of saying that an attribute has a type, we say that an attribute is a type.

If an attribute IS a type, then how do you handle cases where a tuple needs multiple attributes of the same type? For example, a Marriage tuple that refers to two Person values?

#4 · April 28, 2021, 8:55 am

Quote from tobega on April 28, 2021, 6:47 am

It also fits in with a good practice to create specific types for specific things, even if in Java it is a bit of a pain to e.g. create a SupplierName class that simply wraps a String.

I actually advocate having exactly such wrapper types in normal programming: lots of types with very specific meanings that typically wrap generic types like String or Integer, so that the wrappers convey more meaning.

There is a common term called "primitive obsession" which describes people who insist on using plain String/int/etc rather than the more semantically strict types described above.

#5 · April 28, 2021, 9:07 am

Quote from Darren Duncan on April 28, 2021, 8:55 am

Quote from tobega on April 28, 2021, 6:47 am

It also fits in with a good practice to create specific types for specific things, even if in Java it is a bit of a pain to e.g. create a SupplierName class that simply wraps a String.

I actually advocate in normal programming to have such very wrapper types. Have lots of types that have very specific meanings and typically wrap generic types like String or Integer, and the wrappers convey more meaning.

There is a common term called "primitive obsession" which describes people who insist on using plain String/int/etc rather than the more semantically strict types described above.

Agree.

Some years ago, I showed a Rel demo database to a DBA in which I had wrapped all primitive types in user-defined types. I demonstrated how it prevented you from JOINing customer numbers to phone numbers, or multiplying phone numbers by product quantities, etc.

He thought it was marvellous. Apparently a significant source of errors, and of time-consuming development and fixes, was unintended JOINs and operations on what should be incompatible types, particularly when working with large and/or unfamiliar schemas.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#6 · April 28, 2021, 10:11 am

Quote from Darren Duncan on April 28, 2021, 8:55 am

Quote from tobega on April 28, 2021, 6:47 am

It also fits in with a good practice to create specific types for specific things, even if in Java it is a bit of a pain to e.g. create a SupplierName class that simply wraps a String.

I actually advocate in normal programming to have such very wrapper types. Have lots of types that have very specific meanings and typically wrap generic types like String or Integer, and the wrappers convey more meaning.

There is a common term called "primitive obsession" which describes people who insist on using plain String/int/etc rather than the more semantically strict types described above.

How would you do that in a language like Java? It was a feature we used a lot in C (typedefs) but the closest equivalent in Java is seriously clunky.

Andl - A New Database Language - andl.org

#7 · April 28, 2021, 10:36 am

Quote from tobega on April 28, 2021, 6:47 am

On the subject of type system for a language capable of hosting a D (and also for Tailspin, of course), we have observed that Tuples must be structurally typed, i.e. the attributes they contain define them as the product type of those attributes.

As a counterpoint to a previous thread here, I propose that Tuples be THE way to create product types.

The latest insight (or train-wreck) that I had, is that we should let attributes define types, i.e. instead of saying that an attribute has a type, we say that an attribute is a type. I think this fits very nicely with the natural join and that we take the position that things with the same name are the same kind of things. It also fits in with a good practice to create specific types for specific things, even if in Java it is a bit of a pain to e.g. create a SupplierName class that simply wraps a String.

So we would declare that there is a type called PNAME of the base type string, and the type called SNAME of the base type string, and you just use them as attributes in the Tuples, the type and the attribute have the same name.

Obviously you cannot assign an SNAME value to a PNAME attribute without casting it. But you could e.g. have a COMPANY_NAME and have SNAME be of the type COMPANY_NAME, which would enable assigning between the two.

So, comments? Good idea? Insane idea?

I've seen other replies. It doesn't look like a good idea to me but in any case clarification is needed. Please give examples of type definitions for, e.g., SNAME and PNAME, preferably using TD-like syntax. I assume you imagine a relation type definition to be like TD's but with just attribute type names as heading components: REL{SNO, SNAME, CITY

What do you think a value of an attribute type looks like? Please give a literal denoting the supplier name Smith.

What are the implications for the relational RENAME operator?

Hugh

Coauthor of The Third Manifesto and related books.

#8 · April 28, 2021, 10:37 am

Quote from dandl on April 28, 2021, 10:11 am

Quote from Darren Duncan on April 28, 2021, 8:55 am

Quote from tobega on April 28, 2021, 6:47 am

It also fits in with a good practice to create specific types for specific things, even if in Java it is a bit of a pain to e.g. create a SupplierName class that simply wraps a String.

I actually advocate in normal programming to have such very wrapper types. Have lots of types that have very specific meanings and typically wrap generic types like String or Integer, and the wrappers convey more meaning.

There is a common term called "primitive obsession" which describes people who insist on using plain String/int/etc rather than the more semantically strict types described above.

How would you do that in a language like Java? It was a feature we used a lot in C (typedefs) but the closest equivalent in Java is seriously clunky.

In Java (and C# too) it is a bit clunky, but only at the point of defining a 'data dictionary', so to speak, of domain-specific types that wrap (via composition) primitives. Once you've created them, they're straightforward to use. Of course, Java and C# programmers already tend to create classes, leaving primitive types for appropriately primitive purposes.

The "primitive obsession" I've seen is where, say, multiple arrays of primitive types are created intentionally and specifically to avoid creating one class composed of primitive types and multiple instances of that class.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#9 · April 28, 2021, 11:50 am

Quote from AntC on April 28, 2021, 11:50 am

Quote from tobega on April 28, 2021, 6:47 am

On the subject of type system for a language capable of hosting a D (and also for Tailspin, of course), we have observed that Tuples must be structurally typed, i.e. the attributes they contain define them as the product type of those attributes.

As a counterpoint to a previous thread here, I propose that Tuples be THE way to create product types.

Careful: in most languages, 'product type' means an ordered product. Type (Int, Bool) is distinct from (Bool, Int). So do you mean TUPLE{ SNAME 'Acme', PNAME 'Grommet'} is type distinct from TUPLE{ PNAME 'Grommet', SNAME 'Acme'}?
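
A quick sketch of the distinction, assuming simple String wrappers named SName and PName and a heading modelled as a set of attribute names (both are illustrative assumptions, not anyone's actual implementation):

```haskell
import qualified Data.Set as Set

-- Assumed wrapper types, for illustration only.
newtype SName = SName String deriving (Eq, Show)
newtype PName = PName String deriving (Eq, Show)

-- Positional (ordered) product types: these two tuples have different types.
t1 :: (SName, PName)
t1 = (SName "Acme", PName "Grommet")

t2 :: (PName, SName)
t2 = (PName "Grommet", SName "Acme")

-- 't1 == t2' does not even type-check, because (SName, PName) /= (PName, SName).

-- A TTM-style heading viewed as a set of attribute names is order-insensitive.
heading1, heading2 :: Set.Set String
heading1 = Set.fromList ["SNAME", "PNAME"]
heading2 = Set.fromList ["PNAME", "SNAME"]

main :: IO ()
main = print (heading1 == heading2)  -- prints: True
```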

The latest insight (or train-wreck) that I had, is that we should let attributes define types, i.e. instead of saying that an attribute has a type, we say that an attribute is a type. I think this fits very nicely with the natural join and that we take the position that things with the same name are the same kind of things. It also fits in with a good practice to create specific types for specific things, even if in Java it is a bit of a pain to e.g. create a SupplierName class that simply wraps a String.

So we would declare that there is a type called PNAME of the base type string, and the type called SNAME of the base type string, and you just use them as attributes in the Tuples, the type and the attribute have the same name.

No, wrong. We have to cope with ad-hoc attribute naming: type X Int is distinct from X String. Furthermore, we don't want bare String or Int to be allowed as types of attributes. We always want there to be a wrapper, and the wrapper should wrap a single 'payload' type.

Take a look at Haskell's (or most functional languages') 'datatype renamings', section 4.2.3 of the Haskell Report.

newtype X a = X a -- where a (parametric) denotes some arbitrary type

In a nominal typing system, type name X Int is distinct from type Int. The newtype construct says they are to share the same PhysRep.

Obviously you cannot assign an SNAME value to a PNAME attribute without casting it.

Sure, because type SNAME String is distinct from PNAME String. But 'casting' is not the appropriate mechanism here: unwrap the string from one then re-wrap it into the other. In Functional languages that's achieved via 'pattern matching'. And because the compiler knows they're newtypes and therefore share the same PhysRep, that's a no-op. Tutorial D has SNAME FROM ... -- in which presumably unwrap/rewrap is computationally more clunky.
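
A minimal sketch of unwrap/rewrap, assuming parametric newtypes SName and PName in the style of X above. The function names rewrap and rewrapFree are made up for illustration; coerce is the standard function from Data.Coerce.

```haskell
import Data.Coerce (coerce)

-- Assumed wrappers in the style of 'newtype X a = X a'.
newtype SName a = SName a deriving (Eq, Show)
newtype PName a = PName a deriving (Eq, Show)

-- Unwrap by pattern matching on the SName constructor, then re-wrap as PName.
rewrap :: SName String -> PName String
rewrap (SName s) = PName s

-- The same conversion via coerce, which relies on both newtypes
-- sharing the representation of their payload type.
rewrapFree :: SName String -> PName String
rewrapFree = coerce

main :: IO ()
main = do
  print (rewrap (SName "Acme"))                               -- prints: PName "Acme"
  print (rewrap (SName "Acme") == rewrapFree (SName "Acme"))  -- prints: True
```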

But you could e.g. have a COMPANY_NAME and have SNAME be of the type COMPANY_NAME, which would enable assigning between the two.

No, this is nominal typing: if two types are differently named, they are different types, and you can't directly assign between them. Perhaps you mean SNAME String is an alias, aka shorthand, for COMPANY_NAME String (or vice versa)? (See section 4.2.2, type synonym declarations.) [Note **]

So, comments? Good idea? Insane idea?

I've already built a D-alike extension to Haskell using this idea. But Haskell treats its tuples positionally. So it needed ugly generics to treat these two tuples as being under a type-equivalence relationship. (Note I didn't say 'same type'.)

sp = tuple_union (PName "Grommet", SName "Acme", Qty 50) (SName "Jones", Qty 100, PName "Grommet")

In which tuple_union is a function that takes two (positional) Haskell tuples, and returns (an equivalent of) a TTM relation value.

Note ** You could use the type alias idea like this:

type COMPANY_NAME = SName String

That declares a single lexeme COMPANY_NAME as shorthand for a wrapped type (two lexemes). And everywhere your code uses the single lexeme, it's immediately expanded to the two-lexeme form.
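
Spelled out as a small self-contained sketch (the values acme and sameSupplier are hypothetical, added only to show that the alias and the expanded form are interchangeable):

```haskell
-- Assumed wrapper in the style of 'newtype X a = X a'.
newtype SName a = SName a deriving (Eq, Show)

-- The alias: COMPANY_NAME is just shorthand for SName String.
type COMPANY_NAME = SName String

-- Declared with the alias ...
acme :: COMPANY_NAME
acme = SName "Acme"

-- ... and compared against a value declared with the expanded form.
sameSupplier :: SName String -> Bool
sameSupplier s = s == acme

main :: IO ()
main = print (sameSupplier (SName "Acme"))  -- prints: True
```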

#10 · April 28, 2021, 11:58 am

Quote from Dave Voorhis on April 28, 2021, 9:07 am

Quote from Darren Duncan on April 28, 2021, 8:55 am

Quote from tobega on April 28, 2021, 6:47 am

It also fits in with a good practice to create specific types for specific things, even if in Java it is a bit of a pain to e.g. create a SupplierName class that simply wraps a String.

I actually advocate in normal programming to have such very wrapper types. Have lots of types that have very specific meanings and typically wrap generic types like String or Integer, and the wrappers convey more meaning.

There is a common term called "primitive obsession" which describes people who insist on using plain String/int/etc rather than the more semantically strict types described above.

Agree.

Some years ago, I showed a Rel demo database to a DBA in which I had wrapped all primitive types in user-defined types. I demonstrated how it prevented you from JOINing customer numbers to phone numbers, or multiplying phone numbers by product quantities, etc.

He thought it was marvellous. Apparently a significant source of error and time-consuming development and fixes was due to unintended JOINs and operations on what should be incompatible types, particularly when working with large and/or unfamiliar schemas.

Indeed. This is the source of the nostrum 'Natural Join is a disaster waiting to happen.' (They mean in SQL.) If by accident you have column name User or Date on two different tables, Natural Join will join by them even though one is Entered-by User and the other is Approved-by User, or Order-Date vs Delivered-Date.
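
A toy illustration of that failure mode, sketched in Haskell rather than SQL. The representation (a relation as a list of Data.Map tuples) and the attribute names OrderNo, Date, OrderDate and DeliveredDate are assumptions made up for the example.

```haskell
import qualified Data.Map as Map

-- A tuple maps attribute names to values; a relation is a list of tuples.
type Tuple = Map.Map String String

-- Natural join: combine tuples that agree on every attribute name they share.
naturalJoin :: [Tuple] -> [Tuple] -> [Tuple]
naturalJoin r s =
  [ Map.union t u
  | t <- r
  , u <- s
  , and (Map.elems (Map.intersectionWith (==) t u))
  ]

-- Both relations happen to use the generic name "Date".
orders, deliveries :: [Tuple]
orders     = [Map.fromList [("OrderNo", "1"), ("Date", "2021-04-01")]]
deliveries = [Map.fromList [("OrderNo", "1"), ("Date", "2021-04-03")]]

-- The same data with names drawn from a stricter dictionary.
orders', deliveries' :: [Tuple]
orders'     = [Map.fromList [("OrderNo", "1"), ("OrderDate", "2021-04-01")]]
deliveries' = [Map.fromList [("OrderNo", "1"), ("DeliveredDate", "2021-04-03")]]

main :: IO ()
main = do
  print (length (naturalJoin orders deliveries))    -- prints: 0 (joined on Date too)
  print (length (naturalJoin orders' deliveries'))  -- prints: 1 (joined on OrderNo only)
```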

I favour declaring a data dictionary before declaring any tables, and such that attribute names on tables must be drawn from the dictionary. And that no dictionary field be named Date, User, Int, String, Count, Balance, Total, etc.

The Forum for Discussion about The Third Manifesto and Related Matters
