The Forum for Discussion about The Third Manifesto and Related Matters

Tuples FTW

Quote from tobega on April 29, 2021, 5:22 pm
Quote from Hugh on April 29, 2021, 3:06 pm
Quote from tobega on April 28, 2021, 3:25 pm
Quote from Hugh on April 28, 2021, 2:32 pm
Quote from Hugh on April 28, 2021, 10:36 am
Quote from tobega on April 28, 2021, 6:47 am

On the subject of type system for a language capable of hosting a D (and also for Tailspin, of course), we have observed that Tuples must be structurally typed, i.e. the attributes they contain define them as the product type of those attributes.

As a counterpoint to a previous thread here, I propose that Tuples be THE way to create product types.

The latest insight (or train-wreck) that I had is that we should let attributes define types, i.e. instead of saying that an attribute has a type, we say that an attribute is a type. I think this fits very nicely with the natural join and with the position we take that things with the same name are the same kind of thing. It also fits in with the good practice of creating specific types for specific things, even if in Java it is a bit of a pain to e.g. create a SupplierName class that simply wraps a String.

So we would declare that there is a type called PNAME of the base type string, and a type called SNAME of the base type string, and you just use them as attributes in the Tuples; the type and the attribute have the same name.

Obviously you cannot assign an SNAME value to a PNAME attribute without casting it. But you could e.g. have a COMPANY_NAME and have SNAME be of the type COMPANY_NAME, which would enable assigning between the two.
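
A minimal Python sketch of the idea, not the actual design: each attribute name is modelled as a class wrapping a base value. The `AttributeType` base class and the `assign` helper are hypothetical names introduced here, and treating SNAME as a subclass of COMPANY_NAME is only one possible reading of "SNAME be of the type COMPANY_NAME" (it permits assigning an SNAME where a COMPANY_NAME is expected, not the reverse).

```python
class AttributeType:
    """Hypothetical base class: an attribute name that is also a type."""

    base = str  # assumption: every attribute type here is based on a string

    def __init__(self, value):
        if not isinstance(value, self.base):
            raise TypeError(f"{type(self).__name__} requires a {self.base.__name__}")
        self.value = value

    def __repr__(self):
        return f"{type(self).__name__}({self.value!r})"


class COMPANY_NAME(AttributeType):
    pass


class SNAME(COMPANY_NAME):  # one reading of "SNAME is of the type COMPANY_NAME"
    pass


class PNAME(AttributeType):
    pass


def assign(attribute_type, value):
    """Accept value for an attribute only if it has that type or a subtype of it."""
    if not isinstance(value, attribute_type):
        raise TypeError(
            f"cannot assign {type(value).__name__} to {attribute_type.__name__}"
        )
    return value


smith = SNAME("Smith")
assign(COMPANY_NAME, smith)                  # allowed: SNAME is a COMPANY_NAME
as_pname = assign(PNAME, PNAME(smith.value)) # allowed: explicit cast via the base value
# assign(PNAME, smith) would raise TypeError: no implicit SNAME-to-PNAME assignment
```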

So, comments? Good idea? Insane idea?

I've seen other replies.  It doesn't look like a good idea to me but in any case clarification is needed.  Please give examples of type definitions for, e.g., SNAME and PNAME, preferably using TD-like syntax.  I assume you imagine a relation type definition to be like TD's but with just attribute type names as heading components: REL{SNO, SNAME, CITY

What do you think a value of an attribute type looks like? Please give a literal denoting the supplier name Smith.

What are the implications for the relational RENAME operator?

Hugh

P.S.  Perhaps more appropriate, what about EXTEND?  In particular, I have a query that involves extension with concatenation of FirstName and LastName (with a blank in between).  How is that done?

Answering once for both of your posts.

I wouldn't necessarily change anything from the TD syntax, if that is the syntax you want, so

TUPLE { S# S#, SNAME NAME, STATUS INTEGER, CITY CHAR }

TUPLE { S# S#('S1'), SNAME NAME('Smith'), STATUS 20, CITY 'London' }

But what would happen is that we would now automatically also have the types SNAME, STATUS and CITY defined, and the "real" type of the CITY attribute would be CITY, but it could be assigned values of the representation type CHAR.

I believe I answered the RENAME case in my reply to Darren.

As for EXTEND, I suppose that would work similarly in that you would have to explicitly assert the conversion to appropriate types. So FirstName is what? CHAR? And LastName might also be CHAR. So let's look at TD syntax:

EXTEND a ADD (CHAR(FirstName)+' '+CHAR(LastName) AS FullName)

If FullName has not been previously defined, there will now be a type "FullName CHAR", and any attribute FullName can be assumed to be of type FullName, without needing specification. But it wouldn't necessarily hurt to respecify "FullName CHAR". However, trying to specify e.g. "FullName INT" would not be allowed anywhere; you would have to forgo that option.

Oh, I didn't see this one until after I had responded to Dave Voorhis. So I didn't misunderstand but you have now clarified. The question now concerns the scope of this defined-on-the-fly type FullName. It has to be local to the expression in which it is defined, otherwise I would object strongly. And if it is local to the expression, I can't see much point.

Hugh

There wouldn't be much point in having it just local to the expression. I propose that the specific type definitions for each attribute that Dave provided are a "best practice" or at least a "good practice", so we can just let them be automatically defined from the attribute definition. This would lead the developer in the right direction for the price of a slight inconvenience on the rare (?) occasion when you would have wanted an attribute with the same name but of a different type.

Thank you for confirming the pointlessness of a type definition local to the expression in which it is defined.  In that case, this particular aspect of your idea bothers me greatly.

First, it gets into the dodgy area of expression evaluation having side-effects. That alone is sufficient grounds for rejection afaiacs.

Secondly, what happens if "exp1 AS FullName" and "exp2 AS FullName" specifications appear in the same overall relation expression, where exp1 and exp2 are of different types?

Thirdly, what happens if "exp2 AS FullName" appears in a subsequent statement (perhaps a year or so later, if you are really thinking of global scope)?

It is unthinkable, to my mind, that somebody innocently entering an ad hoc query might invalidate somebody else's ad hoc query in this way.
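
To make the second and third objections concrete, here is a small Python sketch assuming a single global registry of attribute-defined types. `TypeRegistry` and `define` are hypothetical names, not part of Tutorial D or of the proposal; the point is only that whichever query runs first fixes the type for everyone who comes later.

```python
class TypeRegistry:
    """Hypothetical global table mapping an attribute name to its base type."""

    def __init__(self):
        self._types = {}

    def define(self, name, base):
        # The first definition wins; setdefault returns the existing entry, if any.
        existing = self._types.setdefault(name, base)
        if existing is not base:
            raise TypeError(
                f"{name} is already defined as {existing.__name__}, not {base.__name__}"
            )
        return existing


registry = TypeRegistry()

# First ad hoc query: "exp1 AS FullName", where exp1 is a character string.
registry.define("FullName", str)

# Respecifying the same type is harmless.
registry.define("FullName", str)

# A later, unrelated query: "exp2 AS FullName", where exp2 is an integer.
try:
    registry.define("FullName", int)
    conflict = None
except TypeError as err:
    conflict = str(err)

print(conflict)
```

Under these assumptions the second author's query is rejected purely because of an earlier side effect, which is the situation the objection describes.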

Sorry if points like this have already been raised and possibly addressed.  I know there has been a lot of correspondence but my brief glances have given me the impression that your idea has received some interest and even sympathy in some quarters.  So I might be missing something.

Hugh

Hugh

Coauthor of The Third Manifesto and related books.
Quote from Hugh on April 30, 2021, 1:12 pm

Would something like this be more palatable?

DICTIONARY;
  Customer_ID CHAR;
  Customer_Phone INT;
  Invoice_Number INT;
  Invoice_Date Date;
  Amount RATIONAL;
END DICTIONARY;

VAR Customers REAL RELATION {Customer_ID, Customer_Phone} KEY {Customer_ID};
VAR Invoices REAL RELATION {Invoice_Number, Invoice_Date, Customer_ID, Amount} KEY {Invoice_Number};

CONSTRAINT Invoices_Customers_FK  Invoices {Customer_ID} ⊆ Customers {Customer_ID};

The effect of the DICTIONARY block is to define a data dictionary of attributes, which implicitly creates the following types...

TYPE Customer_ID POSSREP {Value CHAR};
TYPE Customer_Phone POSSREP {Value INT}; 
TYPE Invoice_Number POSSREP {Value INT}; 
TYPE Invoice_Date POSSREP {Value Date}; 
TYPE Amount POSSREP {Value RATIONAL};

...and allows declarations like VAR Invoices REAL RELATION {Invoice_Number, Invoice_Date, Customer_ID, Amount} KEY {Invoice_Number}; to be shorthand for:

VAR Invoices REAL RELATION {Invoice_Number Invoice_Number, Invoice_Date Invoice_Date, Customer_ID Customer_ID, Amount Amount} KEY {Invoice_Number};

This allows the compiler to check that we're not going to accidentally multiply a Customer_Phone by an Invoice_Number, etc.
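
A rough Python sketch of the effect described, under stated assumptions: the `DICTIONARY` mapping, `make_type` and `expand_heading` are hypothetical names, Date and RATIONAL are stood in for by `datetime.date` and `fractions.Fraction`, and the check happens at run time here rather than in a compiler.

```python
from datetime import date
from fractions import Fraction

# Stand-in for the DICTIONARY block (the Python base types are assumptions).
DICTIONARY = {
    "Customer_ID": str,
    "Customer_Phone": int,
    "Invoice_Number": int,
    "Invoice_Date": date,
    "Amount": Fraction,
}


def make_type(name, base):
    """Create a wrapper class with a single possrep component called Value."""

    def __init__(self, Value):
        if not isinstance(Value, base):
            raise TypeError(f"{name} requires a {base.__name__} value")
        self.Value = Value

    return type(name, (), {"__init__": __init__})


# One implicitly created type per dictionary entry.
TYPES = {name: make_type(name, base) for name, base in DICTIONARY.items()}


def expand_heading(names):
    """Expand the shorthand {A, B} into {A A, B B} as a name-to-type mapping."""
    return {name: TYPES[name] for name in names}


heading = expand_heading(["Customer_ID", "Customer_Phone"])
phone = heading["Customer_Phone"](5551234)
number = TYPES["Invoice_Number"](33)

try:
    phone * number  # no "*" is defined between the wrapper types
    multiplied = True
except TypeError:
    multiplied = False

# Going through the Value component explicitly is still possible.
product_of_values = phone.Value * number.Value
```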

Ideally, any declaration of the form <identifier> <type> should be replaceable with <dictionary_name>, e.g. this...

VAR Invoice_Number INIT(33);

...is the same as this:

VAR Invoice_Number Invoice_Number INIT(33);

I imagine it would be an entirely optional feature; no existing Tutorial D code would be broken by adding the facility, and anyone who doesn't want to use DICTIONARY and the <dictionary_name> shorthands for <identifier> <type> could ignore them.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

Thinking a bit more on 'DICTIONARY', etc., maybe it would be desirable to be able to do this:

DICTIONARY;
  Customer_ID CHAR;
  Customer_Phone, Customer_Phone2 INT;
  Invoice_Number INT;
  Invoice_Date Date;
  Amount RATIONAL;
END DICTIONARY;

So you can say this...

VAR Customers REAL RELATION {Customer_ID, Customer_Phone, Customer_Phone2} KEY {Customer_ID};

...which is shorthand for:

VAR Customers REAL RELATION {Customer_ID Customer_ID, Customer_Phone Customer_Phone, Customer_Phone2 Customer_Phone} KEY {Customer_ID};
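
A small Python sketch of how that expansion might work, assuming that when several names share one dictionary line the first name also names the type. `build_dictionary` and `expand_heading` are hypothetical helpers for illustration only.

```python
def build_dictionary(entries):
    """Map every attribute name to (type name, base type name)."""
    attribute_types = {}
    for names, base in entries:
        type_name = names[0]  # assumption: the first name on the line names the type
        for name in names:
            attribute_types[name] = (type_name, base)
    return attribute_types


def expand_heading(heading, attribute_types):
    """Turn the shorthand heading into the longhand '<identifier> <type>' list."""
    return ", ".join(f"{name} {attribute_types[name][0]}" for name in heading)


entries = [
    (["Customer_ID"], "CHAR"),
    (["Customer_Phone", "Customer_Phone2"], "INT"),
]
attribute_types = build_dictionary(entries)

longhand = expand_heading(
    ["Customer_ID", "Customer_Phone", "Customer_Phone2"], attribute_types
)
print(longhand)
# prints: Customer_ID Customer_ID, Customer_Phone Customer_Phone, Customer_Phone2 Customer_Phone
```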

 

Quote from Dave Voorhis on April 30, 2021, 1:46 pm

This is a bit too much to digest.  It would be better not to include all these shorthands to begin with (such as omitting the type on a VAR declaration to make it default to the variable name (or are we omitting the variable name and making it default to the type name?)).

Anyway, it doesn't seem anything like Tobega's idea.  Is it solving or addressing the same problem(s) as Tobega's?

I conclude from the examples that dictionary element names are independent from attributes of headings. That's okay, but you do call a dictionary a dictionary of attributes whereas VAR Invoice_Number Invoice_Number INIT(33) shows Invoice_Number being used for something other than an attribute.

Also, it seems that I can assign an integer to an Invoice_Number variable, and I can compare an Invoice_Number value with an integer, but you don't mention assigning/comparing between Invoice_Number and Amount variables and values. I believe you don't intend those to be legal. What about arithmetic operations? Can I subtract an Amount from an Invoice_Number? Can I concatenate a FirstName with a blank and a LastName?

Am I right in assuming there is no effect on RENAME and EXTEND as presently defined in TD?

Hugh
Quote from Dave Voorhis on April 30, 2021, 2:44 pm


Yet another shorthand given in advance of acceptance of the base idea.  We have to be sure that works before considering this addition.

I assume you mean that multiple element names on the same dictionary element are just synonyms.  Right?  If so, I'm reminded that synonyms sometimes give rise to problems, so I think this one might need more thought.

Hugh

Quote from Hugh on April 30, 2021, 1:12 pm
Quote from tobega on April 29, 2021, 5:22 pm
Quote from Hugh on April 29, 2021, 3:06 pm
Quote from tobega on April 28, 2021, 3:25 pm
Quote from Hugh on April 28, 2021, 2:32 pm
Quote from Hugh on April 28, 2021, 10:36 am
Quote from tobega on April 28, 2021, 6:47 am

On the subject of type system for a language capable of hosting a D (and also for Tailspin, of course), we have observed that Tuples must be structurally typed, i.e. the attributes they contain define them as the product type of those attributes.

As a counterpoint to a previous thread here, I propose that Tuples be THE way to create product types.

The latest insight (or train-wreck) that I had, is that we should let attributes define types, i.e. instead of saying that an attribute has a type, we say that an attribute is a type. I think this fits very nicely with the natural join and that we take the position that things with the same name are the same kind of things. It also fits in with a good practice to create specific types for specific things, even if in Java it is a bit of a pain to e.g. create a SupplierName class that simply wraps a String.

So we would declare that there is a type called PNAME of the base type string, and the type called SNAME of the base type string, and you just use them as attributes in the Tuples, the type and the attribute have the same name.

Obviously you cannot assign an SNAME value to a PNAME attribute without casting it. But you could e.g. have a COMPANY_NAME and have SNAME be of the type COMPANY_NAME, which would enable assigning between the two.

So, comments? Good idea? Insane idea?

I've seen other replies.  It doesn't look like a good idea to me but in any case clarification is needed.  Please give examples of type definitions for, e.g., SNAME and PNAME, preferably using TD-like syntax.  I assume you imagine a relation type definition to be like TD's but with just attribute type names as heading components: REL{SNO, SNAME, CITY

What do you think a value of an attribute type looks like.   Please give a literal denoting the supplier name Smith.

What are the implications for the relational RENAME operator?

Hugh

P.S.  Perhaps more appropriate, what about EXTEND?  In particular, I have a query that involves extension with concatenation of FirstName and LastName (with a blank in between).  How is that done?

Answering once for both of your posts.

I wouldn't necessarily change anything from the TD syntax, if that is the syntax you want, so

TUPLE { S# S#, SNAME NAME, STATUS INTEGER, CITY CHAR }
TUPLE { S# S#('S1'), SNAME NAME('Smith'), STATUS 20, CITY 'London' }
TUPLE { S# S#, SNAME NAME, STATUS INTEGER, CITY CHAR } TUPLE { S# S#('S1'), SNAME NAME('Smith'), STATUS 20, CITY 'London' }
TUPLE { S# S#, SNAME NAME, STATUS INTEGER, CITY CHAR }

TUPLE { S# S#('S1'), SNAME NAME('Smith'), STATUS 20, CITY 'London' }

But what would happen is that we would now automatically also have the types SNAME, STATUS and CITY defined, and the "real" type of the CITY attribute would be CITY, but it could be assigned values of the representation type CHAR.

I believe I answered the RENAME case in my reply to Darren.

As for EXTEND, I suppose that would work similarly in that you would have to explicitly assert the conversion to appropriate types. So FirstName is what? CHAR? And LastName might also be CHAR. So lets look at TD syntax:

EXTEND a ADD (CHAR(FirstName)+' '+CHAR(LastName) AS FullName)
EXTEND a ADD (CHAR(FirstName)+' '+CHAR(LastName) AS FullName)
EXTEND a ADD (CHAR(FirstName)+' '+CHAR(LastName) AS FullName)

If FullName has not been previously defined, there will now be a type "FullName CHAR", and any attribute FullName can be assumed to be of type FullName, without needing specification. But it wouldn't necessarily hurt to respecify "FullName CHAR". However, trying to specify e.g. "FullName INT" would not be allowed anywhere, you would have to forego that option.

Oh, I didn't see this one until after I had responded to Dave Voorhis.  So I didn't misunderstand but you have now clarified.  The question now concerns the scope of this defined-on-the-fly type Fullname.  It has to be local to the expression in which it is defined, otherwise I would object strongly.  And it is is local to the expression, I can't see much point.

Hugh

There wouldn't be much point in having it just local to the expression. I propose that the specific type definitions for each attribute that Dave provided is a "best practice" or at least a "good practice", so we can just let them be automatically defined from the attribute definition. This would lead the developer in the right direction for the price of a slight inconvenience on the rare (?) occasion when you would have wanted an attribute with the same name but of a different type.

Thank you for confirming the pointlessness of a type definition local to the expression in which it is defined.  In that case, this particular aspect of your idea bothers me greatly.

First, it gets into the dodgy are of expression evaluation having side-effects.  That alone is sufficient grounds for rejection afaiacs.

Secondly,  what happens if  "exp1 AS FullName" and "exp2 AS FullName" specifications appear in the same overall relation expression, where exp1 andexp2 re of different types?

Thirdly, what happens if "exp2 AS FullName" appears in a subsequent statement (perhaps a year or so later, if you are really thinking of global scope)?

It is unthinkable, to my mind, that somebody innocently entering an ad hoc query might invalidate somebody else's ad hoc query in this way.

Sorry if points like this have already been raised and possibly addressed.  I know there has been a lot of correspondence but my brief glances have given me the impression that your idea has received some interest and even sympathy in some quarters.  So I might be missing something.

Hugh

Thanks, Hugh, for very valid objections.

If an attribute with the same name is given a different type, wouldn't you want the compiler to warn you about this in case it is a bug waiting to happen? After all, in the natural join we take the stand that things named the same are the same, so shouldn't that apply more broadly?

In case you have a legitimate case for wanting different types, do you have a compelling need to use the same name? This kind of thing comes up in standardized XML messages where an "Address" is usually structured data with street, number, etc. Sometimes you need a free text form, then you name it something like "FreeFormAddress" or "AddressFreeForm".

You still may have a legitimate case for wanting the same name, but different type, so maybe we should allow you to do that, but with an explicit assertion that you wish to do so.

There is, of course, the concern that creating a simple attribute in an expression has a global effect of defining that name as a certain type. But, again, wouldn't the warning mostly be beneficial if you're using the same name for different things? There is perhaps a problem with truly temporary names, but we may come up with a scheme for those, maybe they can start with an underscore, for example.

While I think it likely that names should remain fairly constant within a module, I can see it might be problematic when you have code from different modules. I'm not sure yet how to solve that, but I have some ideas mulling.

As for the ad hoc query, I hadn't thought of that at all before because I'm thinking more about programs. Perhaps ad hoc queries should just proceed, but possibly with a warning? We may have a mechanism to disable the warnings locally or to create temporary names, as sketched above.

Nothing is ever completely free, so the question is if this still might be more beneficial than inconvenient?

Quote from Hugh on May 1, 2021, 3:12 pm
Quote from Dave Voorhis on April 30, 2021, 1:46 pm
Quote from Hugh on April 30, 2021, 1:12 pm
Quote from tobega on April 29, 2021, 5:22 pm
Quote from Hugh on April 29, 2021, 3:06 pm
Quote from tobega on April 28, 2021, 3:25 pm
Quote from Hugh on April 28, 2021, 2:32 pm
Quote from Hugh on April 28, 2021, 10:36 am
Quote from tobega on April 28, 2021, 6:47 am

On the subject of type system for a language capable of hosting a D (and also for Tailspin, of course), we have observed that Tuples must be structurally typed, i.e. the attributes they contain define them as the product type of those attributes.

As a counterpoint to a previous thread here, I propose that Tuples be THE way to create product types.

The latest insight (or train-wreck) that I had, is that we should let attributes define types, i.e. instead of saying that an attribute has a type, we say that an attribute is a type. I think this fits very nicely with the natural join and that we take the position that things with the same name are the same kind of things. It also fits in with a good practice to create specific types for specific things, even if in Java it is a bit of a pain to e.g. create a SupplierName class that simply wraps a String.

So we would declare that there is a type called PNAME of the base type string, and the type called SNAME of the base type string, and you just use them as attributes in the Tuples, the type and the attribute have the same name.

Obviously you cannot assign an SNAME value to a PNAME attribute without casting it. But you could e.g. have a COMPANY_NAME and have SNAME be of the type COMPANY_NAME, which would enable assigning between the two.

So, comments? Good idea? Insane idea?

I've seen other replies.  It doesn't look like a good idea to me but in any case clarification is needed.  Please give examples of type definitions for, e.g., SNAME and PNAME, preferably using TD-like syntax.  I assume you imagine a relation type definition to be like TD's but with just attribute type names as heading components: REL{SNO, SNAME, CITY

What do you think a value of an attribute type looks like?  Please give a literal denoting the supplier name Smith.

What are the implications for the relational RENAME operator?

Hugh

P.S.  Perhaps more appropriate, what about EXTEND?  In particular, I have a query that involves extension with concatenation of FirstName and LastName (with a blank in between).  How is that done?

Answering once for both of your posts.

I wouldn't necessarily change anything from the TD syntax, if that is the syntax you want, so

TUPLE { S# S#, SNAME NAME, STATUS INTEGER, CITY CHAR }
TUPLE { S# S#('S1'), SNAME NAME('Smith'), STATUS 20, CITY 'London' }

But what would happen is that we would now automatically also have the types SNAME, STATUS and CITY defined, and the "real" type of the CITY attribute would be CITY, but it could be assigned values of the representation type CHAR.

I believe I answered the RENAME case in my reply to Darren.

As for EXTEND, I suppose that would work similarly in that you would have to explicitly assert the conversion to appropriate types. So FirstName is what? CHAR? And LastName might also be CHAR. So let's look at TD syntax:

EXTEND a ADD (CHAR(FirstName)+' '+CHAR(LastName) AS FullName)

If FullName has not been previously defined, there will now be a type "FullName CHAR", and any attribute FullName can be assumed to be of type FullName, without needing specification. But it wouldn't necessarily hurt to respecify "FullName CHAR". However, trying to specify e.g. "FullName INT" would not be allowed anywhere; you would have to forgo that option.

Oh, I didn't see this one until after I had responded to Dave Voorhis.  So I didn't misunderstand but you have now clarified.  The question now concerns the scope of this defined-on-the-fly type FullName.  It has to be local to the expression in which it is defined, otherwise I would object strongly.  And if it is local to the expression, I can't see much point.

Hugh

There wouldn't be much point in having it just local to the expression. I propose that the specific type definitions for each attribute that Dave provided are a "best practice" or at least a "good practice", so we can just let them be automatically defined from the attribute definition. This would lead the developer in the right direction for the price of a slight inconvenience on the rare (?) occasion when you would have wanted an attribute with the same name but of a different type.

Thank you for confirming the pointlessness of a type definition local to the expression in which it is defined.  In that case, this particular aspect of your idea bothers me greatly.

First, it gets into the dodgy area of expression evaluation having side-effects.  That alone is sufficient grounds for rejection afaiacs.

Secondly, what happens if "exp1 AS FullName" and "exp2 AS FullName" specifications appear in the same overall relation expression, where exp1 and exp2 are of different types?

Thirdly, what happens if "exp2 AS FullName" appears in a subsequent statement (perhaps a year or so later, if you are really thinking of global scope)?

It is unthinkable, to my mind, that somebody innocently entering an ad hoc query might invalidate somebody else's ad hoc query in this way.

Sorry if points like this have already been raised and possibly addressed.  I know there has been a lot of correspondence but my brief glances have given me the impression that your idea has received some interest and even sympathy in some quarters.  So I might be missing something.

Hugh

Would something like this be more palatable?

DICTIONARY;
  Customer_ID CHAR;
  Customer_Phone INT;
  Invoice_Number INT;
  Invoice_Date Date;
  Amount RATIONAL;
END DICTIONARY;

VAR Customers REAL RELATION {Customer_ID, Customer_Phone} KEY {Customer_ID};
VAR Invoices REAL RELATION {Invoice_Number, Invoice_Date, Customer_ID, Amount} KEY {Invoice_Number};

CONSTRAINT Invoices_Customers_FK  Invoices {Customer_ID} ⊆ Customers {Customer_ID};

The effect of the DICTIONARY block is to define a data dictionary of attributes, which implicitly creates the following types...

TYPE Customer_ID POSSREP {Value CHAR};
TYPE Customer_Phone POSSREP {Value INT};
TYPE Invoice_Number POSSREP {Value INT};
TYPE Invoice_Date POSSREP {Value Date};
TYPE Amount POSSREP {Value RATIONAL};

...and allows declarations like

VAR Invoices REAL RELATION {Invoice_Number, Invoice_Date, Customer_ID, Amount} KEY {Invoice_Number};

...to be shorthand for:

VAR Invoices REAL RELATION {Invoice_Number Invoice_Number, Invoice_Date Invoice_Date, Customer_ID Customer_ID, Amount Amount} KEY {Invoice_Number};

This allows the compiler to check that we're not going to accidentally multiply a Customer_Phone by an Invoice_Number, etc.
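
As a rough analogy only, the effect of wrapping each dictionary entry in its own type can be sketched in Python. This is an assumption-laden illustration: the class names `CustomerPhone` and `InvoiceNumber` and the helper `multiply_allowed` are hypothetical, and Python detects the mismatch at run time, whereas the point made above is that a Tutorial D compiler could reject it at compile time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerPhone:
    """Wrapper loosely analogous to TYPE Customer_Phone POSSREP {Value INT}."""
    value: int


@dataclass(frozen=True)
class InvoiceNumber:
    """Wrapper loosely analogous to TYPE Invoice_Number POSSREP {Value INT}."""
    value: int


def multiply_allowed(a, b):
    """Report whether a * b is defined for the given operands."""
    try:
        a * b
        return True
    except TypeError:
        # The wrappers define no arithmetic, so mixing them fails.
        return False


phone = CustomerPhone(5551234)
invoice = InvoiceNumber(33)

mixing_ok = multiply_allowed(phone, invoice)   # no * between wrappers
same_value = InvoiceNumber(33) == invoice      # same type, same value
cross_type = CustomerPhone(33) == invoice      # different types never compare equal
```

Although both wrappers hold an integer, nothing defined on the underlying integer is available on the wrapper unless it is added deliberately, which is the kind of safety the DICTIONARY-generated types are meant to give.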

Ideally, any declaration of the form <identifier> <type> should be replaceable with <dictionary_name>, e.g. this...

VAR Invoice_Number INIT(33);

...is the same as this:

VAR Invoice_Number Invoice_Number INIT(33);

I imagine it would be an entirely optional feature; no existing Tutorial D code would be broken by adding the facility, and anyone who doesn't want to use DICTIONARY and the <dictionary_name> shorthands for <identifier> <type> could ignore them.

This is a bit too much to digest.  It would be better not to include all these shorthands to begin with (such as omitting the type on a VAR declaration to make it default to the variable name (or are we omitting the variable name and making it default to the type name?)).

Actually, the whole reason for having the DICTIONARY ... END DICTIONARY declaration and being able to use <name> <type> --> <dictionary name> shorthands is because the shorthands are desirable. Of course, you can do everything in Tutorial D without the DICTIONARY ... END DICTIONARY declaration or being able to use <name> <type> --> <dictionary name> shorthands...

Or, the <name> <type> --> <dictionary name> shorthands could be removed and just have DICTIONARY, but it would mean a lot of unnecessarily repetitive declarations like:

VAR Invoices REAL RELATION {Invoice_Number Invoice_Number, Invoice_Date Invoice_Date, Customer_ID Customer_ID, Amount Amount} KEY {Invoice_Number};

Anyway, it doesn't seem anything like Tobega's idea.  Is it solving or addressing the same problem(s) as Tobega's?

It is notionally addressing the same problem as Tobega's, but taking a different approach that also addresses a desire for data dictionaries, mentioned elsewhere in this thread.

I conclude from the examples that dictionary element names are independent from attributes of headings.  That's okay, but you do call a dictionary a dictionary of attributes whereas VAR Invoice_Number Invoice_Number INIT(33) shows Invoice_Number being used for something other than an attribute.

Dictionary element names define identifier type pairs. They could be restricted to attributes of headings, but it might seem arbitrarily restrictive to allow a dictionary name to be used in place of identifier type in some places but not others.

Also, it seems that I can assign an integer to an Invoice_Number variable, and I can compare an Invoice_Number value with an integer, but you don't mention assigning/comparing between Invoice_Number and Amount variables and values.  I believe you don't intend those to be legal.  What about arithmetic operations?  Can I subtract an Amount from an Invoice_Number?  Can I concatenate a FirstName with a blank and a LastName?

Assigning an integer to an Invoice_Number variable was a careless mistake. It should have been...

VAR Invoice_Number INIT(Invoice_Number(33));

...which is equivalent to:

VAR Invoice_Number Invoice_Number INIT(Invoice_Number(33));

Am I right in assuming there is no effect on RENAME and EXTEND as presently defined in TD?

It has no effect on RENAME, EXTEND, or anything else, except where <name> <type> can currently be used to declare a variable, parameter, or heading attribute (have I missed anything?) you could also use <dictionary_name> instead of <name> <type> -- assuming there is a dictionary entry of the form <dictionary_name> <type>, which automatically creates TYPE <dictionary_name> POSSREP {Value <type>} -- so that using <dictionary_name> in a variable, parameter, or heading attribute declaration instead of <name> <type> is shorthand for <dictionary_name> <dictionary_name>.

E.g., given:

DICTIONARY;
  X CHAR;
END DICTIONARY;

The following will be implicitly created:

TYPE X POSSREP {Value CHAR}

[Addendum: I wonder if it might be useful to be able to optionally specify in the DICTIONARY section those elements that are not to be wrapped and are to be defined as the specified type. E.g., something like 'Customer_ID INT UNWRAP' means that rather than automatically creating TYPE Customer_ID POSSREP {Value INT} and declaring attributes/variables/parameters named Customer_ID as type Customer_ID, they'd be declared to be type INT.]

And a declaration like...

VAR X;

...is shorthand for:

VAR X X;

In short, it provides the "classic" features of a data dictionary, with added type safety.
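
The expansion described in this post is mechanical, so it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary is modelled as a plain mapping from element name to representation type, and the function names `implied_types` and `expand_heading` are hypothetical, not part of Tutorial D or Rel.

```python
# Dictionary entries from the example above, modelled as name -> type.
DICTIONARY = {
    "Customer_ID": "CHAR",
    "Customer_Phone": "INT",
    "Invoice_Number": "INT",
    "Invoice_Date": "Date",
    "Amount": "RATIONAL",
}


def implied_types(dictionary):
    """Return the TYPE definitions a DICTIONARY block would imply."""
    return [
        f"TYPE {name} POSSREP {{Value {rep}}};"
        for name, rep in dictionary.items()
    ]


def expand_heading(names, dictionary):
    """Expand the shorthand {A, B} into the full form {A A, B B}."""
    missing = [n for n in names if n not in dictionary]
    if missing:
        raise KeyError(f"not in dictionary: {missing}")
    return "{" + ", ".join(f"{n} {n}" for n in names) + "}"


heading = expand_heading(
    ["Invoice_Number", "Invoice_Date", "Customer_ID", "Amount"], DICTIONARY
)
types = implied_types({"X": "CHAR"})
```

Here `heading` holds the long-form heading from the Invoices declaration and `types` holds the single implied definition for X. A name that has no dictionary entry is rejected rather than guessed at.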

Quote from Hugh on May 1, 2021, 3:16 pm
Quote from Dave Voorhis on April 30, 2021, 2:44 pm

Thinking a bit more on 'DICTIONARY', etc., maybe it would be desirable to be able to do this:

DICTIONARY;
  Customer_ID CHAR;
  Customer_Phone, Customer_Phone2 INT;
  Invoice_Number INT;
  Invoice_Date Date;
  Amount RATIONAL;
END DICTIONARY;

So you can say this...

VAR Customers REAL RELATION {Customer_ID, Customer_Phone, Customer_Phone2} KEY {Customer_ID};

...which is shorthand for:

VAR Customers REAL RELATION {Customer_ID Customer_ID, Customer_Phone Customer_Phone, Customer_Phone2 Customer_Phone} KEY {Customer_ID};

Yet another shorthand given in advance of acceptance of the base idea.  We have to be sure that works before considering this addition.

I assume you mean that multiple element names on the same dictionary element are just synonyms.  Right?  If so, I'm reminded that synonyms sometimes give rise to problems, so I think this one might need more thought.

If the DICTIONARY idea is acceptable, it's arguably necessary rather than being an (optional?) addition.

I would definitely not describe multiple element names on the same dictionary element as synonyms. Multiple element names on the same dictionary element define distinct dictionary elements of the same type.

It allows you to declare multiple attributes in a relvar to have the same type using DICTIONARY entries to specify them, which you otherwise couldn't do using DICTIONARY entries.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from tobega on May 1, 2021, 3:34 pm
Quote from Hugh on April 30, 2021, 1:12 pm
Quote from tobega on April 29, 2021, 5:22 pm
Quote from Hugh on April 29, 2021, 3:06 pm
Quote from tobega on April 28, 2021, 3:25 pm
Quote from Hugh on April 28, 2021, 2:32 pm
Quote from Hugh on April 28, 2021, 10:36 am
Quote from tobega on April 28, 2021, 6:47 am

On the subject of type system for a language capable of hosting a D (and also for Tailspin, of course), we have observed that Tuples must be structurally typed, i.e. the attributes they contain define them as the product type of those attributes.

As a counterpoint to a previous thread here, I propose that Tuples be THE way to create product types.

The latest insight (or train-wreck) that I had, is that we should let attributes define types, i.e. instead of saying that an attribute has a type, we say that an attribute is a type. I think this fits very nicely with the natural join and that we take the position that things with the same name are the same kind of things. It also fits in with a good practice to create specific types for specific things, even if in Java it is a bit of a pain to e.g. create a SupplierName class that simply wraps a String.

So we would declare that there is a type called PNAME of the base type string, and the type called SNAME of the base type string, and you just use them as attributes in the Tuples, the type and the attribute have the same name.

Obviously you cannot assign an SNAME value to a PNAME attribute without casting it. But you could e.g. have a COMPANY_NAME and have SNAME be of the type COMPANY_NAME, which would enable assigning between the two.

So, comments? Good idea? Insane idea?

I've seen other replies.  It doesn't look like a good idea to me but in any case clarification is needed.  Please give examples of type definitions for, e.g., SNAME and PNAME, preferably using TD-like syntax.  I assume you imagine a relation type definition to be like TD's but with just attribute type names as heading components: REL{SNO, SNAME, CITY

What do you think a value of an attribute type looks like?  Please give a literal denoting the supplier name Smith.

What are the implications for the relational RENAME operator?

Hugh

P.S.  Perhaps more appropriate, what about EXTEND?  In particular, I have a query that involves extension with concatenation of FirstName and LastName (with a blank in between).  How is that done?

Answering once for both of your posts.

I wouldn't necessarily change anything from the TD syntax, if that is the syntax you want, so

TUPLE { S# S#, SNAME NAME, STATUS INTEGER, CITY CHAR }
TUPLE { S# S#('S1'), SNAME NAME('Smith'), STATUS 20, CITY 'London' }

But what would happen is that we would now automatically also have the types SNAME, STATUS and CITY defined, and the "real" type of the CITY attribute would be CITY, but it could be assigned values of the representation type CHAR.

I believe I answered the RENAME case in my reply to Darren.

As for EXTEND, I suppose that would work similarly in that you would have to explicitly assert the conversion to appropriate types. So FirstName is what? CHAR? And LastName might also be CHAR. So let's look at TD syntax:

EXTEND a ADD (CHAR(FirstName)+' '+CHAR(LastName) AS FullName)

If FullName has not been previously defined, there will now be a type "FullName CHAR", and any attribute FullName can be assumed to be of type FullName, without needing specification. But it wouldn't necessarily hurt to respecify "FullName CHAR". However, trying to specify e.g. "FullName INT" would not be allowed anywhere; you would have to forgo that option.

Oh, I didn't see this one until after I had responded to Dave Voorhis.  So I didn't misunderstand but you have now clarified.  The question now concerns the scope of this defined-on-the-fly type FullName.  It has to be local to the expression in which it is defined, otherwise I would object strongly.  And if it is local to the expression, I can't see much point.

Hugh

There wouldn't be much point in having it just local to the expression. I propose that the specific type definitions for each attribute that Dave provided are a "best practice" or at least a "good practice", so we can just let them be automatically defined from the attribute definition. This would lead the developer in the right direction for the price of a slight inconvenience on the rare (?) occasion when you would have wanted an attribute with the same name but of a different type.

Thank you for confirming the pointlessness of a type definition local to the expression in which it is defined.  In that case, this particular aspect of your idea bothers me greatly.

First, it gets into the dodgy area of expression evaluation having side-effects.  That alone is sufficient grounds for rejection afaiacs.

Secondly, what happens if "exp1 AS FullName" and "exp2 AS FullName" specifications appear in the same overall relation expression, where exp1 and exp2 are of different types?

Thirdly, what happens if "exp2 AS FullName" appears in a subsequent statement (perhaps a year or so later, if you are really thinking of global scope)?

It is unthinkable, to my mind, that somebody innocently entering an ad hoc query might invalidate somebody else's ad hoc query in this way.

Sorry if points like this have already been raised and possibly addressed.  I know there has been a lot of correspondence but my brief glances have given me the impression that your idea has received some interest and even sympathy in some quarters.  So I might be missing something.

Hugh

Thanks, Hugh, for very valid objections.

If an attribute with the same name is given a different type, wouldn't you want the compiler to warn you about this in case it is a bug waiting to happen? After all, in the natural join we take the stand that things named the same are the same, so shouldn't that apply more broadly?

HD: A warning is something the compiler gives when it can complete the operation but sees a need to alert the user to a possible oversight.  In this case it surely has to be an out-and-out error.

In case you have a legitimate case for wanting different types, do you have a compelling need to use the same name? This kind of thing comes up in standardized XML messages where an "Address" is usually structured data with street, number, etc. Sometimes you need a free text form, then you name it something like "FreeFormAddress" or "AddressFreeForm".

HD: Whether the need is compelling or not depends on the circumstances.  In any case my putative ad hoc query user is going to be very annoyed at having to think of a name that has not already been used.  That will involve a catalog query to see all the type names, considering that all users of the database might have given all sorts of queries, ad hoc or planned, over  the years, generating a plethora of type names.

You still may have a legitimate case for wanting the same name, but different type, so maybe we should allow you to do that, but with an explicit assertion that you wish to do so.

HD: I've not come across the idea of overloading type names before.  I can't immediately make sense of the idea.

There is, of course, the concern that creating a simple attribute in an expression has a global effect of defining that name as a certain type. But, again, wouldn't the warning mostly be beneficial if you're using the same name for different things? There is perhaps a problem with truly temporary names, but we may come up with a scheme for those, maybe they can start with an underscore, for example.

HD: No further comment.

While I think it likely that names should remain fairly constant within a module, I can see it might be problematic when you have code from different modules. I'm not sure yet how to solve that, but I have some ideas mulling.

HD: Sorry, but I think you are wasting your time.  Your idea is a nonstarter imo.   In all programming languages I'm aware of the type system supports what is called type inference.  The type of an expression and the value it denotes is the type of the final operator in that expression.

As for the ad hoc query, I hadn't thought of that at all before because I'm thinking more about programs. Perhaps ad hoc queries should just proceed, but possibly with a warning? We may have a mechanism to disable the warnings locally or to create temporary names, as sketched above.

HD: No further comment.

Nothing is ever completely free, so the question is if this still might be more beneficial than inconvenient?

HD: I can't evaluate "more beneficial" because I still don't have a 100% clear understanding of the problem you are seeking to solve.  I imagine you are trying to avoid inappropriate use of operators, such as taking the average of a set of part numbers or comparing part numbers with supplier numbers, but I don't know what operators you imagine to be defined for your attribute types, nor do I understand how such types are used outside of headings.  Sorry if answers lie in previous correspondence under a different rubric that I wasn't following.

Responses given in-line above.

Hugh

Coauthor of The Third Manifesto and related books.
Quote from Dave Voorhis on May 1, 2021, 7:17 pm
Quote from Hugh on May 1, 2021, 3:12 pm
Quote from Dave Voorhis on April 30, 2021, 1:46 pm
Quote from Hugh on April 30, 2021, 1:12 pm
Quote from tobega on April 29, 2021, 5:22 pm
Quote from Hugh on April 29, 2021, 3:06 pm
Quote from tobega on April 28, 2021, 3:25 pm
Quote from Hugh on April 28, 2021, 2:32 pm
Quote from Hugh on April 28, 2021, 10:36 am
Quote from tobega on April 28, 2021, 6:47 am

On the subject of type system for a language capable of hosting a D (and also for Tailspin, of course), we have observed that Tuples must be structurally typed, i.e. the attributes they contain define them as the product type of those attributes.

As a counterpoint to a previous thread here, I propose that Tuples be THE way to create product types.

The latest insight (or train-wreck) that I had, is that we should let attributes define types, i.e. instead of saying that an attribute has a type, we say that an attribute is a type. I think this fits very nicely with the natural join and that we take the position that things with the same name are the same kind of things. It also fits in with a good practice to create specific types for specific things, even if in Java it is a bit of a pain to e.g. create a SupplierName class that simply wraps a String.

So we would declare that there is a type called PNAME of the base type string, and the type called SNAME of the base type string, and you just use them as attributes in the Tuples, the type and the attribute have the same name.

Obviously you cannot assign an SNAME value to a PNAME attribute without casting it. But you could e.g. have a COMPANY_NAME and have SNAME be of the type COMPANY_NAME, which would enable assigning between the two.

So, comments? Good idea? Insane idea?

I've seen other replies.  It doesn't look like a good idea to me but in any case clarification is needed.  Please give examples of type definitions for, e.g., SNAME and PNAME, preferably using TD-like syntax.  I assume you imagine a relation type definition to be like TD's but with just attribute type names as heading components: REL{SNO, SNAME, CITY

What do you think a value of an attribute type looks like?  Please give a literal denoting the supplier name Smith.

What are the implications for the relational RENAME operator?

Hugh

P.S.  Perhaps more appropriate, what about EXTEND?  In particular, I have a query that involves extension with concatenation of FirstName and LastName (with a blank in between).  How is that done?

Answering once for both of your posts.

I wouldn't necessarily change anything from the TD syntax, if that is the syntax you want, so

TUPLE { S# S#, SNAME NAME, STATUS INTEGER, CITY CHAR }
TUPLE { S# S#('S1'), SNAME NAME('Smith'), STATUS 20, CITY 'London' }

But what would happen is that we would now automatically also have the types SNAME, STATUS and CITY defined, and the "real" type of the CITY attribute would be CITY, but it could be assigned values of the representation type CHAR.

I believe I answered the RENAME case in my reply to Darren.

As for EXTEND, I suppose that would work similarly in that you would have to explicitly assert the conversion to appropriate types. So FirstName is what? CHAR? And LastName might also be CHAR. So let's look at TD syntax:

EXTEND a ADD (CHAR(FirstName)+' '+CHAR(LastName) AS FullName)

If FullName has not been previously defined, there will now be a type "FullName CHAR", and any attribute FullName can be assumed to be of type FullName, without needing specification. But it wouldn't necessarily hurt to respecify "FullName CHAR". However, trying to specify e.g. "FullName INT" would not be allowed anywhere; you would have to forgo that option.

Oh, I didn't see this one until after I had responded to Dave Voorhis.  So I didn't misunderstand but you have now clarified.  The question now concerns the scope of this defined-on-the-fly type FullName.  It has to be local to the expression in which it is defined, otherwise I would object strongly.  And if it is local to the expression, I can't see much point.

Hugh

There wouldn't be much point in having it just local to the expression. I propose that the specific type definitions for each attribute that Dave provided is a "best practice" or at least a "good practice", so we can just let them be automatically defined from the attribute definition. This would lead the developer in the right direction for the price of a slight inconvenience on the rare (?) occasion when you would have wanted an attribute with the same name but of a different type.

Thank you for confirming the pointlessness of a type definition local to the expression in which it is defined.  In that case, this particular aspect of your idea bothers me greatly.

First, it gets into the dodgy area of expression evaluation having side-effects.  That alone is sufficient grounds for rejection afaiacs.

Secondly, what happens if "exp1 AS FullName" and "exp2 AS FullName" specifications appear in the same overall relation expression, where exp1 and exp2 are of different types?

Thirdly, what happens if "exp2 AS FullName" appears in a subsequent statement (perhaps a year or so later, if you are really thinking of global scope)?

It is unthinkable, to my mind, that somebody innocently entering an ad hoc query might invalidate somebody else's ad hoc query in this way.
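
To make the scoping objection concrete, here is a small Python sketch (an analogy only, not Tutorial D) of what a globally scoped, defined-on-first-use attribute type would imply. The registry `ATTRIBUTE_TYPES` and the function `bind_attribute` are hypothetical names invented for this illustration; they model "exp AS FullName" registering a representation type for FullName the first time it is seen.

```python
# Hypothetical model: an attribute name becomes a global type on first use.
ATTRIBUTE_TYPES: dict[str, type] = {}


def bind_attribute(name: str, value: object) -> object:
    """Model 'exp AS name': register the type on first use, enforce it afterwards."""
    declared = ATTRIBUTE_TYPES.setdefault(name, type(value))
    if type(value) is not declared:
        raise TypeError(
            f"{name} was already defined over {declared.__name__}, "
            f"got {type(value).__name__}"
        )
    return value


# First ad hoc query: FullName silently becomes a global type over str.
bind_attribute("FullName", "Ada" + " " + "Lovelace")

# A later, unrelated query that wants FullName over another type now fails.
try:
    bind_attribute("FullName", 42)
    conflict = None
except TypeError as exc:
    conflict = str(exc)
```

In this model the second query is rejected purely because of a side effect left behind by the first, which is the situation the objection describes.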

Sorry if points like this have already been raised and possibly addressed.  I know there has been a lot of correspondence but my brief glances have given me the impression that your idea has received some interest and even sympathy in some quarters.  So I might be missing something.

Hugh

Hugh

Would something like this be more palatable?

DICTIONARY;
  Customer_ID CHAR;
  Customer_Phone INT;
  Invoice_Number INT;
  Invoice_Date Date;
  Amount RATIONAL;
END DICTIONARY;

VAR Customers REAL RELATION {Customer_ID, Customer_Phone} KEY {Customer_ID};
VAR Invoices REAL RELATION {Invoice_Number, Invoice_Date, Customer_ID, Amount} KEY {Invoice_Number};

CONSTRAINT Invoices_Customers_FK  Invoices {Customer_ID} ⊆ Customers {Customer_ID};

The effect of the DICTIONARY block is to define a data dictionary of attributes, which implicitly creates the following types...

TYPE Customer_ID POSSREP {Value CHAR};
TYPE Customer_Phone POSSREP {Value INT};
TYPE Invoice_Number POSSREP {Value INT};
TYPE Invoice_Date POSSREP {Value Date};
TYPE Amount POSSREP {Value RATIONAL};

...and allows declarations like

VAR Invoices REAL RELATION {Invoice_Number, Invoice_Date, Customer_ID, Amount} KEY {Invoice_Number};

...to be shorthand for:

VAR Invoices REAL RELATION {Invoice_Number Invoice_Number, Invoice_Date Invoice_Date, Customer_ID Customer_ID, Amount Amount} KEY {Invoice_Number};

This allows the compiler to check that we're not going to accidentally multiply a Customer_Phone by an Invoice_Number, etc.
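
As a rough analogy for readers more used to general-purpose languages, the Python sketch below models a dictionary whose entries each produce a distinct wrapper type with a single component. The helper `define_dictionary` is a hypothetical name, not part of any proposal in this thread, and Python only detects the error at run time, whereas the point above is about a compile-time check.

```python
from dataclasses import make_dataclass


def define_dictionary(entries):
    """Hypothetical helper: build one distinct wrapper type per dictionary entry."""
    return {
        # Each wrapper has a single component 'value', mirroring POSSREP {Value <type>}.
        name: make_dataclass(name, [("value", base)], frozen=True)
        for name, base in entries.items()
    }


types = define_dictionary({"Customer_Phone": int, "Invoice_Number": int})
Customer_Phone = types["Customer_Phone"]
Invoice_Number = types["Invoice_Number"]

phone = Customer_Phone(5551234)
invoice = Invoice_Number(33)

# The wrappers define no arithmetic, so mixing them is rejected.
try:
    phone * invoice
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Unwrapping is explicit, in the spirit of THE_Value(...).
unwrapped_product = phone.value * invoice.value
```

Two wrappers over the same base type also never compare equal to each other here, so a Customer_Phone holding 33 is not the same value as an Invoice_Number holding 33.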

Ideally, any declaration of the form <identifier> <type> should be replaceable with <dictionary_name>, e.g. this...

VAR Invoice_Number INIT(33);

...is the same as this:

VAR Invoice_Number Invoice_Number INIT(33);

I imagine it would be an entirely optional feature; no existing Tutorial D code would be broken by adding the facility, and anyone who doesn't want to use DICTIONARY and the <dictionary_name> shorthands for <identifier> <type> could ignore them.

This is a bit too much to digest.  It would be better not to include all these shorthands to begin with (such as omitting the type on a VAR declaration to make it default to the variable name (or are we omitting the variable name and making it default to the type name?)).

Actually, the whole reason for having the DICTIONARY ... END DICTIONARY declaration and being able to use <name> <type> --> <dictionary name> shorthands is because the shorthands are desirable. Of course, you can do everything in Tutorial D without the DICTIONARY ... END DICTIONARY declaration or being able to use <name> <type> --> <dictionary name> shorthands...

Or, the <name> <type> --> <dictionary name> shorthands could be removed and just have DICTIONARY, but it would mean a lot of unnecessarily repetitive declarations like:

VAR Invoices REAL RELATION {Invoice_Number Invoice_Number, Invoice_Date Invoice_Date, Customer_ID Customer_ID, Amount Amount} KEY {Invoice_Number};

Anyway, it doesn't seem anything like Tobega's idea.  Is it solving or addressing the same problem(s) as Tobega's?

It is notionally addressing the same problem as Tobega's, but taking a different approach that also addresses a desire for data dictionaries, mentioned elsewhere in this thread.

I conclude from the examples that dictionary element names are independent from attributes of headings.  That's okay, but you do call a dictionary a dictionary of attributes whereas VAR Invoice_Number Invoice_Number INIT(33) shows Invoice_Number being used for something other than an attribute.

Dictionary element names define identifier type pairs. They could be restricted to attributes of headings, but it might seem arbitrarily restrictive to allow a dictionary name to be used in place of identifier type in some places but not others.

Also, it seems that I can assign an integer to an Invoice_Number variable, and I can compare an Invoice_Number value with an integer, but you don't mention assigning/comparing between Invoice_Number and Amount variables and values.  I believe you don't intend those to be legal.  What about arithmetic operations?  Can I subtract an Amount from an Invoice_Number?  Can I concatenate a FirstName with a blank and a LastName?

Assigning an integer to an Invoice_Number variable was a careless mistake. It should have been...

VAR Invoice_Number INIT(Invoice_Number(33));

...which is equivalent to:

VAR Invoice_Number Invoice_Number INIT(Invoice_Number(33));

Am I right in assuming there is no effect on RENAME and EXTEND as presently defined in TD?

It has no effect on RENAME, EXTEND, or anything else, except where <name> <type> can currently be used to declare a variable, parameter, or heading attribute (have I missed anything?) you could also use <dictionary_name> instead of <name> <type> -- assuming there is a dictionary entry of the form <dictionary_name> <type>, which automatically creates TYPE <dictionary_name> POSSREP {Value <type>} -- so that using <dictionary_name> in a variable, parameter, or heading attribute declaration instead of <name> <type> is shorthand for <dictionary_name> <dictionary_name>.

E.g., given:

DICTIONARY;
  X CHAR;
END DICTIONARY;

The following will be implicitly created:

TYPE X POSSREP {Value CHAR}

[Addendum: I wonder if it might be useful to be able to optionally specify in the DICTIONARY section those elements that are not to be wrapped and are to be defined as the specified type. E.g., something like 'Customer_ID INT UNWRAP' means that rather than automatically creating TYPE Customer_ID POSSREP {Value INT} and declaring attributes/variables/parameters named Customer_ID as type Customer_ID, they'd be declared to be type INT.]

And a declaration like...

VAR X;

...is shorthand for:

VAR X X;

In short, it provides the "classic" features of a data dictionary, with added type safety.

Quote from Hugh on May 1, 2021, 3:16 pm
Quote from Dave Voorhis on April 30, 2021, 2:44 pm

Thinking a bit more on 'DICTIONARY', etc., maybe it would be desirable to be able to do this:

DICTIONARY;
  Customer_ID CHAR;
  Customer_Phone, Customer_Phone2 INT;
  Invoice_Number INT;
  Invoice_Date Date;
  Amount RATIONAL;
END DICTIONARY;

So you can say this...

VAR Customers REAL RELATION {Customer_ID, Customer_Phone, Customer_Phone2} KEY {Customer_ID};

...which is shorthand for:

VAR Customers REAL RELATION {Customer_ID Customer_ID, Customer_Phone Customer_Phone, Customer_Phone2 Customer_Phone} KEY {Customer_ID};

Yet another shorthand given in advance of acceptance of the base idea.  We have to be sure that works before considering this addition.

I assume you mean that multiple element names on the same dictionary element are just synonyms.  Right?  If so, I'm reminded that synonyms sometimes give rise to problems, so I think this one might need more thought.

If the DICTIONARY idea is acceptable, it's arguably necessary rather than being an (optional?) addition.

I would definitely not describe multiple element names on the same dictionary element as synonyms. Multiple element names on the same dictionary element define distinct dictionary elements of the same type.

It allows you to declare multiple attributes in a relvar to have the same type using DICTIONARY entries to specify them, which you otherwise couldn't do using DICTIONARY entries.

I misunderstood your use of multiple element names on the same dictionary element because your example gives two distinct dictionary elements using the same underlying type CHAR.

I wouldn't object to any genuine shorthand, though I might have an opinion on its value.  It's difficult for me to judge the wisdom of this one because (a) I don't have a full understanding of the problem it seeks to address (saying it's the same as Tobega's doesn't help me), and (b) the extent to which you are addressing it isn't clear either.  You provide a shorthand for defining types that have a single possrep with a single possrep component, without defining any operators in addition to those systematically implied  by the type definition (such as THE_Value(Customer_Id)).  That tells me that solutions for the perceived problem are already available in TD as defined.  Do dictionaries offer any additional advantages?

Regarding the perceived problem, is it really just concerned with ill-advised comparisons, especially those that can arise "by accident", being implicitly involved in operations such as join?  You might want to outlaw taking averages of part numbers too, but if anybody really wants to go out of their way to do that they probably do have some good reason!

Hugh

Coauthor of The Third Manifesto and related books.
Quote from Hugh on May 3, 2021, 2:25 pm
Quote from Dave Voorhis on May 1, 2021, 7:17 pm
Quote from Hugh on May 1, 2021, 3:12 pm
Quote from Dave Voorhis on April 30, 2021, 1:46 pm
Quote from Hugh on April 30, 2021, 1:12 pm
Quote from tobega on April 29, 2021, 5:22 pm
Quote from Hugh on April 29, 2021, 3:06 pm
Quote from tobega on April 28, 2021, 3:25 pm
Quote from Hugh on April 28, 2021, 2:32 pm
Quote from Hugh on April 28, 2021, 10:36 am
Quote from tobega on April 28, 2021, 6:47 am

I misunderstood your use of multiple elements names on the same dictionary element because your example gives two distinct dictionary elements using the same underlying type CHAR.

Given a DICTIONARY like...

DICTIONARY;
  Customer_ID, Customer_ID2 INT;
  Invoice_Number INT;
END DICTIONARY;

...it specifies that Customer_ID and Customer_ID2 are both of type Customer_ID which wraps an INTEGER, and are type compatible.

Customer_ID and Invoice_Number are distinct types, both of which wrap an INTEGER. They are not type compatible.
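
The rule just stated can be sketched in Python as follows. This is only an analogy under stated assumptions: the `entries` list, the `element_type` mapping, and the `type_compatible` function are hypothetical names for illustration, and the sketch assumes the first name in a multi-name entry also names the wrapper type, as in the Customer_ID, Customer_ID2 example.

```python
from dataclasses import make_dataclass

# Hypothetical model of:  Customer_ID, Customer_ID2 INT;  Invoice_Number INT;
# Each entry lists its element names; the first name also names the wrapper type.
entries = [
    (("Customer_ID", "Customer_ID2"), int),
    (("Invoice_Number",), int),
]

element_type = {}
for names, base in entries:
    wrapper = make_dataclass(names[0], [("value", base)], frozen=True)
    for name in names:
        element_type[name] = wrapper  # distinct elements, one shared type


def type_compatible(a: str, b: str) -> bool:
    """Two dictionary elements are compatible only if they share a wrapper type."""
    return element_type[a] is element_type[b]
```

Under this model Customer_ID and Customer_ID2 are separate dictionary elements that share one type, while Invoice_Number gets its own type even though it wraps the same base type.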

I wouldn't object to any genuine shorthand, though I might have an opinion on its value.  It's difficult for me to judge the wisdom of this one because (a) I don't have a full understanding of the problem it seeks to address (saying it's the same as Tobega's doesn't help me),

The fundamental problem it seeks to address is avoiding type compatibility which accidentally results in error -- things like inadvertently JOINing an invoice ID and a product ID, because they're both named ID and have the same INTEGER type.

Apparently, this sort of thing is quite common in the SQL world, particularly when working with large and relatively unfamiliar schemas, like those in commercial bought-in products.

and (b) the extent to which you are addressing it isn't clear either.

If the DICTIONARY facility is used, each new entry is a unique type (unless explicitly declared otherwise.) Thus, inadvertent type compatibility issues are virtually eliminated. There is also value in having a data dictionary that identifies every data element / attribute, but that's a separate benefit.

You provide a shorthand for defining types that have a single possrep with a single possrep component, without defining any operators in addition to those systematically implied  by the type definition (such as THE_Value(Customer_Id)).  That tells me that solutions for the perceived problem are already available in TD as defined.  Do dictionaries offer any additional advantages?

Yes.

It uses an explicit data dictionary, which is of value for clearly identifying every possible data element / attribute.

It reduces verbosity, and simplifies gaining safety.

And, yes, you can do everything in Tutorial D as currently defined without the DICTIONARY facility. In fact, I often use it that way in Rel. But it's verbose, and you don't get the benefits of an explicit data dictionary.

Regarding the perceived problem, is it really just concerned with ill-advised comparisons, especially those that can arise "by accident", being implicitly involved in operations such as join?

That, and ill-advised mathematical operations on numbers, and ill-advised concatenation of unrelated (i.e., different type) strings, and so on.

In other words, it gives you all the benefits you gain from type safety in general. It's simply a means to encourage type safety where, arguably, type safety should be encouraged. We really shouldn't be treating a Customer ID and a Product ID as the same type, because they're not the same type, even though it's reasonable for both to be based on an integer or a string.

You might want to outlaw taking averages of part numbers too,

Yes, and my approach implicitly outlaws it.

but if anybody really wants to go out of their way to do that they probably do have some good reason!

If they have a good reason to do it, they can. This isn't allowed:

AVG(Customers, Customer_ID)

But this is allowed:

AVG(Customers, THE_Value(Customer_ID))

It has the added benefit of making it explicit what you're doing.
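
The same distinction can be mimicked in Python, purely as an analogy. The `Customer_ID` class below is a stand-in for the implicitly created type, with a single `value` component playing the role of the possrep component, and the built-in `sum` divided by `len` stands in for AVG; none of this is Tutorial D or Rel code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer_ID:
    value: int  # single component, mirroring POSSREP {Value INT}


customers = [Customer_ID(1), Customer_ID(2), Customer_ID(6)]

# Averaging the wrapped values directly is rejected: the wrapper has no arithmetic.
try:
    sum(customers) / len(customers)
    direct_ok = True
except TypeError:
    direct_ok = False

# Averaging the explicitly unwrapped values is allowed.
average = sum(c.value for c in customers) / len(customers)
```

The unwrapping step is the visible signal that the author really did intend to treat identifiers as plain numbers.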

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org