The Forum for Discussion about The Third Manifesto and Related Matters


'Phantom type' for Attribute-Value pairs, in "popular languages" [was C# as a D]

Quote from Dave Voorhis on June 16, 2020, 8:09 am

....

Having to forgo static typing and rely on a distinct language -- whether a "heading sublanguage" like your "S#,SNAME" string argument or <<S#,SNAME>> meta-language -- seems to be the standard and inevitable approach for TTM-compliant relational libraries. That's certainly what Rel's underlying relational library does. Though the meta-language in Rel's case is a full separate language, the underlying relational library is notionally very similar to yours and is no doubt similar to that in SIRA_PRISE, Duro, CsiDB, and so on.

That's what we all run into when implementing TTM constructs using usual popular programming languages like C#, Java, and C++.

Retaining static type guarantees in the host language without relying on meta-languages or sub-languages seems to inevitably require sacrificing or significantly changing the TTM operators.

 

Do these "usual popular languages" not have support for 'first-class phantom types'? The classic paper, but sorry I can't find better refs. Those refs are very much tied to Functional Programming, but I don't see there's anything FP-specific in phantom types. The reason for saying 'phantom' is that there's no value represented by the type: it's purely a type-level tag. Trying to put that in various ways in TTM terms:

  • A phantom type is a tagged union with only one possible tag; or
  • A phantom type is a named type whose set-of-values [RM Pre 1] is empty; or
  • A phantom type is a type whose set-of-values is a singleton, and because there's only one possible value it doesn't need a PhysRep.

In TUPLE{..., PNAME 'Delft', CITY 'Delft', ...}, those two attribute types are CHAR and have the same value; they could possibly have the same PhysRep. But we want to say they're distinct attributes. The PhysRep doesn't need to represent the attribute name, providing the type system can keep track of the attribute within the PhysRep for the TUPLE{ }.

In particular within a relation value, we don't want to make provision for a PhysRep for attribute names in tuples, because every tuple has the same set of attribute names and types for those attributes, and that's given by the relation's type.

In a language that supports phantom types, you could represent <A, T, v> triples at the type level

  • either as an ordered pair (_ :: A, v :: T), in which the :: introduces a type annotation, and _ is a don't care/no value here; or
  • (preferably) as A{ v :: T}, in which the A{ } is a TTM type generator, with the A being a type-level tag. There's no need for A{ } to be a TTM Selector, although it does no harm to think of it as such.
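For readers more at home in the "popular languages", here is a minimal sketch of the A{ v :: T } idea using Java generics. All the names (PhantomAttrDemo, Attr, City, PName, describeCity) are hypothetical and chosen only for illustration. The tag parameter A is never instantiated and never stored; since Java erases type arguments at run time, only the value occupies storage. This is an approximation of a phantom type, not a claim that Java supports them first-class.

```java
// Hypothetical names throughout; nothing here comes from an existing library.
final class PhantomAttrDemo {
    // Attribute-name tags: never instantiated, used only as type arguments.
    interface City {}
    interface PName {}

    // Attr<A, T> plays the role of A{ v :: T }: A is the phantom tag, T the value type.
    // Only the value is stored; the attribute name lives purely in the static type.
    static final class Attr<A, T> {
        final T value;
        Attr(T value) { this.value = value; }
    }

    // Accepts only a CITY attribute, not any other CHAR-valued attribute.
    static String describeCity(Attr<City, String> c) {
        return "CITY " + c.value;
    }

    public static void main(String[] args) {
        Attr<City, String> city = new Attr<>("Delft");
        Attr<PName, String> pname = new Attr<>("Delft");

        System.out.println(describeCity(city));              // prints "CITY Delft"
        System.out.println(city.value.equals(pname.value));  // prints "true": same underlying value
        // describeCity(pname);  // would not compile: Attr<PName, String> is not Attr<City, String>
    }
}
```

The two attributes carry the same CHAR value but cannot be confused by the compiler, which is the distinction wanted for PNAME 'Delft' versus CITY 'Delft'.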

Phantom types sit awkwardly amongst the TTM menagerie; it might be easier to categorise them as non-scalar.

If we can statically represent attribute names as phantoms, I have a possible approach for static type guarantees with the TTM relational operators -- but I suspect it needs type-level manipulations beyond the popular languages.

Quote from AntC on June 19, 2020, 5:56 am
Quote from Dave Voorhis on June 16, 2020, 8:09 am

....

Having to forgo static typing and rely on a distinct language -- whether a "heading sublanguage" like your "S#,SNAME" string argument or <<S#,SNAME>> meta-language -- seems to be the standard and inevitable approach for TTM-compliant relational libraries. That's certainly what Rel's underlying relational library does. Though the meta-language in Rel's case is a full separate language, the underlying relational library is notionally very similar to yours and is no doubt similar to that in SIRA_PRISE, Duro, CsiDB, and so on.

That's what we all run into when implementing TTM constructs using usual popular programming languages like C#, Java, and C++.

Retaining static type guarantees in the host language without relying on meta-languages or sub-languages seems to inevitably require sacrificing or significantly changing the TTM operators.

 

Do these "usual popular languages" not have support for 'first-class phantom types'? The classic paper, but sorry I can't find better refs. Those refs are very much tied to Functional Programming, but I don't see there's anything FP-specific in phantom types. The reason for saying 'phantom' is that there's no value represented by the type: it's purely a type-level tag. Trying to put that in various ways in TTM terms:

  • A phantom type is a tagged union with only one possible tag; or
  • A phantom type is a named type whose set-of-values [RM Pre 1] is empty; or
  • A phantom type is a type whose set-of-values is a singleton, and because there's only one possible value it doesn't need a PhysRep.

In TUPLE{..., PNAME 'Delft', CITY 'Delft', ...}, those two attribute types are CHAR and have the same value; they could possibly have the same PhysRep. But we want to say they're distinct attributes. The PhysRep doesn't need to represent the attribute name, providing the type system can keep track of the attribute within the PhysRep for the TUPLE{ }.

In particular within a relation value, we don't want to make provision for a PhysRep for attribute names in tuples, because every tuple has the same set of attribute names and types for those attributes, and that's given by the relation's type.

In a language that supports phantom types, you could represent <A, T, v> triples at the type level

  • either as an ordered pair (_ :: A, v :: T), in which the :: introduces a type annotation, and _ is a don't care/no value here; or
  • (preferably) as A{ v :: T}, in which the A{ } is a TTM type generator, with the A being a type-level tag. There's no need for A{ } to be a TTM Selector, although it does no harm to think of it as such.

Phantom types sit awkwardly amongst the TTM menagerie; it might be easier to categorise them as non-scalar.

If we can statically represent attribute names as phantoms, I have a possible approach for static type guarantees with the TTM relational operators -- but I suspect it needs type-level manipulations beyond the popular languages.

Probably, and there's no equivalent to phantom types -- at least as far as I know -- in any of the usual popular programming languages, i.e., C, C#, C++, Java, or Python.

It is possible to define types that are notionally "empty" -- interfaces and classes with no members -- but as I understand it (and them), that wouldn't help here.

The closest I've seen to supporting statically-typed parameters on the kinds of relational operators used in TTM involves use (or abuse) of templates, where a given category of template is defined with one parameter, two parameters, three parameters, four parameters and so on up to some hopefully-larger-than-you'll-ever-need number of parameters. This is used to support, say, statically-typed tuples of arbitrary degree.

Unfortunately, said "arbitrary" is usually forced -- perhaps by language implementation restrictions, and perhaps by practicality -- to mean rather restrictive limits, like at most 16.
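To make the template approach concrete, here is a rough Java sketch with one generic class per degree. The names (FixedArityTuples, Tuple1, Tuple2, Tuple3, of, extend) are hypothetical, and the sketch stops at degree 3 where a real library would carry on to its chosen limit. Note that components are positional here, not named, which is one of the ways this falls short of TTM tuples.

```java
// Hypothetical names; a sketch of the "one template per degree" technique.
final class FixedArityTuples {
    static final class Tuple1<A> {
        final A a;
        Tuple1(A a) { this.a = a; }
    }

    static final class Tuple2<A, B> {
        final A a; final B b;
        Tuple2(A a, B b) { this.a = a; this.b = b; }
    }

    static final class Tuple3<A, B, C> {
        final A a; final B b; final C c;
        Tuple3(A a, B b, C c) { this.a = a; this.b = b; this.c = c; }
    }
    // ... and so on: one hand-written (or generated) class per degree.

    static <A, B> Tuple2<A, B> of(A a, B b) { return new Tuple2<>(a, b); }
    static <A, B, C> Tuple3<A, B, C> of(A a, B b, C c) { return new Tuple3<>(a, b, c); }

    // Every operator that changes the degree needs one overload per degree too.
    static <A, B, C> Tuple3<A, B, C> extend(Tuple2<A, B> t, C c) {
        return new Tuple3<>(t.a, t.b, c);
    }

    public static void main(String[] args) {
        Tuple2<String, Integer> s = of("S1", 20);
        Tuple3<String, Integer, String> s3 = extend(s, "London");
        System.out.println(s3.a + ", " + s3.b + ", " + s3.c);  // prints "S1, 20, London"
    }
}
```

The result type of extend is checked statically, but only because there is an overload written for exactly that degree, which is where the practical upper limit comes from.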

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on June 19, 2020, 5:35 pm

It is possible to define types that are notionally "empty" -- interfaces and classes with no members --

That must be manifestly wrong as an answer to the question asked.

Given the equivalences as the TTM authors believe they should be, based on their "blunders" talk, "types that are notionally empty" should be held equivalent to "interfaces and classes with no ***INSTANCES***".

With no ***possible*** instances, that is.

"Popular languages" that I am familiar with, do support "singleton types", but the concept comes nowhere near to the Op's use case of "avoiding the need for a physrep for an attribute name".

Of which I continue to wonder whether it is a legit example of FP's "LPC conflation".  Avoiding having to encode attribute names in the physrep of tuple values is/may be/... a legitimate concern.  Note the "phys" and note how it makes the problem one that exists at the level of "phys" implementation.  Trying to address it by [providing the ability to] defining a "suitable" type (which is something that ***should*** be done only to address concerns at the ***logical*** level) seems very much indeed like solving a genuine real legit [phys-level !!!] problem at the wrong level of abstraction.

Quote from Erwin on June 19, 2020, 7:13 pm
Quote from Dave Voorhis on June 19, 2020, 5:35 pm

It is possible to define types that are notionally "empty" -- interfaces and classes with no members --

That must be manifestly wrong as an answer to the question asked.

Given the equivalences as the TTM authors believe they should be, based on their "blunders" talk, "types that are notionally empty" should be held equivalent to "interfaces and classes with no ***INSTANCES***".

With no ***possible*** instances, that is.

I guess... Though I was only riffing on the term "empty."

There is really no comparison between TTM empty types and classes/interfaces, or at least it's like comparing apples and oranges. Or, maybe more like comparing apples and basketballs.

Or something. The best answer to the original question is almost certainly "no."

I'm not clear what physical representations have to do with it.


Do these "usual popular languages" not have support for 'first-class phantom types'? The classic paper, but sorry I can't find better refs. Those refs are very much tied to Functional Programming, but I don't see there's anything FP-specific in phantom types. The reason for saying 'phantom' is that there's no value represented by the type: it's purely a type-level tag. Trying to put that in various ways in TTM terms:

In the languages I know, (a) types are not first class and (b) the choices are few: UDTs are basically classes with a few variants like interfaces (and enums in Java).

But if an attribute name is a class and inheritance is available then maybe:

// Sketch only: System.String is sealed in real C# and Integer is not a C# type,
// so AString/AInteger would have to wrap their values rather than inherit.
interface Attribute {}
abstract class AString : String,Attribute {}
abstract class AInteger : Integer,Attribute {}

class S_ : AString {}
class SNAME : AString {}
class STATUS : AInteger {}
class CITY : AString {}
class P_ : AString {}
class COLOR : AString {}


Show("Q2. Get names of suppliers who supply at least one red part.",
   S.Project<S_,SNAME>()
    .Join(SP.Project<S_,P_>())
    .Join(P.Project<P_,COLOR>())
    .Restrict<COLOR>(v => v == "Red")
    .Project<SNAME>());

This code is workable. I haven't thought through all the details.

  • A phantom type is a tagged union with only one possible tag; or
  • A phantom type is a named type whose set-of-values [RM Pre 1] is empty; or
  • A phantom type is a type whose set-of-values is a singleton, and because there's only one possible value it doesn't need a PhysRep.

Abstract classes have no values.

In TUPLE{..., PNAME 'Delft', CITY 'Delft', ...}, those two attribute types are CHAR and have the same value; they could possibly have the same PhysRep. But we want to say they're distinct attributes. The PhysRep doesn't need to represent the attribute name, providing the type system can keep track of the attribute within the PhysRep for the TUPLE{ }.

If the names are types, this works just fine.

TUPLE<PNAME,CITY>('Delft', 'Delft')

In particular within a relation value, we don't want to make provision for a PhysRep for attribute names in tuples, because every tuple has the same set of attribute names and types for those attributes, and that's given by the relation's type.

The relation and tuple share the same heading; the heading is a set of attributes; each attribute has a name and underlying type. Despite TTM it's not necessary that a relation ISA unique type, as long as it HASA unique heading. Individual relation/tuple types just don't seem to serve any useful purpose, except as a kind of holder for a heading.

In a language that supports phantom types, you could represent <A, T, v> triples at the type level

  • either as an ordered pair (_ :: A, v :: T), in which the :: introduces a type annotation, and _ is a don't care/no value here; or
  • (preferably) as A{ v :: T}, in which the A{ } is a TTM type generator, with the A being a type-level tag. There's no need for A{ } to be a TTM Selector, although it does no harm to think of it as such.

Phantom types sit awkwardly amongst the TTM menagerie; it might be easier to categorise them as non-scalar.

Headings/attributes are indeterminate non-types in TTM; you might just as well express them as phantom types without making this choice.

If we can statically represent attribute names as phantoms, I have a possible approach for static type guarantees with the TTM relational operators -- but I suspect it needs type-level manipulations beyond the popular languages.

It does, but it could potentially sit within the existing syntax and need only a modest compiler enhancement or pre-processor to implement type safety.

Andl - A New Database Language - andl.org
Quote from Erwin on June 19, 2020, 7:13 pm
Quote from Dave Voorhis on June 19, 2020, 5:35 pm

It is possible to define types that are notionally "empty" -- interfaces and classes with no members --

That must be manifestly wrong as an answer to the question asked.

Given the equivalences as the TTM authors believe they should be, based on their "blunders" talk, "types that are notionally empty" should be held equivalent to "interfaces and classes with no ***INSTANCES***".

With no ***possible*** instances, that is.

"Popular languages" that I am familiar with, do support "singleton types", but the concept comes nowhere near to the Op's use case of "avoiding the need for a physrep for an attribute name".

Of which I continue to wonder whether it is a legit example of FP's "LPC conflation".  Avoiding having to encode attribute names in the physrep of tuple values is/may be/... a legitimate concern.  Note the "phys" and note how it makes the problem one that exists at the level of "phys" implementation.  Trying to address it by [providing the ability to] defining a "suitable" type (which is something that ***should*** be done only to address concerns at the ***logical*** level) seems very much indeed like solving a genuine real legit [phys-level !!!] problem at the wrong level of abstraction.

Thank you both. I was expecting the answer 'no', so I was trying to explain the concept in as broad terms as possible, expecting it would be unfamiliar. Perhaps I overstretched.

The logical role of a phantom type is clear: it's a type or a component of a type that appears only at type level; there's no term-level expression that 'has' that type. I mentioned PhysRep to adapt the explanation to a TTM context. Nevertheless I don't see (as Dave pointed out in a later message) why the lack of a physical rep causes a "problem". PhysReps (whether or not they're present) are an implementation issue.

So to continue the TTM line of thought: PhysReps are mentioned wrt scalar types (every scalar value must have some PhysRep), but not wrt non-scalars. The type of a non-scalar is generated by TUP{ } or REL{ }, about whose type machinery TTM is silent. Attribute names and <A, T>, <A, T, v> pairs and triples can appear only within those type generators/value constructs, and of course we do not prescribe syntax.

I presume the reason for the reticence is to allow an implementation to choose a PhysRep for a non-scalar that doesn't have an identifiable component representing Attribute names(?)

So a Tutorial D/IM question:

  • Is it allowed to form a UNION type with only one type element?
  • Presumably that UNION type's name is distinct from the name of its based-on type(?)
  • Presumably the type still needs a tag to Select values of the UNION, even though all such values have MST of the based-on type(?)
  • What is the type of that tag/Selector?
  • Can I form a UNION type named CITY with Selector CITY that selects a CHAR? (How) is CITY('Delft') (or however you write it) distinguished from plain CHAR 'Delft'?
Quote from AntC on June 20, 2020, 2:58 am

So a Tutorial D/IM question:

  • Is it allowed to form a UNION type with only one type element?
  • Presumably that UNION type's name is distinct from the name of its based-on type(?)
  • Presumably the type still needs a tag to Select values of the UNION, even though all such values have MST of the based-on type(?)
  • What is the type of that tag/Selector?
  • Can I form a UNION type named CITY with Selector CITY that selects a CHAR? (How) is CITY('Delft') (or however you write it) distinguished from plain CHAR 'Delft'?

Re a UNION type with only one type element... You mean this?

TYPE Glub UNION;
TYPE Nada IS {Glub POSSREP {}};

TYPE_OF(Nada()) is Nada, i.e., the MST.

This is valid, so the Nada is a Glub:

VAR g Glub;
g := Nada();

I'm not clear what you're asking in your last bullet point.

 

Quote from Dave Voorhis on June 20, 2020, 11:14 am

I'm not clear what you're asking in your last bullet point.

Do you mean this?

TYPE Glob UNION;
TYPE City IS {Glob POSSREP {c CHAR}};

So this:

WRITELN City('Delft');
WRITELN 'Delft';
WRITELN TYPE_OF(City('Delft'));
WRITELN TYPE_OF('Delft');

City("Delft")
Delft
Scalar("City")
Scalar("CHARACTER")

Ok.

 

 

Quote from Dave Voorhis on June 20, 2020, 11:19 am
Quote from Dave Voorhis on June 20, 2020, 11:14 am

I'm not clear what you're asking in your last bullet point.

Do you mean this?

TYPE Glob UNION;
TYPE City IS {Glob POSSREP {c CHAR}};

So this:

WRITELN City('Delft');
WRITELN 'Delft';
WRITELN TYPE_OF(City('Delft'));
WRITELN TYPE_OF('Delft');

City("Delft")
Delft
Scalar("City")
Scalar("CHARACTER")

Ok.

 

 

Thanks Dave, yes that's kinda along the right lines. I'm puzzled what Glob is doing. I want only one based-on type (viz. CHAR) in the UNION, so could I name the UNION type the same as the Selector? Em, I think no, because TYPE City ... uses that name. In which case what is TYPE City ... doing that Glob isn't doing? Can't I go: TYPE CHAR is Glob; ?

I thought the syntax for UNION was to follow it immediately by an embraced commalist of types in the union(?) Or is the syntax you use equivalent?

Presuming there's a similar definition for type PName also based on CHAR but distinct from City and a function/operator Country_of( ) whose argument must be City, I want

VAR c City;                             // I think has to be type Glob?
c := City('Delft');
c := PName('Delft');                    // rejected: ill-typed
c := 'Delft';                           // i.e. CHAR rejected: ill-typed

cc := Country_of(c);                    // cc == 'Netherlands'
cc := Country_of(PName('Delft'));       // rejected: ill-typed
cc := Country_of('Delft');              // rejected: ill-typed

pc := Product_category(PName('Delft')); // pc == 'Tableware'
pc := Product_category(City('Delft'));  // rejected: ill-typed
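For comparison, the behaviour wanted above can be roughly sketched in Java with two distinct wrapper classes over the same underlying string. The names (DistinctNames, City, PName, countryOf, productCategory) are hypothetical, and the lookups are stubs invented purely so the example runs. Unlike a phantom tag, each wrapper here is an ordinary nominal type with its own values.

```java
// Hypothetical names; the lookup bodies are stubs for illustration only.
final class DistinctNames {
    // Two nominally distinct types over the same underlying representation.
    static final class City {
        final String c;
        City(String c) { this.c = c; }
    }

    static final class PName {
        final String p;
        PName(String p) { this.p = p; }
    }

    // Argument must be a City; a PName or a bare String is rejected by the compiler.
    static String countryOf(City city) {
        return city.c.equals("Delft") ? "Netherlands" : "unknown";
    }

    // Argument must be a PName.
    static String productCategory(PName name) {
        return name.p.equals("Delft") ? "Tableware" : "unknown";
    }

    public static void main(String[] args) {
        City c = new City("Delft");
        // c = new PName("Delft");             // would not compile: incompatible types
        // c = "Delft";                        // would not compile: String is not a City

        System.out.println(countryOf(c));                           // prints "Netherlands"
        // countryOf(new PName("Delft"));      // would not compile
        // countryOf("Delft");                 // would not compile

        System.out.println(productCategory(new PName("Delft")));    // prints "Tableware"
        // productCategory(new City("Delft")); // would not compile
    }
}
```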

 

Quote from AntC on June 21, 2020, 3:11 am
Quote from Dave Voorhis on June 20, 2020, 11:19 am
Quote from Dave Voorhis on June 20, 2020, 11:14 am

I'm not clear what you're asking in your last bullet point.

Do you mean this?

TYPE Glob UNION;
TYPE City IS {Glob POSSREP {c CHAR}};

So this:

WRITELN City('Delft');
WRITELN 'Delft';
WRITELN TYPE_OF(City('Delft'));
WRITELN TYPE_OF('Delft');

City("Delft")
Delft
Scalar("City")
Scalar("CHARACTER")

Ok.

 

 

Thanks Dave, yes that's kinda along the right lines. I'm puzzled what Glob is doing. I want only one based-on type (viz. CHAR) in the UNION, so could I name the UNION type the same as the Selector? Em, I think no, because TYPE City ... uses that name. In which case what is TYPE City ... doing that Glob isn't doing? Can't I go: TYPE CHAR is Glob; ?

No, you can't.

I don't think there's any use in a UNION type with a single element. There doesn't appear to be any point in having Glob, and for practical purposes I would just define TYPE City POSSREP {c CHAR}.

I was merely trying to implement what you described, but perhaps I misunderstood something.

I thought the syntax for UNION was to follow it immediately by an embraced commalist of types in the union(?) Or is the syntax you use equivalent?

No, the Tutorial D UNION has, to my knowledge, never used an embraced commalist of types in the union.

A more realistic application might be something like this:

TYPE TemperatureReading UNION;
TYPE TemperatureReading_Missing IS {TemperatureReading POSSREP {}};
TYPE TemperatureReading_Failed IS {TemperatureReading POSSREP {reason CHAR}};
TYPE TemperatureReading_Temperature IS {TemperatureReading POSSREP {temp RATIONAL}};
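
As a rough analogue in one of the popular languages, the same union can be sketched in Java as an interface with one implementing class per case. The names (TemperatureReadings, Missing, Failed, Temperature, describe) are hypothetical, and using double for RATIONAL is an assumption. A plain interface is open to further implementations; newer Java versions offer sealed interfaces to close the set, which is not shown here.

```java
// Hypothetical names; a sketch of a union type with three alternatives.
final class TemperatureReadings {
    // Plays the role of the UNION type: it has no values of its own.
    interface TemperatureReading {}

    // POSSREP {}: carries no components.
    static final class Missing implements TemperatureReading {}

    // POSSREP {reason CHAR}
    static final class Failed implements TemperatureReading {
        final String reason;
        Failed(String reason) { this.reason = reason; }
    }

    // POSSREP {temp RATIONAL}; double is an assumed stand-in for RATIONAL.
    static final class Temperature implements TemperatureReading {
        final double temp;
        Temperature(double temp) { this.temp = temp; }
    }

    // Dispatch on the most specific type of the reading.
    static String describe(TemperatureReading r) {
        if (r instanceof Missing) return "missing";
        if (r instanceof Failed) return "failed: " + ((Failed) r).reason;
        if (r instanceof Temperature) return "temperature: " + ((Temperature) r).temp;
        throw new IllegalArgumentException("unknown reading");
    }

    public static void main(String[] args) {
        System.out.println(describe(new Missing()));                 // prints "missing"
        System.out.println(describe(new Failed("sensor offline")));  // prints "failed: sensor offline"
        System.out.println(describe(new Temperature(21.5)));         // prints "temperature: 21.5"
    }
}
```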

 

Presuming there's a similar definition for type PName also based on CHAR but distinct from City and a function/operator Country_of( ) whose argument must be City, I want

VAR c City;                             // I think has to be type Glob?
c := City('Delft');
c := PName('Delft');                    // rejected: ill-typed
c := 'Delft';                           // i.e. CHAR rejected: ill-typed

cc := Country_of(c);                    // cc == 'Netherlands'
cc := Country_of(PName('Delft'));       // rejected: ill-typed
cc := Country_of('Delft');              // rejected: ill-typed

pc := Product_category(PName('Delft')); // pc == 'Tableware'
pc := Product_category(City('Delft'));  // rejected: ill-typed

 

Sorry, I'm not following this but haven't looked closely because I'm in a hurry to go out. Will look again when I return.
