First and second class citizens in TTM
Quote from AntC on June 27, 2019, 11:37 am
Quote from Dave Voorhis on June 27, 2019, 7:01 am
Quote from AntC on June 26, 2019, 10:59 pm
Quote from Hugh on June 26, 2019, 11:08 am
in its support for UDT definitions. So do your restricted TUPLE{...} expressions. How do you get off the ground, so to speak, without any system-defined scalar types?
...
Types "get off the ground" because type declarations introduce both the type name and names for all the member values. The full generality of type declarations allows building types from already-defined types (as with Tutorial D user-defined types), but you need to start with the bootlaces.
data Bool = False | True

is the (usual/library-supplied) declaration for type-name Bool (the definiens) with member values False, True (sequenced that way round so that False evaluates to less than True). The keyword data (which you've complained about before: why not type?) is because False, True are data values. (Other declarations declare, for example, type aliases, which don't mention data values.)

The | separator between the member values introduces a 'sum type' aka 'tagged union', as in sum-of-product types. It is (theoretically) how all base types are defined:

data Int  = -9223372036854775808 | ... | -2 | -1 | 0 | 1 | 2 | ... | 9223372036854775807
data Char = '\NUL' | ... | ' ' | 'A' | 'B' | ... | 'a' | 'b' | ... | '\1114111'
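As a sketch of that story: a user-defined enumeration declared the same way behaves just like the library-declared Bool, with the ordering falling out of declaration order. (The Light type and its member values here are invented for illustration.)

```haskell
-- Hypothetical enumeration, declared the same way the Prelude declares Bool.
-- Deriving Ord gives an ordering that follows declaration order,
-- just as False < True falls out of 'data Bool = False | True'.
data Light = Red | Amber | Green
  deriving (Eq, Ord, Show, Enum, Bounded)

main :: IO ()
main = do
  print (False < True)                   -- True: ordering of the library-declared Bool
  print (Red < Green)                    -- True: same rule for the user-defined type
  print [minBound .. maxBound :: Light]  -- [Red,Amber,Green]: members in declared order
```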
If the above is "(theoretically) how all base types are defined", how are base types like Int and Char actually defined?
I've described how the semantics of Haskell is presented to learners. I'm sure there is some simplification for tutorial purposes. And there has to be some eliding of the truth to yank on the bootstraps. At least, that definition for Bool you can see verbatim in the language Prelude, that is, the standard/usual library that comes with every compiler.

Is it compiler magic (i.e., baked into the compiler), or do definitions like the above for Int and Char live somewhere in the Haskell standard library (or equivalent)?
The language standard is clearly split into two parts: 1) the formal syntax and semantics; 2) the standard libraries (which include declarations for Bool, Char, Int, Float and the usually expected numerical operations, including comparisons yielding Bool). Every compiler is expected to support/implement the standard libraries, even if some program chooses not to import them (or not all of them). Furthermore, in practice every compiler takes advantage of seeing library-defined types to generate more efficient machine code.

The syntax for Int, Float, Char (and String) literals, and tuples (and a few other base types), is not valid for user-defined values (which must be simple names). So yes, there is compiler magic to recognise that special syntax and turn it into internal representations of those values. But from then on, those values behave as if user-defined. A (non-parameterised/non-polymorphic) type is just a set of values, per RM Pre 1; and each possrep must belong to exactly one type, per RM Pre 2.

Either way, the answer to Hugh's question ("How do you get off the ground, so to speak, without any system-defined scalar types?") appears to be that numeric and character literals are special, baked into the compiler, and at least (if it's the latter case) predefined (by the compiler) to be notionally "typeful" to the extent that, for example, -9223372036854775808 | ... | -2 is recognised to represent a range of ordinal numeric values.
Or is that not how it works?
Two things: a) the range of type Int (in the standard library) is implementation-dependent. That's the range on my machine (64-bit). The language standard requires the range to cover at least [-2^29, 2^29 - 1]. The range of type Integer (in the standard library) is arbitrary precision (typically implemented with a C multiple-precision library). Similarly the Char encoding is implementation-dependent (expected to be at least UTF-8 compliant).

But b) no, it's not 'baked in', in the sense that if you really, really want to build your own model of numeric types, the compiler will help you do that. (And that is a realistic use case: people use their own representations if they want more precision than Int or Float, Double but better efficiency than the IEEE standards -- for example to do fancy array manipulations.) The key thing is that fromInteger I mentioned (and mis-spelt); there's also a fromRational. Those convert from a secret/implementation-defined numeric format to your custom format. You must supply definitions/overloadings for those two functions, as well as declaring your numeric types. (And probably those declarations will be machine-level, so you're using some escape-hatch 'Foreign Function Interface' to get outside official Haskell.)

Token 1234 appearing in a program is syntactic sugar for fromInteger 1234. So the token 1234 must appear with a type annotation or typeful context that gives the return type of the expression, in order for the compiler to resolve which overloading of method fromInteger to apply. These are all valid source code:

x = 1234 :: Int    -- type annotation on the value; then 1234.0 not valid
y = 1234 :: Float  -- could be written 1234.0
z :: Double        -- type annotation on the variable
z = 1234
w = sqrt z         -- type of w inferred from function sqrt applied to arg z

But I didn't say: in fromInteger 1234, what's the type of that 1234? Again it's determined from the typeful context. Although method fromInteger is polymorphic/overloaded in its return type, its argument type is compiler-determined (and a user library can't override that).

If the explanation above is still not technical enough, you've probably exhausted the depths of my understanding. (As an end-user, I'm happy to accept the tutorial explanation, whilst being aware it's something of a fairy-story.)