The Forum for Discussion about The Third Manifesto and Related Matters

Please or Register to create posts and topics.

Which type?

PreviousPage 2 of 9Next
Quote from Vadim on November 10, 2021, 4:21 pm

Why a value has to have a single type attached to it, not many? The eternal "is square inherited from rectangle" debate hints that probably there has to be many. Then, as soon as we agreed that there are the two disjoint sets: objects and types, and the connection between them "an object x is associated with type T", then we can frame that tiresome "is square inherited from rectangle" debate into the Formal Concept Analysis. From that perspective a hierarchy of types is formally a lattice.

The other side of your question is why more than one type is needed for a database. IMO, an RDBMS which manipulates string values exclusively is perfectly fine. I would even argue that it is exactly the system that fits for teaching the Database introductory class.

SQLite does that, internally. But it's not enough to expose only a string type. It won't provide expected behaviour for comparisons and expression evaluation. Text strings cannot be used for binary blobs and vice versa. Strings used as numbers are not canonical. And so on.

Andl - A New Database Language - andl.org
Quote from Paul Vernon on November 10, 2021, 8:27 pm

To me,2 appearing in program text denotes, well, the value 2.  I'm not sure I need any more complexity than that.

You've just said

The value 2 is of type Integer, Rational, Real, Prime, Even, Positive, Imaginary etc. I'm not sure that such a multitude should be axiomatic.

We shouldn't use machine representation (PhysRep -- that is, we should use PossReps) to justify differences, but scare-quotes "the" value 2 will have different PhysReps for at least some of those PossReps. So there isn't a single value denoted by 2 appearing in program text. Furthermore we can discriminate within a program (these examples in Haskell):

(2 :: Int) + 2 returns 4 of type Int.

(2 :: Float) + 2 returns 4.0 of type Float. By convention, Haskell's String representation for Floats always shows decimals.

(2 :: Float) / 2 returns 1.0 of type Float.

(2 :: Int) / 2 returns type error: operation / is defined only for numeric types supporting floating-point.

(There are different operations div, mod for operating on integral numerics.)

 

BTW I would not allow any other thing to denote the value 2. Not 02 nor 2.0 nor +2 nor ²⁄₁ nor 2+0i nor, for that matter 1+1

Haskell allows any number of leading zeroes. As I said, trailing decimals denotes floating-point, but they're optional if the program textual context tells us it's floating. So there's plenty of textually different ways to denote one value -- 'one value' in the sense (2.0 :: Float) == 0000002 returns True.

I think it's TTM causing this confusion, by asserting there's a universe of values logically prior to the type system. It can be made to work, but it's messy.

Quote from dandl on November 11, 2021, 12:48 am

[...]

In all the database models I can ever remember seeing, I have never found the need for more than 9 scalar types: boolean, integer, real, decimal, datetime, text string, binary string, enum, struct. (And I regard value inheritance as a pointless thought experiment.)

A programming language may have many more types, but IMO it should support those 9 as scalar value types if it is to be a database programming language (which is the aim of TTM/D). I don't know any that do.

I don't know any popular programming languages that don't support those as user-defined types in a library.

That's the crucial requirement, really -- not that those 9 (give or take) types be baked into the language, but that the language allow the user to define them, along with any other conceivable types that may be deemed desirable, and have them treated as notionally equivalent to built-in-to-the-language types, modulo the usual primitive vs non-primitive type issues.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Paul Vernon on November 10, 2021, 11:04 pm

I would use = as per normal practice and have both sides evaluated before doing the comparison.

A type system? Not really. Sure I would have a set of strings and "a" would be a member, but I'ld also have a set of English dictionary words, and "a" would be a member of that too, but not all members of the string set would be members of the word set. I'ld have a subset of strings that correspond to "valid Unicode code-point sequences", and many other interesting subsets of the string set.

What I would not do is point to one set and say "that set is a type" and another set and say "that set is not a type". Not in any fundamental way anyway.  I mean a user might say "what types of values does your database support, and we might reply with, well we have sets, ordered pairs and the atoms from the following non-overlapping named sets: booleans, bits, integers, non-integer-rationals, non-rational-constructable-reals, valid-unicode-strings, invaild-unicode-codpoint-sequences, etc etc. We might even go so far as to say those sets define the "type" of a value, but I'm pretty sure you could not accuse such a thing as being a "type system" (If you can, well, then I guess I am guilty of having a type system even if all that means is that there exists two or more sets that partition the atomic values)

That is a type system.

You could perhaps argue that you have no type system if the only scalar 'type' is a binary string and the only operation on binary strings is a test for equality. That would be sufficient to implement a relational model.

It's an approach taken with some NoSQL systems like the Berkeley DB.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

If the only scalar 'type' is a binary string and the only operation on binary strings is a test for equality, then you have got two sets - the set of result values from equality (presumably binary string 0 and binary string 1), and the set of all binary strings (i.e. the set of possible inputs to the equality test). That would mean you have two types right? One you might call, oh say Boolean and the other say Everything.  Hence even in such a limited system, you have a "type system".

If it is not really possible to not have a "type system", then I am left unsure about the usefulness of the concept of "type system".

If the presence of a "type system" is directly implied from an axiom of atoms (i.e the existence of scalars) and an of axiom sets (i.e. the existence of collections). Hence the existence of two sets of  things - a set of scalars and a set of sets. Again, I am left unsure about the usefulness of the concept of "type system".

Not that that worries me. I don't find the concept very useful anyway. Very happy to dispense with it! :-)

Quote from Paul Vernon on November 11, 2021, 10:11 am

If the only scalar 'type' is a binary string and the only operation on binary strings is a test for equality, then you have got two sets - the set of result values from equality (presumably binary string 0 and binary string 1), and the set of all binary strings (i.e. the set of possible inputs to the equality test). That would mean you have two types right? One you might call, oh say Boolean and the other say Everything.  Hence even in such a limited system, you have a "type system".

If it is not really possible to not have a "type system", then I am left unsure about the usefulness of the concept of "type system".

If the presence of a "type system" is directly implied from an axiom of atoms (i.e the existence of scalars) and an of axiom sets (i.e. the existence of collections). Hence the existence of two sets of  things - a set of scalars and a set of sets. Again, I am left unsure about the usefulness of the concept of "type system".

Not that that worries me. I don't find the concept very useful anyway. Very happy to dispense with it! :-)

Yes, that is a type system; but (the only) one where you could (perhaps) argue that it isn't because the scalar types aren't distinguishable.

But yes, if (opaque) scalar types are distinguished from non-scalar types, then you have a type system.

Even machine code and languages like FORTH that have no automated type checking have type systems because types are distinguishable -- though all distinguishing of types, and type checking, may only be done by the human using them.

The concept is practically useful when you wish to implement automated type checking. Then you need to consider what are valid and invalid operations on various values of various types, how they relate to the constructs of your language, and so forth.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

(this post follows on from post #15)

In other words, I'm happy to suggest that it is probably a good practice  to have a set of named sets, where every value is a member of exactly one of them.  (In NFU terms that would be a set of pair-wise disjoint sets, the (boolean) union of which is equal to the universal set, and with a set of pairs that give a (nice, short) name for each set.).

However, I would not be happy to have such a suggestion embedded as an axiom. I.e. I reject the notion that values by definition "carry with them" some information (TTM says "some identification") about what type or types they are "of" (TTM says "of the type to which they belong").

Such information, if it is warranted, needs to be held outside of the values themselves.  Via the information principle, information about "types" needs to be represented as explicit values within the database.

The particular design of that information within the database can be discussed. Very likely we would bandy the word "type" about in such discussion. Certainly we would be interested in seeing which sets are subsets of other sets, and note that such things can enable hierarchies (and other structures) of types. But all that is really just "database design". Breaking the artifice of a necessary fundamental split of information into "what is type" and "what is schema" and "what is data" is, IHMO, a great simplification opportunity.

Quote from Dave Voorhis on November 11, 2021, 10:25 am

The concept is practically useful when you wish to implement automated type checking. Then you need to consider what are valid and invalid operations on various values of various types, how they relate to the constructs of your language, and so forth.

So, back to "type safety" then as the only reason to have a type system. Assuming one can have a system of safety without axiomatic types (and I can't see why you could not), then I'm still searching for any other justification for the foundational role of types in any computer system (well, at least in either database systems or "programming languages" - assuming that there is even a meaningful difference between those two)

Quote from Paul Vernon on November 11, 2021, 10:11 am

If the only scalar 'type' is a binary string and the only operation on binary strings is a test for equality, then you have got two sets - the set of result values from equality (presumably binary string 0 and binary string 1), and the set of all binary strings (i.e. the set of possible inputs to the equality test). That would mean you have two types right? One you might call, oh say Boolean and the other say Everything.  Hence even in such a limited system, you have a "type system".

If it is not really possible to not have a "type system", then I am left unsure about the usefulness of the concept of "type system".

Arguably BCPL (the forerunner to C) didn't have a "type system". From its Reference Manual 1976: "BCPL is based on a simple storage model which consists of a set of consecutively addressed cells ...All cells are of uniform size, and hold a binary bit pattern called a value ... A value is the only primitive data type in BCPL, but it is used to represent an integer, character, truth value, address ..."

The concept of "type system" is useful to talk about different languages' semantics: Static vs dynamic typing; type inference; type erasure -- this it, is there anything inside a compiled program denoting a type?

If you have only Everything other than Boolean, then every time a program goes to add two Everythings, it has to first check whether the contents of the binary string could be interpreted as a number. Or (and this was perfectly possible in BCPL) you could apply integer addition (2's complement) between a bit pattern representing a character and a bit pattern representing a float. Forms of this was even officially sanctioned: add an integer 9 to the character representation for character '0', to get character '9'. Which is fine until you add an integer 24 to '0' and get an upper-case letter.

Then the usefulness o0f a "type system" is to catch a significant number of mistakes.

If the presence of a "type system" is directly implied from an axiom of atoms (i.e the existence of scalars) and an of axiom sets (i.e. the existence of collections). Hence the existence of two sets of  things - a set of scalars and a set of sets. Again, I am left unsure about the usefulness of the concept of "type system".

No, you're talking only about the TTM Type system. Other languages have other axioms or other ways to introduce types and values. Most modern languages have semantics for the types of functions/operators and applying them in expressions, as well as for scalars and data structures, and for polymorphism/overloading/generics.

Not that that worries me. I don't find the concept very useful anyway. Very happy to dispense with it! :-)

I should have asked this earlier: which programming languages do you know/what type systems do they use?

I these days just would not use a programming language (nor a DBMS) without a sophisticated type system.

Quote from Paul Vernon on November 11, 2021, 11:11 am
Quote from Dave Voorhis on November 11, 2021, 10:25 am

The concept is practically useful when you wish to implement automated type checking. Then you need to consider what are valid and invalid operations on various values of various types, how they relate to the constructs of your language, and so forth.

So, back to "type safety" then as the only reason to have a type system.

If you have different kinds of values, you have a type system.

If you have different kinds of constructs that can be syntactically used in the same place, you have a type system.

A type system is essentially unavoidable, unless every value -- or even every interchangeable language construct -- is an opaque blob, indistinguishable from every other.

Assuming one can have a system of safety without axiomatic types (and I can't see why you could not), then I'm still searching for any other justification for the foundational role of types in any computer system (well, at least in either database systems or "programming languages" - assuming that there is even a meaningful difference between those two)

It's not so much that they have a foundational role out of intent or necessity, but that in most practical systems, they have a foundational role out of inevitability. You have different kinds of constructs -- such as values -- and there is a notion of which of them are "right" or "wrong" in some context. Ergo, type system.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
PreviousPage 2 of 9Next