Which Reality?
Quote from Paul Vernon on December 4, 2021, 11:12 pm
Quote from Erwin on December 4, 2021, 9:39 pm
Quote from Paul Vernon on December 4, 2021, 12:20 pm
I would say that if you "add semantics to a "bare" individual" you have a new individual.
I would say this is not respecting the distinction that DBE was trying to make between 'individuals' and 'values' : I would say that any act of "adding semantics" (in the sense as it arose in this discussion) is precisely what gets you from 'individual' to 'value', therefore the result of that act can ***never*** be "a new individual" (nor should it).
I agree (or agree enough anyway)
Even if the semantics added are as "superficial" as "we're talking just any number here" as opposed to "we're talking squares of weights here".
Still, however, I just don't think we need types to add semantics. We have already got attribute names and they can hold a good amount of semantics, and we have individuals which we can put into sets (inside the database) to allow them to have semantics. I just don't see the need for the extra concept of type - it is just somewhere else to stuff semantics. We are better off (things are simpler) if all we have are individuals and the ability to pair them up with names in attributes in tuples in the database.
Maybe what I need to be shown is an incontrovertible example (just one would do) where a given individual has to be paired with a type name to disambiguate it from the same individual that means something different when standing alone without a type pairing.
The individual 17 that is the square of weights is not such an example for me. For me, that is a different individual. 17 kg² is the (literal for the) individual that corresponds to the hypothetical thing "17 square kilograms".
Quote from Paul Vernon on December 4, 2021, 11:27 pm
Quote from Erwin on December 4, 2021, 10:26 pm
So I have felt for a very long time that replacing that one single "workflow progress status code" with as many relvars as there are steps in the workflow to express "this step has been done (plus when and by whom)" is the way to go. But I didn't want to be declared even more nuts than people were already doing at the time, so I've always shut up about it - until now.
I'll respond more fully tomorrow, but I just want to quickly say: YES! I very much agree that any attribute I ever see in a database that includes the word Status invariably makes me suspect that it is a poor design that is not really capturing the full picture.
It is an interesting topic, although maybe not the best example for me to pick, as we could easily get distracted. Maybe a better example might be, oh I don't know, Currency Code?
Quote from Erwin on December 5, 2021, 1:07 am
Quote from Paul Vernon on December 4, 2021, 11:27 pm
as we could easily get distracted. Maybe a better example might be, oh I don't know, Currency Code?
I sort of long for the days when we did nothing else here but get distracted, eurhm, well, "sidetracked". The days when it was a discussion list, not the more formal forum that it is now.
"Currency code" is yet a different beast, I think, and not a very problematic one, methinks. Monetary units exist so there is something logical like "the domain of all currencies" and that naturally translates to there having to be some kind of data type "currency identifier" so we can record currency properties (e.g. "exists in cash form", which was NO for the EUR during a period of 3 yrs). I'm not talking of the phenomenon of various distinct encodings that don't even map one to one, such as e.g. the FRF which corresponded to two distinct values in the numerical encoding : one for the FRF as used in France and another for the FRF as used in Guadeloupe. (IIRC those numerical encodings also encompassed the distinction between physical money and, euh, non-physical, so you had 850 for the non-physical Guadeloupian FRF and 851 for the physical Guadeloupian FRF. Or some such.)
Quote from Erwin on December 5, 2021, 1:13 am
Quote from Paul Vernon on December 4, 2021, 11:12 pm
Even if the semantics added are as "superficial" as "we're talking just any number here" as opposed to "we're talking squares of weights here".
Still, however, I just don't think we need types to add semantics. We have already got attribute names and they can hold a good amount of semantics
BZZZZZZZZZ ...
Attribute names we only have when there are relations. "Type safety" in programming languages / compilers seems to me to be exactly about how to retain those very semantics when there are no relations in sight, as in pieces of code that just do some scalar computation.
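As a minimal sketch of the type-safety point (not something posted in the thread), the Python below wraps bare numbers in two hypothetical classes, `Kilograms` and `SquareKilograms`, so that the "square of weights" meaning survives a scalar computation with no relation or attribute name in sight. The class names, the `squared` method, and the `try_add` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kilograms:
    """A weight; the bare number alone does not say this."""
    amount: float

    def __add__(self, other: "Kilograms") -> "Kilograms":
        # Only weights may be added to weights.
        if not isinstance(other, Kilograms):
            return NotImplemented
        return Kilograms(self.amount + other.amount)

    def squared(self) -> "SquareKilograms":
        # Squaring a weight yields a value of a different type.
        return SquareKilograms(self.amount * self.amount)


@dataclass(frozen=True)
class SquareKilograms:
    """A squared weight, e.g. 17 kg²."""
    amount: float


def try_add(a, b):
    """Return a + b, or None if the types do not allow the addition."""
    try:
        return a + b
    except TypeError:
        return None
```

Here `Kilograms(17)` and `SquareKilograms(17)` carry the same number but do not compare equal, and adding one to the other is rejected rather than silently producing 34.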
Quote from dandl on December 5, 2021, 7:37 am
Quote from Paul Vernon on December 4, 2021, 12:53 pm
Quote from Dave Voorhis on December 4, 2021, 12:26 am
<aside>
That reminds me a bit of Cyc (see https://en.wikipedia.org/wiki/Cyc)
Some years ago, I used OpenCyc in a research project that for a couple of years was the basis for a public Web site's semantic search feature. Like much of Artificial Intelligence (I use the term loosely) R&D, on some things it was so good it was creepy; on other things, hopelessly bad.
</aside>
Dave. I'm not sure how you got from "SDR" to Cyc, but that is a good link.
@David, do you have a link specific to Sparse Data Representation? The https://en.wikipedia.org/wiki/Sparse_approximation page did not do much for me on first glance.
The thing that got me interested is work by Numenta: https://en.wikipedia.org/wiki/Numenta. They have software and academic papers about the operation of columns of neurons in the cortex.
The main thing is that it suggests a model of computation quite unlike von Neumann computers. SDR's are also used in image processing, spatial location and some other places, but it's really early days, so not much about them as yet.
Quote from Paul Vernon on December 5, 2021, 10:41 am
Quote from Erwin on December 5, 2021, 1:13 am
Attribute names we only have when there are relations. "Type safety" in programming languages / compilers seems to me to be exactly about how to retain those very semantics when there are no relations in sight, as in pieces of code that just do some scalar computation.
That is all true. However, as per the Information Principle, all we (should) have are relations. Why do we hanker after what the Joneses have? We don't need variables (plural), we just need our one database variable, and that holds tuples with attributes with attribute names. Why do we need "scalar computation" when we have our relational operators (and functions as relations)?
As an aside, I'm not aware of any programming languages where variable names are not fully arbitrary. I.e. I don't think there is a programming language that would complain if you create a variable named, say, SalesDate but made it of type TIMESTAMP, or a variable named i but put character values in it. I would be interested in counter examples. I'll take maybe a good example of "linting" or other external conformance checking, but would be really interested in any language example that enforced semantic variable naming "conventions".
Quote from AntC on December 5, 2021, 11:01 am
Quote from Paul Vernon on December 5, 2021, 10:41 am
Quote from Erwin on December 5, 2021, 1:13 am
Attribute names we only have when there are relations. ....
As an aside, I'm not aware of any programming languages where variable names are not fully arbitrary.
FORTRAN IV (and I think some early versions of Basic followed suit), slightly liberalised Fortran II:
"There are no "type" declarations available: variables whose name starts with I, J, K, L, M, or N are "fixed-point" (i.e. integers), otherwise floating-point. ... The name of a variable must start with a letter and can continue with both letters and digits, up to a limit of six characters ..."
I.e. I don't think there is a programming language that would complain if you create a variable named, say, SalesDate but made it of type TIMESTAMP, or a variable named i but put character values in it. I would be interested in counter examples. I'll take maybe a good example of "linting" or other external conformance checking, but would be really interested in any language example that enforced semantic variable naming "conventions"
By the time of FORTRAN IV, you could override that default with >gasp< a type declaration. But folk seldom bothered.
Later FORTRAN even allowed variables to hold types other than numbers. Outrage!
IIRC you started posting here by conceding you knew very few programming languages. Then I'd advise you avoid claims beginning "I'm not aware of any programming languages ...". Because no matter how bizarre of a 'feature' you can dream up, it's likely some programming language does that. See 'Esoteric Programming Languages'. I've always loved INTERCAL's COME FROM control-flow.
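The naming rule quoted in this post is easy to state as code. The following Python sketch is purely illustrative: the function name `implicit_fortran_type` is made up here, and it models only the rule as quoted, not a real compiler. It derives a variable's type from the first letter of its name.

```python
def implicit_fortran_type(name: str) -> str:
    """Return the type implied by a variable name under the quoted rule.

    Names starting with I, J, K, L, M or N are integers ("fixed-point");
    all other names are floating-point. Names must start with a letter,
    continue with letters or digits, and be at most six characters long.
    """
    if not (1 <= len(name) <= 6):
        raise ValueError(f"name must be 1 to 6 characters: {name!r}")
    if not name.isascii() or not name[0].isalpha() or not name.isalnum():
        raise ValueError(f"name must be a letter followed by letters/digits: {name!r}")
    # The first letter alone decides the type.
    return "INTEGER" if name[0].upper() in "IJKLMN" else "REAL"
```

Under this rule the name `I` forces an integer, which is the kind of name-driven typing the quoted question asks about.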
Quote from Paul Vernon on December 5, 2021, 1:55 pm
Quote from AntC on December 5, 2021, 11:01 am
IIRC you started posting here by conceding you knew very few programming languages. Then I'd advise you avoid claims beginning "I'm not aware of any programming languages ...". Because no matter how bizarre of a 'feature' you can dream up, it's likely some programming language does that. See 'Esoteric Programming Languages'. I've always loved INTERCAL's COME FROM control-flow.
Well I did post here 16 years ago. This was my first contribution (and it did also spark a 2nd thread #1, although I did not reply to that one at the time). Indeed my real postings back then number 7.
I'd also note that my claim is strictly true. I.e. that I myself am not aware. Well, I could be lying (or be truly deluded, or more likely forgetful), I suppose. But still, I get your point. It is indeed true that there are many bizarre things around.
I'll also note that I'm not sure I exactly said that "I know very few programming languages" - well I may have. I certainly know that there are a lot of programming languages, and by any reasonable definition of the word few, I indeed only know something of a few of them all. It's maybe a bitter sweet part of learning more - the more you learn, the more you learn of things that you don't know about. Hence the percentage of what you know only ever decreases as you learn more. Only the truly ignorant know everything :-)
P.S. Thanks for the link. I particularly like the (new to me) term Turing tarpit - that indeed is a place to avoid. Humm
P.P.S. and this as a good example of how "hackable" JavaScript is
Quote from Paul Vernon on December 5, 2021, 3:24 pm
Quote from Erwin on December 5, 2021, 1:07 am
Quote from Paul Vernon on December 4, 2021, 11:27 pm
as we could easily get distracted. Maybe a better example might be, oh I don't know, Currency Code?
I sort of long for the days when we did nothing else here but get distracted, eurhm, well, "sidetracked". The days when it was discussion list, not the more formal forum that it is now.
"Currency code" is yet a different beast, I think, and not a very problematic one, methinks. Monetary units exist so there is something logical like "the domain of all currencies" and that naturally translates to there having to be some kind of data type "currency identifier" so we can record currency properties (e.g. "exists in cash form", which was NO for the EUR during a period of 3 yrs). I'm not talking of the phenomenon of various distinct encodings that don't even map one to one, such as e.g. the FRF which corresponded to two distinct values in the numerical encoding : one for the FRF as used in France and another for the FRF as used in Guadeloupe. (IIRC those numerical encodings also encompassed the distinction between physical money and, euh, non-physical, so you had 850 for the non-physical Guadeloupian FRF and 851 for the physical Guadeloupian FRF. Or some such.)
I said I would reply today, so I guess I should.
On your point about attributes named "something Status", can I also submit a similar "complaint" about attributes named "something type". Such attributes also raise big suspicions about a poor data model (or, at least, unimaginative attribute naming). To take an old example :
VAR Phone BASE RELATION { Name CHAR, Type CHAR, Phone# Phone# } KEY { Name, Type };
CONSTRAINT HomeWorkorCell Phone{Type} <= RELATION {TUPLE {Type 'home'}, TUPLE {Type 'work'}, TUPLE {Type 'cell'}};
(where "<=" means "is a subset of").
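For readers less used to Tutorial D, the constraint above can be approximated in Python as a set-inclusion check on the projection of `Phone` over `Type`. This is only a sketch: the tuple layout, the sample data in the usage note, and the helper name `satisfies_home_work_or_cell` are assumptions made for illustration.

```python
# Allowed values for the Type attribute, as listed in the constraint.
ALLOWED_TYPES = {"home", "work", "cell"}


def satisfies_home_work_or_cell(phone):
    """Check that the projection of phone on Type is a subset of ALLOWED_TYPES.

    `phone` is a set of (name, type, number) tuples standing in for the relvar.
    """
    projected_types = {type_ for (_name, type_, _number) in phone}
    return projected_types <= ALLOWED_TYPES
```

A relation containing a tuple with Type 'fax' would fail the check, while an empty relation passes it trivially.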
A "PhoneType" is not a great name (and, arguably, "Type" is worse). I have worked with "logical" modellers of a philosophical bent that would name such a thing as maybe "PhoneClassificationByLocationOrLocationVariablity", which I always thought was a much more thoughtful kind of name. Still, that overlooks many things, not least that 'cell' is really a technology (i.e. one key feature of a cellular network), not really about location. 'mobile' would have been a better value, and (of course) you can have mobile phones at work or at home, or elsewhere...
On your currency code information, I guess I'd want to research into that (https://en.wikipedia.org/wiki/ISO_4217 only lists one FRF), but I could well believe it.
Still, my point was more the question if the codes should be part of a type, and if so, would the literals be just strings (i.e. currency code would be a sub-type of the set of strings), or should they be 'new' literals, ¤FRF say (or, but I sort of think this is cheating because it looks like a function, not a literal, to me: CURRENCY_CODE('FRF')).
The ISO call them "alphabetic codes" as far as I can see, so no real indication if they consider them "just" strings, or "something else" that is not strings.
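One way to picture the two options is the Python sketch below, in which a hypothetical `CurrencyCode` class plays the role of `CURRENCY_CODE('FRF')`: its values are built from strings but are not themselves strings. The class, its three-letter shape check, and the printed form are assumptions for illustration; the check does not consult the real ISO 4217 list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CurrencyCode:
    """A currency code value that is deliberately not a str."""
    code: str

    def __post_init__(self):
        # Shape check only: three upper-case ASCII letters.
        if not (len(self.code) == 3 and self.code.isascii()
                and self.code.isalpha() and self.code.isupper()):
            raise ValueError(f"not a three-letter code: {self.code!r}")

    def __str__(self) -> str:
        # Print using the literal form suggested in the post.
        return "¤" + self.code
```

With this design `CurrencyCode("FRF")` never compares equal to the plain string `"FRF"`; making `CurrencyCode` a subclass of `str` instead would correspond to the "sub-type of the set of strings" option.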
Quote from Paul Vernon on December 5, 2021, 3:35 pm
Quote from Erwin on December 4, 2021, 10:26 pm
More research is needed. ( :-) )
And then there's the fact that a relation schema that has a boolean attribute X is provably information-equivalent to a design with two relation schema's that both have all the attributes of the single relation schema except X (OPEN_ACCOUNTS and CLOSED_ACCOUNTS, say). (As long as : if X does not participate in all keys of its relation, then an empty-intersection constraint between the two alternative relvars must also be declared.)
Oh yes, relation names. Yet another place to "stuff" semantics.
The solution that I see is not so much more research as simply fewer places where we can put our semantics. With fewer options, it's more likely two different people will consistently pick the same solution - more likely that there might even be a single "obvious", "natural" way to store a given set of information/data.
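The information-equivalence claim quoted in this post can be checked mechanically on a small example. The Python sketch below is an illustration under assumptions: relations are modelled as sets of tuples, the boolean attribute is the last component, the key is assumed to be every attribute except that flag, and the function names `split_on_flag` and `merge_with_flag` are invented here. It also shows why the empty-intersection constraint matters: without it the same account could be recorded as both open and closed.

```python
def split_on_flag(accounts):
    """Split {(account_id, owner, is_open)} into (open, closed) without the flag."""
    open_accounts = {(acc, owner) for (acc, owner, is_open) in accounts if is_open}
    closed_accounts = {(acc, owner) for (acc, owner, is_open) in accounts if not is_open}
    return open_accounts, closed_accounts


def merge_with_flag(open_accounts, closed_accounts):
    """Rebuild the single relation, enforcing the empty-intersection constraint."""
    if open_accounts & closed_accounts:
        raise ValueError("OPEN_ACCOUNTS and CLOSED_ACCOUNTS must be disjoint")
    return ({(acc, owner, True) for (acc, owner) in open_accounts}
            | {(acc, owner, False) for (acc, owner) in closed_accounts})
```

Splitting and then merging returns the original relation, so neither design holds information the other lacks; which one to pick is exactly the "where do the semantics go" question raised above.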