The Forum for Discussion about The Third Manifesto and Related Matters

Please or Register to create posts and topics.

"And the moral is, don't use Excel as a scientific database ..."

PreviousPage 2 of 5Next
Quote from Dave Voorhis on January 26, 2020, 11:50 am
Quote from dandl on January 26, 2020, 7:23 am
Quote from AntC on January 26, 2020, 5:58 am

...

I'm so used to database thinking, I can't even see what mental hurdle it is you're talking about. So I guess I (or anybody else here) am entirely unsuited to design the Next Big Thing (TM).

Typical RDBMS have the opposite problem: you have to create the abstraction of columns and types before you can stick any data into it. Types are pervasive, and can really get in the way when you just want to play with data and add a bit of basic formatting.

That turns out to be easy to fix from an interaction point of view. You can let the user simply plunk data into an arbitrary table/sheet and not worry about types, at least at first. What goes on behind the scenes is a bit more complex, as the engine sometimes has to convert data from one type to another and it sometimes fails (for reasonable reasons) and you sometimes have to tell the user and help him deal with it by fixing the data or specifying a type. But I think that's solvable and results in usable interaction.

I agree. The product I have in mind would import arbitrary table-ish data, make a guess at meaningful types and let you refine the types and column attributes progressively to suit needs, all in a very concrete metaphor (concrete being the antithesis of abstract), and all without data loss (like converting MARCH20 into a date, you can't go back to the original text). I haven't found one that does that at all well.

Formatting is something else. As I suggested in a previous post, conflating formatting and data rapidly and inevitably becomes an abomination, no matter how convenient and superficially desirable it seems at first. Data and information ("information is data in formation") -- i.e., data entry and reports -- need to be kept separate and appreciated to be distinct, or we wind up back in spreadsheet hell.

I agree, although it actually isn't quite that simple. Data type in RDBMS speak is closely tied to storage strategy and users don't care about that. It's more useful to have 'types' such as: email, image, URL, list of values (eg red/yellow/green), ISBN, lookup (in another table), complex numbers, points (X,Y), values with units (1.5 kg, 42 degrees C), geolocation (lat/long), etc. These 'types' need things like regex for parsing and various formatting options. Example:

  • the label for this column is 'Local club websites'
  • the values in this column are intended to be URLs
  • they are stored, imported, exported and can be manipulated in formulae as text
  • they conform to this regex (those that do not conform will be highlighted in red)
  • they are formatted with this pattern (blue text, underlined, hover, clickable).

Again, I don't know any, but Excel is generally better than Access. Business apps write layers of code on top of SQL to do that. It should be simple, but it definitely isn't easy.

Andl - A New Database Language - andl.org
Quote from Dave Voorhis on January 26, 2020, 11:37 am
Quote from AntC on January 26, 2020, 5:58 am

I'm so used to database thinking, I can't even see what mental hurdle it is you're talking about. So I guess I (or anybody else here) am entirely unsuited to design the Next Big Thing (TM).

That's it. You're used to database thinking, I'm used to database thinking, and we've all been doing technical computery stuff for nigh unto half a century, give or take. Therefore, an awful lot of what seems intuitively obviously common knowledge to us is mind-bendingly complicated, confusing, and abstract to the beginner or average non-technical user.

And an awful lot of what we think is right because obviously we're the experts, is totally and unequivocally wrong when we try to impose our database rules on a different domain. We have mostly had enough exposure to common business practices to avoid dropping clangers when dealing with accounts, HR and marketing, but what do we know about the data requirements and processes inherent in lab work, mineral exploration, aerospace, chemistry, genetics, etc? These are highly technical users with sophisticated needs operating at an expert level, and our generic database and programming tools look naive, primitive and counter-intuitive to their eyes. And they're right.

Andl - A New Database Language - andl.org
Quote from dandl on January 27, 2020, 12:02 am
Quote from Dave Voorhis on January 26, 2020, 11:50 am
Quote from dandl on January 26, 2020, 7:23 am
Quote from AntC on January 26, 2020, 5:58 am

...

I'm so used to database thinking, I can't even see what mental hurdle it is you're talking about. So I guess I (or anybody else here) am entirely unsuited to design the Next Big Thing (TM).

Typical RDBMS have the opposite problem: you have to create the abstraction of columns and types before you can stick any data into it. Types are pervasive, and can really get in the way when you just want to play with data and add a bit of basic formatting.

That turns out to be easy to fix from an interaction point of view. You can let the user simply plunk data into an arbitrary table/sheet and not worry about types, at least at first. What goes on behind the scenes is a bit more complex, as the engine sometimes has to convert data from one type to another and it sometimes fails (for reasonable reasons) and you sometimes have to tell the user and help him deal with it by fixing the data or specifying a type. But I think that's solvable and results in usable interaction.

I agree. The product I have in mind would import arbitrary table-ish data, make a guess at meaningful types and let you refine the types and column attributes progressively to suit needs, all in a very concrete metaphor (concrete being the antithesis of abstract), and all without data loss (like converting MARCH20 into a date, you can't go back to the original text). I haven't found one that does that at all well.

Formatting is something else. As I suggested in a previous post, conflating formatting and data rapidly and inevitably becomes an abomination, no matter how convenient and superficially desirable it seems at first. Data and information ("information is data in formation") -- i.e., data entry and reports -- need to be kept separate and appreciated to be distinct, or we wind up back in spreadsheet hell.

I agree, although it actually isn't quite that simple. Data type in RDBMS speak is closely tied to storage strategy and users don't care about that. It's more useful to have 'types' such as: email, image, URL, list of values (eg red/yellow/green), ISBN, lookup (in another table), complex numbers, points (X,Y), values with units (1.5 kg, 42 degrees C), geolocation (lat/long), etc. These 'types' need things like regex for parsing and various formatting options. Example:

  • the label for this column is 'Local club websites'
  • the values in this column are intended to be URLs
  • they are stored, imported, exported and can be manipulated in formulae as text
  • they conform to this regex (those that do not conform will be highlighted in red)
  • they are formatted with this pattern (blue text, underlined, hover, clickable).

Again, I don't know any, but Excel is generally better than Access. Business apps write layers of code on top of SQL to do that. It should be simple, but it definitely isn't easy.

Excel better for your bullet points?

I wouldn't attempt it in Excel; it would be an exercise in repetition and futility.

In Access, fairly straightforward. Not as types, unfortunately, but at least very manageable.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

My bullet points are a single example of the kind of features I would like to see. There are at least dozens more, perhaps hundreds.

Yes, in Excel you can do all this and more, but you have to to it laboriously, manually. And that's what people do, because there is nothing better.

In Access you can do these things by writing code. You work on the abstract representation of  your data model, not on the data itself.

So if the prerequisite is no code and no abstraction, working directly on the data, Excel (and its spreadsheet clones) is the only desktop choice (that I know of).

No code basically means dealing with concrete things, not symbolic representations of things. It allows formulae, as long as they are concrete, but it does not allow symbolic references. It means working directly on data, not working on an abstraction of data. It means applying the same mental model as other creative activities, such as drawing, painting, designing, composing music. It might allow reuse of components, assemblies, tools and so on, but only as concrete things. It might allow extensions to be scripted, but it doesn't require a programmer to use it.

Why is there no AutoCAD or Photoshop or Sibelius or Maya for data? Just Excel. It can't be that hard, can it?

Andl - A New Database Language - andl.org
Quote from dandl on January 25, 2020, 9:36 am

[Actually starts at about 11:30).]

No, there was no evidence presented that this affected the results. Perhaps it just didn't matter, but it certainly is sloppy.

Most of this talk is about spreadsheets containing calculations and various kinds of mistakes in them. There is one throwaway about 'why use Excel to store data if there is no calculation'?

My question is: if you've got some data in tabular format (a CSV) what would you use to maintain it other than Excel? What else is there? Where is the data storage product that is 'better than Excel'?

 

 

You know damn well where they are.

(Hint : If "someone's got data in tabular format", how did that come to be "data" and how did that come to be "in tabular format" ?  Answer : because someone spent time upfront thinking about which sets-of-values (domains !) those "data" values came from, as well as about the meaning conveyed by that data (i.e. predicates !).  So any system that has some notion of "predicates" (even if only from an extreme far distance, e.g. SQL systems) and of "data types" is, by definition, "better than excel".  You've said it yourself in a subsequent post.  Excel deliberately loses both the predicates and types/domains aspect of the data.)

Quote from dandl on January 28, 2020, 2:41 am

My bullet points are a single example of the kind of features I would like to see. There are at least dozens more, perhaps hundreds.

Yes, in Excel you can do all this and more, but you have to to it laboriously, manually. And that's what people do, because there is nothing better.

In Access you can do these things by writing code. You work on the abstract representation of  your data model, not on the data itself.

In both Excel and Access you do these things by writing code. It's the usual spreadsheet expressions and dialog box settings in Excel, VBA and dialog box settings in reports in Access. The effort is notionally the same, because you're doing the same things.

What tends to be a conceptual hurdle for many users -- at least until they've seen it done (though sometimes after, if they reject the notion) -- is that in Access you put the data in tables and the formatting in reports. Excel conflates the two inseparably, which is fine (sort of...) until you need two reports on one set of data.

Or one report on that set of data over there. Then things fall apart.

So if the prerequisite is no code and no abstraction, working directly on the data, Excel (and its spreadsheet clones) is the only desktop choice (that I know of).

No code basically means dealing with concrete things, not symbolic representations of things. It allows formulae, as long as they are concrete, but it does not allow symbolic references. It means working directly on data, not working on an abstraction of data. It means applying the same mental model as other creative activities, such as drawing, painting, designing, composing music. It might allow reuse of components, assemblies, tools and so on, but only as concrete things. It might allow extensions to be scripted, but it doesn't require a programmer to use it.

Why is there no AutoCAD or Photoshop or Sibelius or Maya for data? Just Excel. It can't be that hard, can it?

Excel (and other niche spreadsheet products) are it, if you don't mind the inescapable (and rather dire) limitations of conflating data and presentation.

They have to be separate.

Conceptually, Access does this quite well. It just does it awkwardly. You can't define new column types (and I certainly have no objection to providing pre-built, user-friendly types), the workflow is clunky, changing things is often difficult, and even the good bits are either unrefined or seem half finished (which they probably are), but the essential concept -- separation of data and presentation -- is sound. It needs to be more agile, more fluid, more flexible, more powerful -- and modernised -- but the concept is right.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

(Hint : If "someone's got data in tabular format", how did that come to be "data" and how did that come to be "in tabular format" ?  Answer : because someone spent time upfront thinking about which sets-of-values (domains !) those "data" values came from, as well as about the meaning conveyed by that data (i.e. predicates !).  So any system that has some notion of "predicates" (even if only from an extreme far distance, e.g. SQL systems) and of "data types" is, by definition, "better than excel".  You've said it yourself in a subsequent post.  Excel deliberately loses both the predicates and types/domains aspect of the data.)

You really don't understand this kind of user at all, do you?

They're currently using Excel, so of course the data is in tabular format. And no, while you can attach labels like 'domains' or 'types' or 'predicates' to things,, you're over-thinking the process. The data might be the wine in my cellar, the sheet music for my band, the login sessions on my router, the train timetable, the antibiotic sensitivities for my lab cultures, my daily temperature and rainfall records, etc, etc. It's just stuff I care about that fits into a tabular format, plus annotations and the odd computation.

Data is everywhere. I want to collect some, keep track of stuff, manipulate it, analyse it. Concrete data first, then recognise patterns, then capture those as rules, then figure out how to visualise stuff as an aide to understanding. Excel does it all, badly. Code-based products like Access can't do the first part at all.

Andl - A New Database Language - andl.org
Quote from dandl on January 29, 2020, 7:20 am

Code-based products like Access can't do the first part at all.

Which is why we have disciplines like conceptual modeling.  One example of which is FCO-IM, one of whose core features is that data get exemplified inside the models themselves.

Excel (and nephews) could be a perfectly suitable tool for exemplifying data before data structures are formalized and defined abstractly.  That is not the topic OP addressed.  OP adressed the topic of ***permanently using a spreadsheet as the database***, i.e. ***not moving on*** after you've started out with excel (or nephews), i.e. the attitude of "I'll just keep using my spreadsheet so I can continue muddling around for ever and never have to really think".

Quote from Dave Voorhis on January 28, 2020, 12:42 pm
Quote from dandl on January 28, 2020, 2:41 am

My bullet points are a single example of the kind of features I would like to see. There are at least dozens more, perhaps hundreds.

Yes, in Excel you can do all this and more, but you have to to it laboriously, manually. And that's what people do, because there is nothing better.

In Access you can do these things by writing code. You work on the abstract representation of  your data model, not on the data itself.

In both Excel and Access you do these things by writing code. It's the usual spreadsheet expressions and dialog box settings in Excel, VBA and dialog box settings in reports in Access. The effort is notionally the same, because you're doing the same things.

What tends to be a conceptual hurdle for many users -- at least until they've seen it done (though sometimes after, if they reject the notion) -- is that in Access you put the data in tables and the formatting in reports. Excel conflates the two inseparably, which is fine (sort of...) until you need two reports on one set of data.

Or one report on that set of data over there. Then things fall apart.

It's a useful distinction, but that's not all. I don't want to debate what coding is, except to observe that Access is a lot further along the spectrum than Excel. The distinction I'm making is earlier than yours: it's about separating the data itself from the structure imposed on it. Excel allows you to work with raw data without imposing names, types and other attributes on it: data first; then patterns and structure; then manipulation; then presentation.

So if the prerequisite is no code and no abstraction, working directly on the data, Excel (and its spreadsheet clones) is the only desktop choice (that I know of).

No code basically means dealing with concrete things, not symbolic representations of things. It allows formulae, as long as they are concrete, but it does not allow symbolic references. It means working directly on data, not working on an abstraction of data. It means applying the same mental model as other creative activities, such as drawing, painting, designing, composing music. It might allow reuse of components, assemblies, tools and so on, but only as concrete things. It might allow extensions to be scripted, but it doesn't require a programmer to use it.

Why is there no AutoCAD or Photoshop or Sibelius or Maya for data? Just Excel. It can't be that hard, can it?

Excel (and other niche spreadsheet products) are it, if you don't mind the inescapable (and rather dire) limitations of conflating data and presentation.

They have to be separate.

Conceptually, Access does this quite well. It just does it awkwardly. You can't define new column types (and I certainly have no objection to providing pre-built, user-friendly types), the workflow is clunky, changing things is often difficult, and even the good bits are either unrefined or seem half finished (which they probably are), but the essential concept -- separation of data and presentation -- is sound. It needs to be more agile, more fluid, more flexible, more powerful -- and modernised -- but the concept is right.

Access is not a good model for a 'better Excel'. I just tried it out again, and it really is clunky.

There are many Excel alternatives out there, but all the ones I know are spreadsheets and stick close to the Excel model. Mostly they target personal use, but a few do target a separation of developer and user.

There are many Access alternatives out there, but all the ones I know aim at app development. They all target the separation of developer and user; I don't know any primarily intended for personal use.

 

Andl - A New Database Language - andl.org
Quote from dandl on January 29, 2020, 7:55 am
Quote from Dave Voorhis on January 28, 2020, 12:42 pm
Quote from dandl on January 28, 2020, 2:41 am

My bullet points are a single example of the kind of features I would like to see. There are at least dozens more, perhaps hundreds.

Yes, in Excel you can do all this and more, but you have to to it laboriously, manually. And that's what people do, because there is nothing better.

In Access you can do these things by writing code. You work on the abstract representation of  your data model, not on the data itself.

In both Excel and Access you do these things by writing code. It's the usual spreadsheet expressions and dialog box settings in Excel, VBA and dialog box settings in reports in Access. The effort is notionally the same, because you're doing the same things.

What tends to be a conceptual hurdle for many users -- at least until they've seen it done (though sometimes after, if they reject the notion) -- is that in Access you put the data in tables and the formatting in reports. Excel conflates the two inseparably, which is fine (sort of...) until you need two reports on one set of data.

Or one report on that set of data over there. Then things fall apart.

It's a useful distinction, but that's not all. I don't want to debate what coding is, except to observe that Access is a lot further along the spectrum than Excel. The distinction I'm making is earlier than yours: it's about separating the data itself from the structure imposed on it. Excel allows you to work with raw data without imposing names, types and other attributes on it: data first; then patterns and structure; then manipulation; then presentation.

So if the prerequisite is no code and no abstraction, working directly on the data, Excel (and its spreadsheet clones) is the only desktop choice (that I know of).

No code basically means dealing with concrete things, not symbolic representations of things. It allows formulae, as long as they are concrete, but it does not allow symbolic references. It means working directly on data, not working on an abstraction of data. It means applying the same mental model as other creative activities, such as drawing, painting, designing, composing music. It might allow reuse of components, assemblies, tools and so on, but only as concrete things. It might allow extensions to be scripted, but it doesn't require a programmer to use it.

Why is there no AutoCAD or Photoshop or Sibelius or Maya for data? Just Excel. It can't be that hard, can it?

Excel (and other niche spreadsheet products) are it, if you don't mind the inescapable (and rather dire) limitations of conflating data and presentation.

They have to be separate.

Conceptually, Access does this quite well. It just does it awkwardly. You can't define new column types (and I certainly have no objection to providing pre-built, user-friendly types), the workflow is clunky, changing things is often difficult, and even the good bits are either unrefined or seem half finished (which they probably are), but the essential concept -- separation of data and presentation -- is sound. It needs to be more agile, more fluid, more flexible, more powerful -- and modernised -- but the concept is right.

Access is not a good model for a 'better Excel'. I just tried it out again, and it really is clunky.

Yes, it is.

But it's better than Excel when you want multiple views of data, or have external data, or need to combine multiple data sources. The Excel "spreadsheet" model is only really good as the final endpoint of a data processing chain. It's good for paddling about in final result data, or for producing a presentation view of final result data. As such, it's a good custom report generator -- of a particular kind of custom report -- but it's a terrible data processor. It's nigh unto worthless for joining and filtering and manipulating data as a midpoint in a processing chain for consumption by something else, like (but not limited to) other Excel spreadsheets.

Access is a much, much, much better data processor than Excel. It's also good for different kinds of reports; ones that don't fit the "spreadsheet" model. For most such use cases, that involves no more code than spreadsheets, where "code" means expressions or fomulae. Spreadsheet formulae are code, of course.

What Access does involve is a very different approach from Excel, which can be a steep learning curve -- and a big conceptual hurdle for the uninitiated -- particularly if you're used to Excel.

There are many Excel alternatives out there, but all the ones I know are spreadsheets and stick close to the Excel model. Mostly they target personal use, but a few do target a separation of developer and user.

There are many Access alternatives out there, but all the ones I know aim at app development. They all target the separation of developer and user; I don't know any primarily intended for personal use.

True. There is Excel on one side with a cluster of less-known spreadsheet and spreadsheet-like tools around it, and there is Access on the other side with a cluster of mostly well-known application development tools around it.

But they are opposite ends of a wide spectrum. There is a vast, empty space between them.

What I'm working on is something that stands in that empty space, roughly halfway between Excel and Access. It's notionally closer to the Access side than the Excel side, but it has clear influences from both, and there's still a very, very large empty space on each side. It isn't an Excel clone, and it isn't an Access clone. It doesn't look or work like either one.

My target market is me, though I hope power users and developers will like it too. I don't think casual users -- or hardcore Excel users -- will shift. I think I have a chance with developers and power users who are as frustrated as I am with Excel, Access, and the usual database application development tools.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
PreviousPage 2 of 5Next