The Forum for Discussion about The Third Manifesto and Related Matters

You need to log in to create posts and topics.

Brave New World databaspreadsheet

12

(The previous 'Pre-relational database models' thread had wandered way off topic; there's nothing TTMish here.)

Quote from Dave Voorhis on August 6, 2019, 2:47 pm
>> Quote from johnwcowan on August 6, 2019, 2:27 pm
>> Quote from AntC on August 6, 2019, 11:26 am
>> Aside: eh? Spreadsheets are the toolset for ensuring the mediocre do foul things up, and even the very able can barely avoid fouling up.

>> See Lotus Improv and its current successor, Quantrix Modeler, for spreadsheets done right. A precis:

Yes, MS Excel and its progenitors (Lotus 1-2-3, Visicalc) and main general-purpose competitors (Google Docs Spreadsheet, Open Office spreadsheet, Numbers for MacOS, etc.) demonstrate an awful lot of unpleasant "we've always done it that way" baggage.

Excel is my go-to example of bloatware. Even when I was an intensive user, there were gazillions of tool buttons/menu options I never used and couldn't imagine any use for. (Of course someone somewhere had a use for them, but equally couldn't imagine a use for the options I used.) Possible Excel could offer the defense of backwards compatibility, but all the clones offer exactly the same dysfunctional user interface.

Excel and clones is also my go-to example of how OO/encapsulation hasn't improved user interface design: why couldn't those clumps of functionality be encapsulated and be made context-sensitive? Such that you only saw the gazillions of options in places (groups of cells) where they made sense.

Alternatives pop up every so often and either carve out a specialist niche -- like Quantrix in finance -- or vanish without a trace. I've been pointed to a number of them over the years -- all touted as being the next big thing since Excel; some touted as being poised to replace Excel -- and then... Nothing.

A significant proportion of the bloat is formatting (including graphing) -- which has nothing to do with the 'business' of spreadsheets -- that is, manipulating values. And formatting stuff in grids you'd think is a presentation layer not specific to spreadsheets: it applies equally to presenting database results in grids or grids/tables/graphs in Word processing or presenting data from HTML/XML/JSON. If encapsulation/parametricity is so wonderful, why isn't there a generalised templating layer that takes a 'raw' grid from any source and pretties it up?

I suppose ... do cross-totalling/subtotalling count as formatting or manipulating values? Then how about expressing this cell's value as a % deviation from the mean? Then how about conditional highlighting of wild variations? Where does manipulating values end and 'mere' formatting begin? And we'd want to use the same syntax for formulas; and ... we're already too far down the slippery slope; having forgotten to lash on an anchor point at the top. Where did all the software design discipline go to?

Of course, that doesn't stop me from trying -- my post-Rel product-in-progress draws some inspiration from spreadsheets (but don't read too much into that), at least in terms of their intuitive agility, but (like Quantrix) without some of the obvious flaws. Hopefully, once it's further along the development cycle it will find a niche rather than vanishing without a trace.

There's a family of bolt-ons to Excel that try to reverse engineer your 'database schema' (in effect) by examining the structure of formulas. Of course they fall flat on their face when they come across the random cell where someone's poked a hard-coded value because sometime 3 years ago they couldn't get the formula to produce the right value.

The problem with tools like Quantrix, or even 'array formulas' is they need too much advance planning of your structure. Where "too much" = any at all. As opposed to just throwing values in cells and inserting columns/rows later. This 'informality' is often blamed on the accountancy origins of spreadsheets; but that's misplaced: if you have ruled columns/rows on paper you can't go shifting something right/down.

So now database tools like MS Access are trying to become equally unstructured: start putting your data in a grid; shift it around; sneakily guess at the 'schema' afterwards.

I'm dubious anybody's going to find a "niche". If you can approximate it in Excel, that's what peole will do. And then put such intellectual investment into making it behave -- in effect work-rounds for Excel's dysfunctional user interface/lack of structure -- that any other tool offering to do it 'properly' will seem alien/too much of a learning curve. And too expensive: Excel is a 'sunk cost' so effectively free. That's why it's such a bargain when the highest-paid executives in the company put hours into polishing their formatting rather than their proper day jobs.

Quote from AntC on August 8, 2019, 6:06 am

(The previous 'Pre-relational database models' thread had wandered way off topic; there's nothing TTMish here.)

Quote from Dave Voorhis on August 6, 2019, 2:47 pm
>> Quote from johnwcowan on August 6, 2019, 2:27 pm
>> Quote from AntC on August 6, 2019, 11:26 am
>> Aside: eh? Spreadsheets are the toolset for ensuring the mediocre do foul things up, and even the very able can barely avoid fouling up.

>> See Lotus Improv and its current successor, Quantrix Modeler, for spreadsheets done right. A precis:

Yes, MS Excel and its progenitors (Lotus 1-2-3, Visicalc) and main general-purpose competitors (Google Docs Spreadsheet, Open Office spreadsheet, Numbers for MacOS, etc.) demonstrate an awful lot of unpleasant "we've always done it that way" baggage.

...

Of course, that doesn't stop me from trying -- my post-Rel product-in-progress draws some inspiration from spreadsheets (but don't read too much into that), at least in terms of their intuitive agility, but (like Quantrix) without some of the obvious flaws. Hopefully, once it's further along the development cycle it will find a niche rather than vanishing without a trace.

There's a family of bolt-ons to Excel that try to reverse engineer your 'database schema' (in effect) by examining the structure of formulas. Of course they fall flat on their face when they come across the random cell where someone's poked a hard-coded value because sometime 3 years ago they couldn't get the formula to produce the right value.

The problem with tools like Quantrix, or even 'array formulas' is they need too much advance planning of your structure. Where "too much" = any at all. As opposed to just throwing values in cells and inserting columns/rows later. This 'informality' is often blamed on the accountancy origins of spreadsheets; but that's misplaced: if you have ruled columns/rows on paper you can't go shifting something right/down.

So now database tools like MS Access are trying to become equally unstructured: start putting your data in a grid; shift it around; sneakily guess at the 'schema' afterwards.

I'm dubious anybody's going to find a "niche". If you can approximate it in Excel, that's what peole will do. And then put such intellectual investment into making it behave -- in effect work-rounds for Excel's dysfunctional user interface/lack of structure -- that any other tool offering to do it 'properly' will seem alien/too much of a learning curve. And too expensive: Excel is a 'sunk cost' so effectively free. That's why it's such a bargain when the highest-paid executives in the company put hours into polishing their formatting rather than their proper day jobs.

Quantrix has found a niche. Some do. Many don't.

My target is not the happy Excel user but the unhappy programmer who wants to be able to easily churn out database-driven applications (for personal desktop, workgroup, intranet or internet, on all the usual Windows/Mac/Linux/Browser/mobile platforms) with something notionally akin to the ease of creating Excel spreadsheets, but without all the irritations and limitations -- which are myriad -- of actually using Excel spreadsheets to do it.

Or, without all the pitfalls (myriad, again) of using Microsoft Access to do it.

Or, without all the fine-grained build-a-house-out-of-toothpicks effort of using C#/Java/PHP/Python/Node backend + Javascript-framework-du-jour frontend to do it, and without getting locked into an obscure and proprietary toolchain. Per my other post, think Java amplified or Java for data. (Q: Why Java? A: Popular, cross-platform, vast libraries of preexisting code, almost universally accepted -- or at least tolerated/endured -- by enterprise developers.)

I'd call it "The Datasheet" if the electronics engineering world hadn't pretty much locked the term down since the 1920's. Though I might, anyway.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on August 8, 2019, 6:41 am

My target is not the happy Excel user but the unhappy programmer who wants to be able to easily churn out database-driven applications (for personal desktop, workgroup, intranet or internet, on all the usual Windows/Mac/Linux/Browser/mobile platforms) with something notionally akin to the ease of creating Excel spreadsheets, but without all the irritations and limitations -- which are myriad -- of actually using Excel spreadsheets to do it.

I should point out that my model for that programmer is me. I want to use it to easily churn out database-driven applications. If I can do that, and make money creating bespoke database-driven applications (which I've done before) using it, then my goal is achieved. If others can use it that way too, that's a bonus.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on August 8, 2019, 6:46 am
Quote from Dave Voorhis on August 8, 2019, 6:41 am

My target is not the happy Excel user but the unhappy programmer who wants to be able to easily churn out database-driven applications (for personal desktop, workgroup, intranet or internet, on all the usual Windows/Mac/Linux/Browser/mobile platforms) with something notionally akin to the ease of creating Excel spreadsheets, but without all the irritations and limitations -- which are myriad -- of actually using Excel spreadsheets to do it.

I should point out that my model for that programmer is me. I want to use it to easily churn out database-driven applications. If I can do that, and make money creating bespoke database-driven applications (which I've done before) using it, then my goal is achieved. If others can use it that way too, that's a bonus.

The funny thing is we've been trying to do that for about 40 years (that I can remember). Just simple 'Window on Data': forms, tables and reports. Basic form-filling, data grid, modest data validation, limited links between tables, basic reporting with totals and sub-totals, covers something like 90% of what most people want. Add to that the ability to plug in some custom code to deal with special requirements and we're done.

Why is it so hard? Good luck if you can do it!

Andl - A New Database Language - andl.org
Quote from dandl on August 8, 2019, 8:05 am
Quote from Dave Voorhis on August 8, 2019, 6:46 am
Quote from Dave Voorhis on August 8, 2019, 6:41 am

My target is not the happy Excel user but the unhappy programmer who wants to be able to easily churn out database-driven applications (for personal desktop, workgroup, intranet or internet, on all the usual Windows/Mac/Linux/Browser/mobile platforms) with something notionally akin to the ease of creating Excel spreadsheets, but without all the irritations and limitations -- which are myriad -- of actually using Excel spreadsheets to do it.

I should point out that my model for that programmer is me. I want to use it to easily churn out database-driven applications. If I can do that, and make money creating bespoke database-driven applications (which I've done before) using it, then my goal is achieved. If others can use it that way too, that's a bonus.

The funny thing is we've been trying to do that for about 40 years (that I can remember). Just simple 'Window on Data': forms, tables and reports. Basic form-filling, data grid, modest data validation, limited links between tables, basic reporting with totals and sub-totals, covers something like 90% of what most people want. Add to that the ability to plug in some custom code to deal with special requirements and we're done.

Why is it so hard? Good luck if you can do it!

I think it's not so much that this is particularly hard, but that everything in software looks superficially simple but turns out to be really, really difficult.

Everything. There is no easy in it, anywhere.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from AntC on August 8, 2019, 6:06 am

A significant proportion of the bloat is formatting (including graphing) -- which has nothing to do with the 'business' of spreadsheets -- that is, manipulating values.

Studies show that the actual business of spreadsheets is mostly a convenient UI for table editing and display.  The last spreadsheet I wrote has no numbers (except ordinals) anywhere: it's about characters in a story.   The rows are chapter-plus-scene numbers, the columns are, for each character, a triplet of "What do they do?", "What happens to them?" and "What do they learn?"  In Quantrix I'd have made it four-dimensional (chapter, scene, character, questions), and I would have been able to rearrange it on the fly to group or paginate so as to get the four dimensions into two.  The next one will probably have no numbers either: it has rows denoting concepts and several columns denoting environments, and the cells contain the name used for that concept in that environment.

That said, if I were to develop an open source sort of Quantrix, it would probably defer all formatting to a conventional spreadsheet like Excel or LibreOffice.

I suppose ... do cross-totalling/subtotalling count as formatting or manipulating values? Then how about expressing this cell's value as a % deviation from the mean? Then how about conditional highlighting of wild variations?

I think that's straightforward.  Using attribute to mean row, column, or higher-dimensional counterpart, conditional formatting is a series of rules of the form "If attribute P is greater than or equal to L_n and less than U_n, then apply <some format> to the corresponding cells of attribute Q (which may be the same as P).  That pushes all the computation into the spreadsheet, is easy to represent as a UI, and requires only trivial comparisons of a cell with two fixed values at display time, no generalized formula engine.  Q can of course be hidden if its value is not interesting.

The problem with tools like Quantrix, or even 'array formulas' is they need too much advance planning of your structure.

I wouldn't have had to plan the above structure that much.  I started with a basic idea of "Who's in each scene?" and went on to add the chapter and question pseudo-dimensions as I needed them.  It was very annoying to have to make sure that I always created three columns for each new character; adding a new character should just be adding a new dimension, period.  Can we say "skeuomorphism sucks"?

Quote from johnwcowan on August 8, 2019, 2:00 pm
Quote from AntC on August 8, 2019, 6:06 am

A significant proportion of the bloat is formatting (including graphing) -- which has nothing to do with the 'business' of spreadsheets -- that is, manipulating values.

Studies show that the actual business of spreadsheets is mostly a convenient UI for table editing and display.

Yes. Experience from my erstwhile workplace suggested that spreadsheets were almost entirely about arranging text in columns, and sometimes rows, where a word processor would have been too confusing or too much work. The admin team didn't even call them spreadsheets. Spreadsheet documents were called "trackers".  E.g., "We've sent you a tracker with your new students. Please assign them personal tutors immediately and return the tracker by 2pm tomorrow or your line manager will be notified."

I've no idea how that terminology started. Referring to a "tracker" as a spreadsheet in front of the right (wrong?) people drew dirty looks. Trackers sometimes contained formulae, but it wasn't unusual to see columns that could have held formulae filled in manually, presumably calculated using a pocket calculator or another spreadsheet (each set of values and operators typed individually into the same cell), or perhaps on paper.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from johnwcowan on August 8, 2019, 2:00 pm
Quote from AntC on August 8, 2019, 6:06 am

Studies show that the actual business of spreadsheets is mostly a convenient UI for table editing and display.  The last spreadsheet I wrote has no numbers (except ordinals) anywhere: it's about characters in a story.   The rows are chapter-plus-scene numbers, the columns are, for each character, a triplet of "What do they do?", "What happens to them?" and "What do they learn?"

Interesting. I use OneNote to do that exact task. I find it a far better table editor than Excel, and it exports/imports to Word/Excel as needed.

What do you use where calculation really is important, but the data is personal (no shared editing required)? Examples:

  • a log book for the car to make mileage claims
  • training records for a half-marathon
  • Scientific/engineering reference tables (energy density, efficiency of lighting, area required for power generating plant to name but a few I keep)
  • investment return on property (linked tables)

I use Excel, but it's not good when the task is mostly adding new records and/or there are linked tables.

Andl - A New Database Language - andl.org

I'm having visions of an excel (/office calc) where only the file-open and file-save are different.

File-Open connects to a database, and displays a "workbook" with as many tabs as the database has tables (oops relvars).  Or perhaps the Open... dialog allows the user to select which ta...relvars to display.  Perhaps the Open... dialog also checks the read authority the user has on each relvar and won't even show the ones for which he doesn't.

If the policy on the database is such that all changes must go through "stored procedures" (not an uncommon policy in SQL environments) then the sheets are shown in read-only and editing is not allowed.

File-Save writes all unsaved edits to the database.  As a consequence, constraints are only checked at save time.  Hmmmmmm.  And using optimistic locking on the tuple value of course, so File-Save must now be prepared to handle rejections from the DBMS.  But that still leaves my formatting unsaved.  So I'd guess I'd still write that in an xlsx-style workbook file.  So now there's a dependency between the workbook file and the database which requires proper maintenance (and of course the File-Open dialog must allow to pick the formatting workbook from local store too).  Of course the formatting info cannot be of the kind "row 7 has height xyz", but it must be of the kind "the row whose abc key value is pqr, has height xyz".  My idea is beginning to fall apart.

Database Listeners delete rows in the spreadsheet and change cell values as updates are done to the database by other users, and also if such updates follow from CASCADE DELETES and such initiated by the spreadsheet user.  That requires unrestricted updatable view technology, which is the same as CREATE ASSERTION technology, which is not broadly offered by DBMS products.  Also, a DELETE to be reflected in the spreadsheet might clash with an edit the spreadsheet user has pending.  I think I'll just stop dreaming.

Quote from dandl on August 9, 2019, 12:36 am
Quote from johnwcowan on August 8, 2019, 2:00 pm
Quote from AntC on August 8, 2019, 6:06 am

Studies show that the actual business of spreadsheets is mostly a convenient UI for table editing and display.  The last spreadsheet I wrote has no numbers (except ordinals) anywhere: it's about characters in a story.   The rows are chapter-plus-scene numbers, the columns are, for each character, a triplet of "What do they do?", "What happens to them?" and "What do they learn?"

Interesting. I use OneNote to do that exact task. I find it a far better table editor than Excel, and it exports/imports to Word/Excel as needed.

What do you use where calculation really is important, but the data is personal (no shared editing required)? Examples:

  • a log book for the car to make mileage claims
  • training records for a half-marathon
  • Scientific/engineering reference tables (energy density, efficiency of lighting, area required for power generating plant to name but a few I keep)
  • investment return on property (linked tables)

I use Excel, but it's not good when the task is mostly adding new records and/or there are linked tables.

I used to use MS Access for this sort of thing. Now I use Rel.

Both are fine but lack agility. That's a big part of what I'm trying to address with my datasheet thingy.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
12