The Forum for Discussion about The Third Manifesto and Related Matters


And the moral is, Knime is better than Excel but it sure ain't relational...

After playing with Knime for a while (user level and internals), I make the following observations.

  1. It's big, it can do heaps of stuff, lots of people have been working on it/with it for quite a while.
  2. As far as data analytics goes, it can probably do just about everything you might otherwise do in Excel.
  3. It's clunky to use. The UI could do with a serious rethink. It's based on Eclipse, but uses a mix of SWT and Swing/AWT.
  4. It's the most powerful implementation of a dataflow programming product I've used. The pipes carry tables of records with columns, the nodes generate, transform, combine and display them. The possibilities are endless.
  5. It's hopeless on relational stuff. Perhaps there are things I haven't found yet, but if so they're well hidden.
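To make the contrast between points 4 and 5 concrete, here is a minimal sketch in plain Python. It is not KNIME's API: every name in it (`source`, `row_filter`, `to_relation`, `natural_join`) is hypothetical, and it assumes a "table" is simply a list of rows that may contain duplicates, while a "relation" is a set of tuples. Under those assumptions, dataflow nodes pass tables along happily, but the relational view needs duplicate elimination and a join on shared attribute names.

```python
# Hypothetical sketch only: none of these names come from KNIME.
# A "table" is a list of dict rows (duplicates and order are allowed).
# A "relation" is a frozenset of rows (no duplicates, no order).

def source():
    """Generate node: emits a table of records."""
    return [
        {"city": "Perth", "sales": 10},
        {"city": "Perth", "sales": 10},  # duplicate row: legal in a table
        {"city": "Hobart", "sales": 7},
    ]

def row_filter(table, predicate):
    """Transform node: keeps the rows that satisfy the predicate."""
    return [row for row in table if predicate(row)]

def to_relation(table):
    """Relational view: a set of tuples, so duplicate rows collapse."""
    return frozenset(frozenset(row.items()) for row in table)

def natural_join(r, s):
    """Join two relations on every attribute name they share."""
    out = set()
    for a in r:
        da = dict(a)
        for b in s:
            db = dict(b)
            common = da.keys() & db.keys()
            if all(da[k] == db[k] for k in common):
                out.add(frozenset({**da, **db}.items()))
    return frozenset(out)

regions = to_relation([
    {"city": "Perth", "region": "WA"},
    {"city": "Hobart", "region": "TAS"},
])

# Dataflow part: the pipe carries a table from node to node.
pipeline_out = row_filter(source(), lambda row: row["sales"] > 5)

# Relational part: duplicates disappear, then join on the shared "city" attribute.
joined = natural_join(to_relation(pipeline_out), regions)
```

The filtered table still carries the duplicate Perth row; only the conversion to a relation removes it, which is the kind of behaviour a relational node would have to supply.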

So that looks like a project!

Andl - A New Database Language - andl.org
Quote from dandl on February 15, 2020, 5:59 am

After playing with Knime for a while (user level and internals), I make the following observations.

  1. It's big, it can do heaps of stuff, lots of people have been working on it/with it for quite a while.
  2. As far as data analytics goes, it can probably do just about everything you might otherwise do in Excel.
  3. It's clunky to use. The UI could do with a serious rethink. It's based on Eclipse, but uses a mix of SWT and Swing/AWT.
  4. It's the most powerful implementation of a dataflow programming product I've used. The pipes carry tables of records with columns, the nodes generate, transform, combine and display them. The possibilities are endless.
  5. It's hopeless on relational stuff. Perhaps there are things I haven't found yet, but if so they're well hidden.

So that looks like a project!

Of course, KNIME is just one thin slice out of a big analytics pie. In KNIME's visual-pipeline tool space, Alteryx (https://www.alteryx.com/) is the big player (along with R, Julia and Python in the language space), but the real granddaddy of them all is SAS (https://www.sas.com/en_gb/home.html) -- a language (with visual front-ends) that almost nobody's heard of unless they work for a university, a Fortune 500 company, or as a statistician. It was invented in the 1960s, the inventor is still involved in running the company, and the core language has barely changed since its invention.

There's a SAS clone called WPS (https://www.worldprogramming.com/home) with associated tools that sometimes get mentioned in the same breath as KNIME and Alteryx, though I haven't tried them and don't know anyone who has, so I can't vouch for whether they're good or bad.

I was a professional SAS programmer in my early days and still occasionally interact with that world, so I can vouch for its capabilities. Its focus is heavily on the analysis end, with complex multivariate inferential statistics and visualisations handled in a few lines of code. It does the usual pipeline work too, but that's almost incidental.

The folks who work in this area often know SQL and the relational model very well -- it's at least part of their educational background and SQL is used a lot -- but strictly for storage and retrieval, necessary at the input end and maybe a required output target, but generally considered neither interesting nor of broader value.

Regarding the klunky old-skool Java user interfaces in KNIME & similar products... In the big enterprise world where these tools are mainly used, there's a curious lack of interest in UI/UX quality. On the contract I'm currently doing, I'm given some new (old) tool with a Java UI on an almost daily basis. They invariably have a rough Swing interface, but nobody cares -- these are industrial tools, meant to solve a problem, not to be pretty.

Each tool is used briefly for some specific purpose -- retrieve a record from system 'x' using Tool A, decode it from binary to show field 'y', then insert the value into the message bus cache using Tool B or whatever -- and then not used again for six months. Or six years. It's far more important that the thing just work when you fire it up again six years from now, on whatever OS you're using at the time, which has a roughly equal chance of being Windows, macOS, or Linux.

If you have to spend much time in a given tool, it's likely that someone will hammer out some code to automate that manual process, and then you won't have to do it at all.

Of course, that new automation will require an administration or monitoring utility, which will be written quickly in Java with a Swing UI so it runs easily on Windows, macOS or Linux. No effort will be made to make it look pretty because you're only meant to spend a minute or two in it every six years. If you have to spend more time than that in it, it's likely that someone will hammer out some code to automate that manual process...

And so on.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org