Table Nodes

john

22 Feb, 2020 05:06 AM

UPDATE: I have improved some of the nodes covered in this post. See further down in this thread for my update post with the improved nodes attached.

Tables are extremely useful for data analysis, visualization, and NodeBox projects in general. I included several table-related nodes in the first release of my node library. Here are four more that will be included in the next release:

DROP_COLS

Pass this node a table and a set of keys. You will get a modified table with the columns for those keys dropped OR all columns dropped except those keys. This is a convenient way to reduce an unwieldy CSV file with many columns down to a more manageable size. With large files this can also improve performance.

You can also use drop_cols to REORDER the columns of a table. To do this, get a list of the table's keys (using the keys node), reorder that list (e.g. sort it), then feed the original table and that list to the drop_cols node and set the option to "Keep".
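For readers who think in code, here is a rough plain-Python sketch of the idea. It is only an illustration, not how the node itself is built; the function name, the `keep` flag, and the sample table are assumptions made up for this example.

```python
def drop_cols(rows, keys, keep=False):
    """Return new rows with the given keys dropped, or with only those keys kept."""
    if keep:
        # Keep mode: output columns follow the order of `keys`,
        # which is what makes reordering possible.
        return [{k: row[k] for k in keys if k in row} for row in rows]
    return [{k: v for k, v in row.items() if k not in keys} for row in rows]


# Hypothetical table: one dict per row, keyed by column header.
table = [
    {"name": "Ann", "city": "Oslo", "age": 31},
    {"name": "Bob", "city": "Rome", "age": 45},
]

dropped = drop_cols(table, ["city"])
reordered = drop_cols(table, sorted(table[0].keys()), keep=True)
```

Here `dropped` loses the city column, while `reordered` keeps every column but lists them in sorted key order.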

CHANGE_COL

You can use this node to change the contents in one column of a table, change a column header (key name), or add a new column before or after any existing column.

There are many situations in which changing the contents of an existing column is useful: rounding values or changing their numerical precision, abbreviating or clarifying labels, changing case, making timestamps more readable, replacing values with percentages or some other function, etc.

Change_col can also help with sorting issues. Sometimes numerical columns do not sort properly (e.g. 10, 11, and 145 sort before 2). This happens when NodeBox interprets the column as strings instead of numbers. Using change_col you can convert the strings to numbers by doing a lookup on the column and feeding the result into a number or integer node; the table can then be sorted properly.
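The sorting problem is easy to reproduce outside NodeBox. This small Python sketch (sample values are made up) shows why text values sort differently from numbers:

```python
values = ["10", "2", "145", "11"]

# Strings are compared character by character, so "145" lands before "2".
as_text = sorted(values)

# Converting to integers first gives the expected numeric order.
as_numbers = sorted(int(v) for v in values)
```

After running this, `as_text` is `["10", "11", "145", "2"]` and `as_numbers` is `[2, 10, 11, 145]`.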

TRANSPOSE

The transpose node allows you to specify a range of columns and consolidate those into a single new "dimension". For example, you may have a table with separate columns with sales data for each month of the year. Using transpose, you can replace those 12 columns with a single new "month" column. New rows will be added for each month, repeating as subheadings after any existing row headings.

Transpose does not alter or drop any values in your table; it just moves them from columns into rows. This can be surprisingly useful. It allows you to reorganize your data and filter it in new ways. It can also be used with the summarize node to "pivot" a table from one view to another.
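As a rough illustration of the reshaping (a plain-Python sketch, not the node's actual internals), the example below folds two month columns into a single "month" column. The function signature, the `value_key` parameter, and the sample data are assumptions for this example.

```python
def transpose(rows, columns, new_key, value_key):
    """Move the listed columns into rows under a single new key."""
    result = []
    for row in rows:
        # Columns that are not being folded are repeated on every new row.
        fixed = {k: v for k, v in row.items() if k not in columns}
        for col in columns:
            result.append({**fixed, new_key: col, value_key: row[col]})
    return result


# Hypothetical table with one column per month.
sales = [
    {"product": "apples", "jan": 10, "feb": 12},
    {"product": "pears", "jan": 7, "feb": 9},
]

long_table = transpose(sales, ["jan", "feb"], "month", "sales")
```

The two input rows become four output rows, one per product and month, and every original value is still present.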

SUMMARIZE

Summarize is probably the most powerful and useful node in this collection. It summarizes an existing table by grouping together values in repeating categories and totaling them. It provides three types of totaling: sum, average and count. Sum is the normal form of totaling amounts. Average is appropriate for values like price or a person's height. Count can be used to simply count the occurrences of each row in the same grouping.

An example of count totaling is a character occurrence table showing the number of times each letter of the alphabet occurs in a given text. Enter your text into a characters node. Feed the output into both ports 1 and 2 of a table node; set port 0 to character,occurrences. Then feed that into a summarize node. Set Category key to character and Value key to occurrences (leave Partition key blank). Set Method to Count and leave Precision at 0. Voila! A character occurrence table with only three nodes.
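For comparison, the same kind of character occurrence table can be sketched in plain Python. This is only an analogy to the three-node network; the sample text is made up, and skipping non-letters is an assumption of this sketch rather than something the nodes are stated to do.

```python
from collections import Counter

text = "hello world"

# Count each letter; spaces and punctuation are skipped (an assumption here).
counts = Counter(ch for ch in text if ch.isalpha())

# Build a two-column table using the same keys as the example above.
table = [{"character": ch, "occurrences": n} for ch, n in sorted(counts.items())]
```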

Sum and Average totals require numeric values under the Value key in the source table. You can set the desired number of decimal points for these totals using the Precision setting. Count totals do not require numeric values. When counting you can set the Value key to any key in the source table other than the Category key.

The Partition key is optional. If you leave it blank you will get a two-column table with category labels in the first column and totals in the second. If you provide another key from your source table as a partition, it will create new columns for each distinct value under that key heading and partition the totals under each of those columns. Using partitions you can create many different summary tables from the same source table (see screenshot).
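To make the grouping logic concrete, here is a minimal Python sketch of a summarize-style function with an optional partition key. It is an illustration under stated assumptions, not the node's implementation: the function signature, the "total" column name used when no partition is given, and the sample orders are all made up. Unlike a real summary table, this sketch simply omits a partition column for a category that has no rows in it.

```python
from collections import defaultdict


def summarize(rows, category_key, value_key, partition_key=None,
              method="sum", precision=0):
    """Group rows by category (and optional partition) and total the values."""
    groups = defaultdict(list)
    for row in rows:
        part = row[partition_key] if partition_key else "total"
        groups[(row[category_key], part)].append(row[value_key])

    totals = {}
    for (cat, part), values in groups.items():
        if method == "count":
            result = len(values)
        elif method == "average":
            result = round(sum(values) / len(values), precision)
        else:  # "sum"
            result = round(sum(values), precision)
        # One output row per category; one column per partition value.
        totals.setdefault(cat, {category_key: cat})[part] = result
    return list(totals.values())


# Hypothetical source table.
orders = [
    {"region": "north", "quarter": "Q1", "amount": 10},
    {"region": "north", "quarter": "Q2", "amount": 20},
    {"region": "south", "quarter": "Q1", "amount": 5},
    {"region": "north", "quarter": "Q1", "amount": 4},
]

two_col = summarize(orders, "region", "amount")
by_quarter = summarize(orders, "region", "amount", partition_key="quarter")
averages = summarize(orders, "region", "amount", method="average", precision=1)
```

With no partition key, `two_col` has one total per region; with `quarter` as the partition, `by_quarter` gets a separate column for each distinct quarter.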

Summarize is useful in and of itself. It's a great way to gain a high-level understanding of any new data file you import. It allows you to pivot data into different views. But this node is also designed to be used to create charts (line charts, bar charts, area charts, etc.). I am currently working on a set of chart nodes that take summary tables as inputs. I hope to release these soon with the next update of my node library.

Please give these nodes a spin and let me know if you have any questions or find any bugs.

  1. 1 Posted by Alexander Gogl on 22 Feb, 2020 08:53 AM

    Wow, NodeBox is very flexible. Did you create the manipulation nodes using only NodeBox standard nodes, or did you use Python? I wonder if it is possible to use Python packages like pandas or numpy inside NodeBox. That would lift data processing to a whole new level.

  2. Support Staff 2 Posted by john on 22 Feb, 2020 11:34 AM

    Alexander,

    All of these nodes are subnetworks, created using only NodeBox standard nodes - you can open them up and see for yourself. If you check out my node library you will see that a half dozen or so of those nodes were built using Python.

    In theory you should be able to use packages like pandas or numpy from NodeBox as long as they are installed on the user's machine, but there are some wrinkles due to the older version of Jython NodeBox is built on. See this (old) note for more clues:

    http://support.nodebox.net/discussions/general-discussion/14716-pan...

    My preference is to do everything in pure NodeBox unless an external language is absolutely unavoidable. I do pretty much all my data analysis and visualization work using NodeBox and have built networks with thousands of nodes. NodeBox is definitely quirky - and is showing its age - but it is enormously flexible if you have the will to persevere.

    The chart nodes I am building are already starting to be quite useful. I can turn out some decent visualizations in minutes by snapping a few nodes together. I remember that you were doing some nice radar charts a few years ago. Are you still using NodeBox? What have you done with it lately?

    John

  3. Support Staff 3 Posted by john on 07 Mar, 2020 09:59 PM

    UPDATE

    I have made improvements to two of the table nodes described above, and added a new one.

    CHANGE_COL

    The original version of this node required you to provide a list of values to the new values port. Now, if you leave this port empty, the node will use the values contained under the key specified as the Current Key. This makes it easier to rename a column header: just choose Replace, then enter the current name and the new name; no need to do a lookup to supply the existing values. You could also leave the new values port empty and use the Add Before or Add After option to create a duplicate column (though I don't see much use for that).

    SUMMARIZE

    1. I added a new method to use when summarizing. In addition to Sum, Average, and Count, you can now choose to Count Distinct. If you had a table of customers grouped by state and city, you could use Count Distinct to produce a summary table of how many different cities occurred in each state.
    2. The value key is now optional when you use the Count method. If you just want a count of how many items fall under each category in your category key, you can simply leave the partition and value keys blank. This avoids having to use or create a second key just to get such a count. If you leave the value key blank and use any other method you will get nonsensical results (all 0s for Sum and Average, all 1s for Count Distinct).
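    The difference between Count and Count Distinct can be sketched in a few lines of Python. This is only an illustration with made-up customer data, not the node's internals.

```python
from collections import defaultdict

# Hypothetical customer table.
customers = [
    {"state": "CA", "city": "Fresno"},
    {"state": "CA", "city": "Fresno"},
    {"state": "CA", "city": "Oakland"},
    {"state": "NY", "city": "Albany"},
]

rows_per_state = defaultdict(int)
cities_per_state = defaultdict(set)
for row in customers:
    rows_per_state[row["state"]] += 1                # Count: needs no value key
    cities_per_state[row["state"]].add(row["city"])  # Count Distinct: unique values

count = dict(rows_per_state)
count_distinct = {state: len(cities) for state, cities in cities_per_state.items()}
```

    Here `count` is `{"CA": 3, "NY": 1}` (three customer rows in CA), while `count_distinct` is `{"CA": 2, "NY": 1}` (only two different CA cities).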

    SUM_COLS

    Sum_cols is a variant of the Summarize node. It has no option to create partitions from a partition key. Instead it allows you to enter a comma-separated list of existing keys for the Value Keys. It will then create a summary table with subtotals for each of those existing keys. This allows you to work directly on a table without having to use the Transpose node to create an interim table that the Summarize node could use. By choosing different category and value keys you can make a wide variety of different summary tables from the same source table (see screenshot for one example).
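    As a rough Python analogy (again a sketch, not the node itself), subtotaling several existing columns per category might look like this. The function signature and sample data are assumptions for illustration.

```python
def sum_cols(rows, category_key, value_keys):
    """Subtotal each listed column per category; value_keys is comma-separated."""
    keys = [k.strip() for k in value_keys.split(",")]
    totals = {}
    for row in rows:
        cat = row[category_key]
        # Start each category with a zero subtotal for every value key.
        entry = totals.setdefault(cat, {category_key: cat, **{k: 0 for k in keys}})
        for k in keys:
            entry[k] += row[k]
    return list(totals.values())


# Hypothetical table with one column per month.
sales = [
    {"product": "apples", "jan": 10, "feb": 12},
    {"product": "pears", "jan": 7, "feb": 9},
    {"product": "apples", "jan": 3, "feb": 1},
]

summary = sum_cols(sales, "product", "jan, feb")
```

    The result has one row per product with a subtotal for each month, with no interim transposed table needed.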

    Revised Table Nodes network attached. Comments welcome!

    Enjoy!

    John

  4. Support Staff 4 Posted by john on 15 Mar, 2020 03:34 AM

    ANOTHER UPDATE

    I made a minor improvement to the drop_cols node.

    Now, instead of feeding a list of keys into the Column keys port, you can just type in a comma-separated list. This is more convenient when you only have a few columns to drop or keep (which seems to be most of the time).

    Screenshot and updated version of the table node demo attached.

    John

  5. 5 Posted by Alexander Gogl on 30 Mar, 2020 09:14 PM

    Ah and here is an image of the print sheet of the last version of the cards: I've improved the pattern quite a bit :)

  6. Support Staff 6 Posted by john on 30 Mar, 2020 11:24 PM

    Alexander,

    Interesting! Just curious: did you use any of my table nodes to help make your cards?

    BTW, the table nodes (and many other new nodes) are included in the latest release of my Node Library:

    http://support.nodebox.net/discussions/show-your-work/372-cartan-no...

    John

  7. 7 Posted by Alexander Gogl on 31 Mar, 2020 07:58 AM

    No, because I started working on it before you published your nodes, but I am going to try them in my next project :)

Already uploaded files

  • drop_cols_screenshot.png 434 KB
  • change_col_screenshot.png 449 KB
  • transpose_screenshot.png 446 KB
  • summarize_screenshot.png 543 KB
  • table_nodes.zip 37 KB
