large DB visualization?
Hi all!
I have a large DB. 300 tables, 200 relationships, 3000 fields.
I would like to visualize it for a non-tech audience.
Possibly no need to show ALL the tables. Some tables can be grouped in "logical clusters", e.g. "Clients (19 tables)".
Is this something NodeBox can be useful for?
Thank you!
-Geo
Support Staff 1 Posted by john on 16 Feb, 2017 11:18 PM
Hi Geo,
I definitely think NodeBox can be useful for this. I've been using it that way for two years now.
Depending on your background and experience, NodeBox may take a little getting used to at first. But I now much prefer it to other tools I've used, not just for visualization, but for "pre-visualization" and even raw data prep.
I've done many visualizations, sometimes with fairly large or complex datasets. Last week I read in an employee table with 146,000 rows, extracted a City field (the only geographical field available), connected it with a JSON lookup from a public Google mapping API, and quickly produced a world map of all sales offices. I've turned similar tables into beautiful org charts for organizations ranging in size from 100 people to 80,000.
Many projects require reading in multiple tables (CSV or JSON), extracting items from them, counting them, sorting them, merging them, comparing across multiple time periods, etc. and then producing clean, simple visualizations - anything you can imagine. I do all of this without writing a single line of code - just linking nodes together.
Some caveats:
NodeBox is not Tableau. You can produce most any kind of visualization, but you will have to roll your own charts from scratch. There is no plug and play library of line chart nodes, pie chart nodes, etc. But with a little time and practice, you can build up your own library. The good news is that this will give you total control over the final product.
NodeBox is Open Source, which means it's free and open, but also means that there is no 24-hour support line, limited documentation, and no guarantees that bugs will get fixed in a timely manner. That said, Frederik and team have been very supportive and generous with their time. And there are volunteers like me to help the newbies.
NodeBox makes it easy to view, modify, and create tables, but you cannot export them to CSV files. (It was created as a tool for artists, so this didn't seem important.) Fortunately, NodeBox is extensible and just last weekend I finally got around to creating a "node" that can output CSV files. I'm already finding that very useful and will soon share it in the "Show Your Work" forum.
NodeBox can produce still images (PNG, PDF, or SVG) or animations. But one thing it cannot do is create interactive visualizations. You can interact beautifully in the design environment, but you cannot deliver visualizations with buttons or filter fields or sliders that would allow your end users to do the same. I think this is NodeBox's single most painful limitation, and the one thing that makes me think about giving up on it.
You should also be aware that large NodeBox networks can devolve into impenetrable hairballs if you're not careful. I have created networks with two thousand nodes and hundreds of subnetworks that go seven levels deep - and I've been able to come back to them three months later and still understand them. So it can be done, but it takes discipline. In general NodeBox is more suited to smaller, simpler projects.
Performance-wise it can easily handle tables with 10,000 rows, but can get sluggish at 100,000 rows. So if you have millions of rows you will need to do some preprocessing before importing into NodeBox. It would probably also not be feasible to import 300 separate tables at once. But if you think clearly enough about what you are trying to visualize and break things into smaller, simpler projects, you shouldn't need to.
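As a rough illustration of the preprocessing step mentioned above, here is a minimal Python sketch that streams a large CSV and reduces it to one row per distinct value before anything is imported into NodeBox. The field name `City` comes from the employee example earlier in this post; the function names and the sample data are made up for illustration, and this runs outside NodeBox rather than as a node.

```python
import csv
import io
from collections import Counter


def count_by_field(rows, field):
    """Count how many rows share each non-empty value of `field`."""
    counts = Counter()
    for row in rows:  # rows are read one at a time, never held in memory
        value = (row.get(field) or "").strip()
        if value:
            counts[value] += 1
    return counts


def write_counts(counts, out, field):
    """Write a small two-column summary CSV, most frequent value first."""
    writer = csv.writer(out)
    writer.writerow([field, "count"])
    for value, n in counts.most_common():
        writer.writerow([value, n])


# Tiny in-memory stand-in for a large export (hypothetical data).
sample = io.StringIO("name,City\nAnn,Paris\nBob,Lima\nCy,Paris\n")
counts = count_by_field(csv.DictReader(sample), "City")

summary = io.StringIO()
write_counts(counts, summary, "City")
print(summary.getvalue())
```

With a real file you would pass `csv.DictReader(open(path, newline=""))` instead of the in-memory sample and write the summary to disk; the summary is then small enough to import comfortably.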
Hope that helps. If you do decide to try it and get stuck, feel free to post questions in the NodeBox 3 forum. And if you come up with something cool, please share it in the Show Your Work forum if you can.
Good luck!
John
2 Posted by Geo Artemenko on 23 Feb, 2017 05:51 PM
Hey John, thank you so much for the reply.
What if I were to show the tables without records or even fields or relationships? Just 300 table names grouped by specific logic into clusters and filtered as necessary.
My understanding is that, since there is no interaction, I will need to export multiple outputs?
Support Staff 3 Posted by john on 23 Feb, 2017 07:53 PM
Do you mean that you want a bubble chart where each bubble represents a table? If so, that would be easy.
You could position the table names on the canvas in clusters, but if the names are long you might wind up with a big mess of overlapping text. You could represent each table as a rectangle and put each name in small print inside the rectangle so that people could zoom in to see the name only if they need to. You could then color the rectangles or vary them in size to communicate other qualities (e.g. make the size proportional to the number of records in each table, use color to show how frequently they are accessed or whatever).
The trick would be devising some method to determine how to cluster them into groups. If you have a discrete set of categories you could create a 9-box or N-box display. If you have a high-level diagram that shows hierarchical relationships you could import that as an SVG, ungroup it, and use it to define where to place each cluster on the screen.
Once you have the table rectangles arranged on the screen, it would also be easy to connect some of them with link lines (which could also vary in width or color).
The bottom line question is: what exactly are you trying to convey? What do you want people to understand about these tables? One thing you could do is make a crude paper sketch of the ideal visualization and share that.
NodeBox is a great tool for this sort of thing. You will probably need to create a spreadsheet (CSV) with a list of your 300 tables and whatever high-level properties each one has. NodeBox can then read that in, use some logic to calculate a position for each table representation, and draw them all. You can then fiddle with various parameters to vary the layout until you get it just right. Then save it as a PDF, PNG, or SVG. (It is also possible to make an animation if you can figure out what you would want to animate.)
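To make the "calculate a position for each table" step concrete, here is a small Python sketch of one possible layout logic. It is not a NodeBox network, just the same idea written as plain code: clusters go on a coarse grid, tables go on a finer grid inside their cluster, and each rectangle's area scales with the record count. The column names (`table`, `cluster`, `rows`), the sample data, and all sizes are assumptions made up for illustration.

```python
import csv
import io
import math
from collections import defaultdict
from xml.sax.saxutils import escape

# Hypothetical input: one line per table with a cluster label and a row count.
SAMPLE_CSV = """table,cluster,rows
client,Clients,12000
client_address,Clients,15000
invoice,Billing,90000
invoice_line,Billing,400000
product,Catalog,800
"""


def load_tables(text):
    """Parse the CSV text into a list of dicts with an integer row count."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"table": r["table"], "cluster": r["cluster"], "rows": int(r["rows"])}
            for r in reader]


def layout(tables, cluster_cols=3, cell=60, per_row=4, gap=40):
    """Place clusters on a grid and tables on a smaller grid inside each one.

    Assumes each cluster holds at most per_row * per_row tables; larger
    clusters would spill into the block below.
    """
    clusters = defaultdict(list)
    for t in tables:
        clusters[t["cluster"]].append(t)
    block = per_row * cell + gap              # width and height of one cluster block
    max_rows = max(1, max(t["rows"] for t in tables))
    placed = []
    for ci, name in enumerate(sorted(clusters)):
        ox = (ci % cluster_cols) * block      # cluster origin
        oy = (ci // cluster_cols) * block
        for ti, t in enumerate(clusters[name]):
            # Square-root scaling so the rectangle's area tracks the row count.
            side = max(8.0, (cell - 10) * math.sqrt(t["rows"] / max_rows))
            placed.append({**t,
                           "x": ox + (ti % per_row) * cell,
                           "y": oy + (ti // per_row) * cell,
                           "side": side})
    return placed


def to_svg(placed):
    """Render each table as a square with its name in small print underneath."""
    parts = []
    for p in placed:
        x, y, s = p["x"], p["y"], p["side"]
        parts.append(f'<rect x="{x}" y="{y}" width="{s:.1f}" height="{s:.1f}" '
                     f'fill="none" stroke="black"/>')
        parts.append(f'<text x="{x}" y="{y + s + 7:.1f}" font-size="6">'
                     f'{escape(p["table"])}</text>')
    width = max(p["x"] + p["side"] for p in placed) + 20
    height = max(p["y"] + p["side"] for p in placed) + 20
    return (f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="{width:.0f}" height="{height:.0f}">' + "".join(parts) + "</svg>")


placed = layout(load_tables(SAMPLE_CSV))
svg = to_svg(placed)
```

Writing `svg` to a file gives something you can open in a browser. In NodeBox the same parameters (`cell`, `per_row`, `gap`) would be the knobs you fiddle with until the layout looks right.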
If you need to create a whole bunch of different outputs (e.g. one for each of your 300 tables), you can draw each chart on a separate movie frame and then use the Export Range option to save each frame as a separate file. Voilà! 300 charts with a single click.
If you want to take a stab at it and come up with a CSV describing your tables and a sketch of what you want, I can help you work through how to do it in NodeBox. Sounds like a fun project!
John
4 Posted by dylanomran2 on 13 Jul, 2017 09:48 AM
Hello geo,
I'm sure by now you've come up with a great NodeBox solution. As an alternative for this task I suggest trying yEd graph editor (or the yWorks libraries if you're more technical).
I've produced many large diagrams of related items using yEd. You create an Excel file with two sheets: one listing nodes, the other listing relationships. Import that into yEd and you can then use the many layout algorithms, and the Properties Mapper to modify your nodes and/or relationships based upon data from the Excel tables. The process I've described takes minutes.
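As a hedged sketch of preparing those two lists, the Python below turns a list of tables and a list of foreign-key pairs into a node list and a relationship list, and flags any relationship that points at an unknown table. The column names (`id`, `cluster`, `source`, `target`) and the sample data are placeholders, not names yEd requires; the output is CSV text that you would paste or save into the two sheets and then map to the right columns during import.

```python
import csv
import io

# Hypothetical schema summary: (table name, cluster) and (child, parent) pairs.
tables = [
    ("client", "Clients"),
    ("client_address", "Clients"),
    ("invoice", "Billing"),
]
foreign_keys = [
    ("client_address", "client"),
    ("invoice", "client"),
]


def to_csv(header, rows):
    """Return the header plus rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Relationships whose endpoints are missing from the node list would import
# as broken or orphaned edges, so check for them first.
known = {name for name, _ in tables}
dangling = [fk for fk in foreign_keys
            if fk[0] not in known or fk[1] not in known]

nodes_csv = to_csv(["id", "cluster"], tables)          # sheet 1: nodes
edges_csv = to_csv(["source", "target"], foreign_keys)  # sheet 2: relationships

print(nodes_csv)
print(edges_csv)
```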
yEd offers another advantage over NodeBox: you can simply draw and modify the diagram directly. So whether you start from scratch or want to modify an import, you can add, delete, re-label, and move items quickly by direct interaction with the visual. For example, I've used it in meetings to build the sort of diagram you're describing as the discussion unfolds, because it's quick to use.
yEd exports to PDF and SVG, so you can probably come up with hybrid yEd-NodeBox solutions should you need to.
I hope that helps. If you're interested I can offer help/examples.
Dylan