Schematic representation of Paris avenue lengths

julien

03 Sep, 2022 06:14 PM

Hello,

For my second project with Nodebox, I wanted to try working with data. The city of Paris has a good collection of open data, and I started with something simple: streets with their names and lengths.

After a bit of fiddling I found a pleasant representation: take all Parisian avenues, sort them alphabetically, and display a line representing the length of each. Circles provide the scale, with the measurements placed in the inner circle.

I made as many nodes parametric as I could, so if I want streets or boulevards instead of avenues I just have to change the lookup and some text values (I will post this later).

The image is huge, so I made some zooms (attached).

To do:
- Use the node made by John for centering the measurement text on the circles
- Tidy the nodes
- Find how to switch the lookup key and replacement text quickly

Questions:
- Is it possible to round a number to an arbitrary number of decimals in Nodebox?
- Is it possible to convert meters to kilometers conditionally? I tried a condition followed by a divide by 1000, but it made Nodebox crash every time; maybe there was too much data.
- Are there tips to help Nodebox with rendering? If I use streets instead of avenues, it's extremely laggy and unusable on my laptop (ok-ish on my desktop).
- Can I use Hershey fonts (stroke fonts) in Nodebox, and if yes, how?
- I wanted a "switch" containing a key, a singular word, and a plural word (like: AV ; Avenue ; Avenues), to rapidly switch which streets I display. I made a table and… now I don't know what to do with it :). I suppose I should choose an index and "distribute" the values of each key where they're needed, but I'm not sure how to do this.

I'm open to criticism, ideas and suggestions!

  1. Support Staff 1 Posted by john on 03 Sep, 2022 10:04 PM

    Julien,

    This is lovely! You are learning fast!

    Some answers for your excellent questions...

    Is it possible to round a number to an arbitrary number of decimals in Nodebox?

    I have a node for this: precision. You can find it in the Cartan Nodebox Library. Here is the link: http://support.nodebox.net/discussions/show-your-work/493-cartan-no...
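
    For reference, outside Nodebox the same kind of rounding can be sketched in plain Python. This is only an illustration of rounding to a chosen number of decimals, not the code behind the precision node; the helper name round_to is made up for the example.

```python
def round_to(value, decimals):
    """Round value to the given number of decimal places.

    Hypothetical helper for illustration; it simply wraps Python's
    built-in round(), which accepts a number of decimals.
    """
    return round(value, decimals)


# Made-up street lengths in meters.
lengths_m = [1234.5678, 87.26, 15.0]

# One rounded value per input value, so the list keeps its size.
rounded = [round_to(length, 1) for length in lengths_m]
```

    The point to carry over is that rounding maps each input to exactly one output, so a list of a thousand lengths stays a list of a thousand lengths.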

    Is it possible to convert meters to kilometers conditionally? I tried a condition followed by a divide by 1000, but it made Nodebox crash every time; maybe there was too much data.

    This should be quite easy and should not crash even with tens of thousands of items. I suspect the problem may be the switch node. Because the switch node treats each option as a list it is easy to inadvertently create lists of lists when you use it. So if you feed it a thousand items, instead of producing a thousand outputs, it may produce a thousand thousand (a million) outputs.

    I have another node in my node library to solve this problem: this_or_that. You can find it at the bottom of the first column of math nodes (use link above). Give that a try and see if it solves your problem.
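
    The intended behavior (one label out for each length in) can be sketched in plain Python. This is not the this_or_that node, just an illustration of the conditional conversion; the function name format_length, the 1000 m threshold, and the one-decimal formatting are assumptions.

```python
def format_length(meters):
    """Return a display label: kilometers from 1000 m upward, meters below.

    Hypothetical helper; threshold and formatting are assumptions.
    """
    if meters >= 1000:
        return f"{meters / 1000:.1f} km"
    return f"{meters:.0f} m"


# Made-up lengths in meters.
lengths_m = [250, 999, 1000, 4370]

# Exactly one label per length: the output list has the same size as the input.
labels = [format_length(m) for m in lengths_m]
```

    If the equivalent network produces more labels than there are lengths, that is the list-of-lists symptom described above.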

    Are there tips to help Nodebox with rendering? If I use streets instead of avenues, it's extremely laggy and unusable on my laptop (ok-ish on my desktop).

    General tips: if Nodebox bogs down or crashes, go back and render each node till you find which one is causing the problem. Switch from Viewer to Data view in the main panel to see how many items the node is outputting (scroll to the bottom to see the count). If you expect a thousand lines but see a million lines, find out why. When debugging, use a slice node upstream to work with a small number of source items then, once everything is working, increase the amount of data coming in to see at what point the network starts to bog down.

    If you are doing animations I have some other techniques I've developed over the years. You can save expensive upstream calculations as CSV files and then simply import the data from the file instead of computing it for each frame. Switch nodes can be expensive because Nodebox evaluates each option even if it's not chosen; there are ways to mitigate this.

    In this case, though, you may have run into a fundamental limit. Nodebox can efficiently plot a million rectangles, but will start to bog down sooner with text paths. Depending on the average length of the text strings, it should be able to handle ten thousand text paths, but may start to slow after that. Nodebox does not have a text object so has to work with (and store) the actual vector paths; if you export a design with tens of thousands of text paths you will wind up with a very large SVG file.

    Again, you should switch to data view and count the items to make sure you are only trying to plot a thousand street names instead of a million. If the number of items is correct but Nodebox is still slow, one thing you can try is using a simpler font. Sans-serif fonts can have dramatically fewer articulation points in their paths. Nodebox has a built-in SanSerif font (at the top of the font list) that is the simplest possible - but of course it doesn't look as nice so may not be acceptable in a case like this.

    In a desperate situation, one fallback is to break up your design into multiple pieces and plot each piece separately - painful but possible. If you get stuck you can send me a zipped version of your project (with data) and I can see if there's anything else I can do.

    Can I use Hershey fonts (stroke fonts) in Nodebox, and if yes, how?

    Guess what? There is indeed a node for that in my Cartan Node Library (link above). The node is called text_svg and requires a slightly modified version of a vector font. Two such fonts, HersheyFont.svg and relief_font.svg are included with the library. If you need to use a different vector font please let me know and I can help you modify it so that the letters will align correctly.

    I wanted a "switch" containing a key, a singular word, and a plural word (like: AV ; Avenue ; Avenues), to rapidly switch which streets I display. I made a table and… now I don't know what to do with it :). I suppose I should choose an index and "distribute" the values of each key where they're needed, but I'm not sure how to do this.

    This is definitely doable using subnetworks. I would need to see exactly what you are doing to explain how. If you are still stuck, send me your project (or a stripped down version of it with enough data to test with) and I will show you how.

    Thanks for sharing this beautiful work and for asking such good questions. Keep it coming!

    John

  2. 2 Posted by julien on 04 Sep, 2022 09:55 AM

    Thank you for your kind reply. I will tidy my file, do some testing and refine, and send it to you if I can't solve some of my problems!

  3. 3 Posted by julien on 04 Sep, 2022 02:03 PM

    Ok, I tried something for the data filter, it works well but I think that there is room for improvement.

    Also, I don't know how to group all the selectors in one node, because it has 4 outputs and I believe that a child node only outputs its rendered node. I guess I have to move the lookup nodes to where they're required in the parent.

    I attached the csv and the nodes if you want to check it.

  4. 4 Posted by julien on 04 Sep, 2022 09:29 PM

    I used the technique from the previous post to populate my dataviz (with a CSV instead of a table), but now I wonder whether, instead of an index, I can just enter the short name key like "AV" or "BD" to select the right CSV line. I thought it was simple but I got lost :).

  5. Support Staff 5 Posted by john on 05 Sep, 2022 03:12 AM

    Julien,

    This looks fine to me. It's straightforward and efficient - and it works.

    The data file only has about 6500 lines - fairly small for Nodebox data. But the file itself is rather heavy - almost 4 megs. The reason is all the columns, especially the Geometry column, which contains the actual shape and coordinates of each street - in fact you could use that column to draw a map of Paris entirely in Nodebox!

    You don't need all that data for this project. If your full project is running slow you could probably speed things up considerably by creating a subset data file in Excel: the same rows but only the columns you need (C_DESI, LENGTH and L_LONGMIN).
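
    If Excel is not at hand, the same column subset can be produced with a few lines of plain Python. This is a minimal sketch: the three column names come from the data file discussed here, but the function name slim_csv, the semicolon delimiter, and the sample row are assumptions made for the example.

```python
import csv
import io

# Columns to keep, as named in the discussion.
KEEP = ["C_DESI", "LENGTH", "L_LONGMIN"]


def slim_csv(source, target, keep=KEEP, delimiter=";"):
    """Copy only the `keep` columns from source to target (file-like objects).

    The delimiter is an assumption; adjust it to match the real file.
    """
    reader = csv.DictReader(source, delimiter=delimiter)
    writer = csv.DictWriter(
        target,
        fieldnames=keep,
        delimiter=delimiter,
        extrasaction="ignore",  # silently drop every column not in `keep`
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


# Made-up one-row sample standing in for the real file.
source = io.StringIO(
    "C_DESI;LENGTH;L_LONGMIN;GEOMETRY\n"
    "AV;1200;Avenue Exemple;(shape data)\n"
)
target = io.StringIO()
slim_csv(source, target)
slimmed = target.getvalue()
```

    With real files, the two StringIO objects would be replaced by files opened with open(path, newline="", encoding=...).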

    One small improvement. For your Lignes subnetwork you import all the data twice (two lines from the same Datas node). This is untidy and might affect performance if you have a lot of data. Instead you should place a single null node inside your subnetwork (rename it "data" if you want) and publish that as your only data parameter. Then do the lookups for LENGTH and L_LONGMIN inside your subnetwork by feeding that null node into the lookup nodes. Just pass in the data once.

    The point you make in your final note is correct: you only need the short name key to select the right subset of your data, and you can derive the other characteristics from that downstream. So instead of pulling out the characteristics (singular, plural, feminine) on the main level, you could pass the entire characteristics table (using a single null data node) into whatever subnetwork needs them, and pull what you need inside the subnetworks by doing lookups. This is slightly less efficient but shouldn't be noticeable since that characteristics table is so small. And it would make the main level a little cleaner.

    I notice that there are actually 48 different types of street in Paris. In addition to AV, BD, PL, and VLA there are 3455 RUE, 425 VOIE, 308 PAS, etc., and two lines in the file with no street type. You can easily extract all this using the DISTINCT node on the C_DESI key. But again: I strongly recommend using a slimmed down data file with only the 3 columns you need; once you do that everything will go much faster.
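
    As a cross-check outside Nodebox, the same tally of street types can be sketched in plain Python. This is an illustration of counting distinct values in one column, not how the DISTINCT node works internally; count_types and the sample rows are made up.

```python
from collections import Counter


def count_types(rows, key="C_DESI"):
    """Count how many rows carry each value of `key` (hypothetical helper)."""
    return Counter(row[key] for row in rows)


# Made-up rows; the empty string stands for a line with no street type.
rows = [
    {"C_DESI": "RUE"},
    {"C_DESI": "AV"},
    {"C_DESI": "RUE"},
    {"C_DESI": ""},
]

counts = count_types(rows)   # occurrences per type
distinct = sorted(counts)    # the distinct type codes
```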

    Are you planning on visualizing all of these types? You could feed in the entire data table to make one big circle with all 6542 streets. You'd have to use very thin lines with a font size too small to read, but it might be possible. (Without the street names it would definitely be possible.) You could color code by street type if you wanted.

    Or you could do a trellis visualization with multiple smaller circles. You could lump at least half of the obscure types into an "Other" bucket (there is only one instance of types BAS, ARC, GAV, etc.). You could also calculate the average length for each street type and visualize that (not sure how interesting that would be).
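
    The bucketing and averaging idea can be sketched in plain Python as follows. This is only one possible reading of it: the function name, the min_count cutoff for "obscure" types, and the sample rows are all assumptions.

```python
from collections import defaultdict


def average_length_by_type(rows, min_count=2):
    """Average LENGTH per C_DESI, lumping rare types into "Other".

    Hypothetical helper; min_count is an arbitrary cutoff for this sketch.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row["C_DESI"]].append(float(row["LENGTH"]))

    merged = defaultdict(list)
    for street_type, lengths in groups.items():
        bucket = street_type if len(lengths) >= min_count else "Other"
        merged[bucket].extend(lengths)

    return {bucket: sum(v) / len(v) for bucket, v in merged.items()}


# Made-up rows: two common types and two types that occur only once.
rows = [
    {"C_DESI": "RUE", "LENGTH": "100"},
    {"C_DESI": "RUE", "LENGTH": "300"},
    {"C_DESI": "AV", "LENGTH": "500"},
    {"C_DESI": "AV", "LENGTH": "700"},
    {"C_DESI": "ARC", "LENGTH": "50"},
    {"C_DESI": "BAS", "LENGTH": "150"},
]

averages = average_length_by_type(rows)
```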

    Anyway: well done! This is a really interesting project and you seem to be figuring everything out just fine. I'd be curious to see your final results if you decide to pursue this further.

    John

  6. 6 Posted by julien on 06 Sep, 2022 07:39 AM

    Hello,

    Thank you for your reply. In my final project I have already stripped down the CSV file and added some columns for human-readable lengths, in French, converted to kilometers when they exceed 1000 m (I did it in Excel because I couldn't figure out how to do it in Nodebox at the time). I also deleted the "boulevard périphérique" line, as it's the 70 km highway that circles Paris and wrecks my representation :).

    Thank you for the null node tip, I always forget about it. I tidied and commented my nodes and I will make further improvements to use lookup nodes inside child nodes.

    For the moment, I pull the characteristics by index, but I may switch to pulling them by short key (so I know that I display "avenues" when I type "AV", instead of choosing a numbered index like "2").

    I thought of using the distinct node to pull all the street types, but I would have to add singular/plural/gender for each, so I don't know if it's worth it.

    For the moment, I made another CSV with all the interesting types, filtering out the ones with few streets, and I select the type by index (with an integer node + slice node).

    I think that all the streets at the same time would be unreadable. Even with only "rues", the graphic is huge (see screenshot below). With the rues displayed, Nodebox is barely responding on my laptop. I wonder if I did something wrong or if it's due only to the number of text paths displayed.

    I made a "calculator" node to calculate the median, average, min and max street lengths, but I don't display them for the moment, as I thought it was too much.

    I also forgot to say that the scale is dynamic and takes the size of the longest street. I can also change the scale and use 100 m increments instead of 500, for example. Even the labels on the scale change. Although it was not too hard, I'm proud of myself for this one :).

    I will definitely try another dataviz on the same model, with all the 6000+ streets, without names, just the lines, colored by type. It's a bit frustrating, though, not to be able to see which street a huge or tiny line on the circle represents. Maybe I'll add an index at the bottom.

    Find attached representations for some types of streets.

    Last question: Can I change the document properties (like size) with a node?

  7. Support Staff 7 Posted by john on 06 Sep, 2022 11:15 AM

    Julien,

    I am really liking this project.

    Re your final question: sorry, there is no way to dynamically set the canvas size with a node; you have to do it manually.

    I do have a node in my library, canvas, that can read the current canvas size so you can do calculations based on that. This node requires a python code library module (included of course).

    Keep up the good work!

    John

  8. 8 Posted by julien on 06 Sep, 2022 12:25 PM

    Thank you, you really have a node for every situation! :D

  9. 9 Posted by julien on 07 Sep, 2022 08:42 AM

    Hello, it's me again :).

    I'm trying another representation with the same dataset, mixing street types, and I have a few questions:

    - Can the filter node accept multiple parameters like "AV;BD;RUE"? (I think not)
    - I made a separate filter node for each type. It works, but the result is sequential and I lose the alphabetical order. How can I re-sort all the points alphabetically by one of the CSV columns?

    See the attached file: blue are boulevards, red avenues and green streets, displayed sequentially.

    I understand why, but I can't wrap my head around a solution, as I lose the complete data set after the filters. I can pass the name into each "voie" node, but I don't know how to use it for sorting after that, as it's not data anymore.

    Maybe I'm headed in totally the wrong direction and I have to keep the data together and find another solution for the shape/color variation in the "voies" child node.

  10. Support Staff 10 Posted by john on 07 Sep, 2022 10:21 AM

    Julien,

    No, you cannot filter by multiple parameters at the same time. But there is a way...

    You were on the right track. Step one is to filter for each parameter as you did. Each filter node will output a subset of the data table. Then all you have to do is feed your three subsets into a combine node (BEFORE doing anything else). The result will be a data table containing only the rows for those three values.

    NOW you can sort on the combined subsets to your heart's content. Once you have the data rows in the order you want, THEN you can extract what you need to form the color-coded representations of street lengths (or whatever).

    Does that make sense?

    John

    P.S. Instead of hard-coding three subsets, it is also possible to do the same thing in a more generic way by making a very simple subnetwork to form the subset data table. In fact that subnetwork is just a filter node with one published parameter being the full data (taken as a list) and the other published parameter being the list of key values (taken one at a time). You can feed in an arbitrarily long list of the street types you want and the subnetwork will filter for one subset at a time each time it fires. The output of that subnetwork will be the combined set of data rows for all those types - which you can then sort before extracting what you need. Let me know if you need me to make a demo of this technique.

  11. 11 Posted by julien on 07 Sep, 2022 01:05 PM

    Thank you for your reply. I'm not sure I understand the last point, and I think that's because I haven't fully grasped the "Nodebox zen" of creating lists, when and how a node fires, and how subnetworks change this.

    I think I succeeded in making it work, but without really understanding why. Can you please explain to me step by step what this is doing?

    What I think I understood: as it has one set of data and 3 strings (RUE, BD, AV) to filter on, it parses the data and filters against the first string (RUE), then BD, then AV, and the output is these 3 results back to back, with all the other data retained, so I can filter it by whatever I want after this? Am I correct?

    I'm lost because I thought that when the sizes of 2 sets don't match, there is a sort of repetition (like the shapes in the tutorial). I really have to work on this! :)

    Also, I took the type C_DESI to test this and name the points, but I don't know how to do this with something other than text already in the data. I think I have to test the C_DESI (with a switch node?) and apply something to it, but I'm not sure how to proceed.

  12. Support Staff 12 Posted by john on 07 Sep, 2022 09:51 PM

    Julien,

    It looks to me like you've got this, though it's hard to tell without your actual .NDBX file.

    My previous explanation was more complicated than it needed to be. I forgot that the Filter node is already set up to read the data source as a list and the key(s) as value(s). So you don't even need subnetworks. What you have is fine as is.

    When you asked if you could enter "AV;BD;RUE" as the key in a filter node, I should have said YES. That is, you can't type that into the key port as a single string, but you CAN do the equivalent: type it into a make_strings node and feed the output of that into the key port of the filter node. This will send a list of three keys into your filter node, which is exactly what you want and exactly what you have already done.

    The "Nodebox Zen" is indeed subtle and it took me a long time to get used to it. A few key facts to review:

    • Everything in Nodebox is a list (sometimes a list of one item)
    • A data table is a list of rows
    • Lists coming into a node can be read one at a time (range = value) or all at once (range = list)
    • You can determine the range of each parameter by using the metadata dialog
    • The rule about incoming lists of different sizes causing repeats only applies to parameters with range = value (one at a time).
    • Regardless of how many times it fires, a node always outputs a single list.
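
    These rules can be modeled in a few lines of plain Python. This is a simplified mental model, not Nodebox's actual implementation: the function fire and its arguments are hypothetical, it assumes shorter value-range lists wrap around (the repeat rule mentioned above), and it assumes no incoming list is empty.

```python
def fire(node, value_ports, list_ports=()):
    """Simplified model of how a node consumes its incoming lists.

    value_ports: lists read one item at a time (range = value).
    list_ports:  lists handed over whole on every firing (range = list).
    The node fires once per item of the longest value-range list;
    shorter value-range lists wrap around. All results are gathered
    into a single flat output list.
    """
    firings = max(len(port) for port in value_ports)
    output = []
    for i in range(firings):
        args = [port[i % len(port)] for port in value_ports]
        result = node(*args, *list_ports)
        if isinstance(result, list):
            output.extend(result)   # a node may emit several items per firing
        else:
            output.append(result)
    return output


# Three shapes and two colors: three firings, the color list repeats.
combos = fire(
    lambda shape, color: f"{color} {shape}",
    [["rect", "ellipse", "polygon"], ["red", "blue"]],
)

# One value-range port (the key) and one list-range port (the data).
matches = fire(
    lambda key, data: [item for item in data if item == key],
    [["a", "b"]],
    list_ports=[["a", "b", "a"]],
)
```

    In the second call the data list is never repeated or paired item by item with the keys; it is simply available in full on each firing.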

    With that in mind, select your filter node and open the Metadata dialog (click the link atop the parameter pane). You should see 4 parameters (Ports) on the left: data, key, op, and value. Select each port and notice the Range setting on the right. The UI allows you to change this - very handy for subnetworks - but this won't work on standard nodes like Filter.

    Notice that the data port is set to Range = List and the key port is set to Range = Value. This means that the entire data port will be read all at once regardless of how often the filter node fires. And since key is set to value, the filter node will fire once for each item in the list of keys.

    So what happens when you hook your 6500 row data table into the data port of your filter node and hook a list of 3 strings (AV;BD;RUE) into its key port? Filter will fire 3 times.

    The first time it fires it will output all rows from the table where street type = AV. The second time it will output all rows where street type = BD. The third time it will output all rows where street type = RUE. But you don't see three separate lists, just one list with the combined total of all three firings.

    So what you have now is a data table just like you had originally, but instead of 6500 rows there are now like 800 or something. But everything works as before. You can sort this new table, filter it, etc. So you can sort your table by street name if you want, THEN turn each row into a line or a dot or a word or whatever. This is what you have already done as far as I can tell.
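
    The three firings and the sort that follows can be sketched in plain Python. This mirrors the behavior described above rather than the Filter node's real code; the function names and the sample rows (with placeholder street names) are made up.

```python
def filter_rows(rows, key, value):
    """Keep the rows whose `key` column equals `value`."""
    return [row for row in rows if row[key] == value]


def filter_by_types(rows, types, key="C_DESI"):
    """Filter once per requested type and return one combined list."""
    combined = []
    for street_type in types:          # one "firing" per type
        combined.extend(filter_rows(rows, key, street_type))
    return combined


# Made-up rows with placeholder names.
rows = [
    {"C_DESI": "RUE", "L_LONGMIN": "Bravo"},
    {"C_DESI": "AV", "L_LONGMIN": "Charlie"},
    {"C_DESI": "PL", "L_LONGMIN": "Delta"},
    {"C_DESI": "BD", "L_LONGMIN": "Alpha"},
]

# Still a table of full rows, grouped by type in the order of the type list.
subset = filter_by_types(rows, ["AV", "BD", "RUE"])

# Because whole rows survive, the subset can be sorted on any column.
by_name = sorted(subset, key=lambda row: row["L_LONGMIN"])
```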

    Is it starting to make sense now?

    As for your final question, I'm not sure I understand. Do you want to take certain type values like C_DESI and convert them to something more human-friendly in your display?

    If so, there is an easy way of doing this without using a switch node. When you look up the street type for each row, before feeding it directly into a textpath node for display, first pass it through a replace node. Set Old to C_DESI and New to your human-friendly equivalent, then feed that output into your textpath node. If the type is anything other than C_DESI it will ignore it and pass it through unchanged; but if it is C_DESI it will change it. If need be you can string multiple replace nodes together each feeding into the next. Easy!
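
    A chain of replacements behaves roughly like the plain Python sketch below. It assumes whole-value matching and uses made-up labels; replace_exact and humanize are hypothetical names, and the real replace node may match differently (for example on substrings).

```python
def replace_exact(text, old, new):
    """Return `new` when text equals `old`; otherwise pass text through."""
    return new if text == old else text


def humanize(code):
    """Chain several replacements, each feeding the next."""
    label = replace_exact(code, "AV", "Avenue")
    label = replace_exact(label, "BD", "Boulevard")
    label = replace_exact(label, "RUE", "Rue")
    return label


# Codes without a replacement come out unchanged.
labels = [humanize(code) for code in ["AV", "RUE", "PL"]]
```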

    Let me know if you have any more questions.

    John

  13. 13 Posted by julien on 08 Sep, 2022 02:50 PM

    Ok, thank you for the explanation. It makes more sense, even if I think I will have to go over it again many times.

    For my second question, I want to display a triangle for rues, a circle for avenues, and a square for boulevards, like 4 messages above, but sorted like my last screenshot.

    For the moment, I can display the C_DESI value of each because it's in the table, but I don't know the best method to replace it with shapes.

    I succeeded last time because I had one node for each C_DESI, each one outputting the correct shape, but now I don't really know how to do it.

    Thank you for the text replacement tip, it can also be useful.

  14. Support Staff 14 Posted by john on 09 Sep, 2022 08:45 AM

    Julien,

    There are many different ways to convert a variable like street type into a shape. See attached demo for one such way.

    For this demo I create a random list of street type strings, rue, avenue, or boulevard. This corresponds to the C_DESI value you would look up from your table. Then I feed that ordered list of strings into a subnetwork called make_shape.

    Make_shape outputs a list of colored shapes in the same order as the input strings. To show how you might use this I then generate some random numbers (simulating street lengths in the same order you pulled the street types) and use those values to draw the shapes radiating from a central circle of a defined radius.

    The magic happens inside that make_shape subnetwork. Open it up to see how I did it.

    It's just a switch node with the three possible shapes as options and an index value based on the input string. The index will be 0 if the type is rue, 1 if the type is avenue, and 2 if the type is boulevard.

    The only tricky bit is converting the type string into the index value. I use two equal nodes. An equal node returns a true if the string matches and false if it does not. But if you feed those boolean values into an add node, it will treat true as 1 and false as 0.

    • if the type is rue, both tests will fail so the sum will be 0
    • if the type is avenue, the first test will yield a 1 so the sum will be 1
    • if the type is boulevard, the second test will yield a 1 which is doubled so the sum will be 2
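
    The same arithmetic can be written out in plain Python, where True and False also count as 1 and 0 when added. This is a sketch of the trick, not the contents of the make_shape subnetwork; shape_index and the shape names are placeholders.

```python
def shape_index(street_type):
    """Turn a street type into a switch index using two equality tests."""
    is_avenue = street_type == "avenue"        # True counts as 1
    is_boulevard = street_type == "boulevard"  # doubled below, so worth 2
    return is_avenue + 2 * is_boulevard


# Placeholder names standing in for the three switch options.
SHAPES = ["triangle", "circle", "square"]

types = ["rue", "boulevard", "avenue", "rue"]
shapes = [SHAPES[shape_index(t)] for t in types]
```

    Note that any type other than avenue or boulevard fails both tests and therefore lands on index 0, the same as rue.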

    There are many other ways of doing this, but this works fine for this situation. You should be able to use a modified version of the make_shape node in your project.

    Happy Nodeboxing!

    John

  15. 15 Posted by julien on 09 Sep, 2022 03:58 PM

    Ok thank you, I understand, but it's not very scalable and a little bit too far-fetched for me :).

    It's still unintuitive to me that there is no "if/then" node and that we have to fiddle with switch nodes, mathematics and tricks to handle different data :).

    Can't we just take the index of whatever value is in the table? Like a lookup on an implicit index? I tried to reproduce this with a table (one row with C_DESI and another with just a range of values used as an index, a duplicate of the true index, which I don't know how to access). Even then we would be limited to 6 values (the 6 input ports of the switch node). I didn't succeed with the table because I don't really understand how the distinct node works and what "key" is expected.

    I will test other things, but I think that I haven't really figured out how everything works. I will work on it!

    Side note: I'm also always confused by the "upstream" organization of certain nodes like the switch or the position; it's just a matter of habit, but I always want to put the switch alternatives or the position of the object at the bottom.

  16. Support Staff 16 Posted by john on 09 Sep, 2022 08:05 PM

    Julien,

    I made that hard-wired node with the mathematical trickery because I wanted to show you that technique. It often comes in handy. But you're right: it's kludgy and does not scale.

    So see the attached demo with a much cleaner more generic solution.

    This solution features a new node: symbol_table. Symbol_table takes the following parameters:

    • a data table
    • a key in that table holding values you wish to symbolize
    • the name of a new symbol column
    • a list of string values you wish to symbolize
    • a list of symbols (can be shapes or strings or numbers or anything)
    • a null symbol for values not included in value list (can be anything)

    Symbol_table takes all this and outputs a new version of your original data table with a new symbol column. You can now slice and dice this table however you wish and, when you're ready, extract the symbol values for drawing your visualization or whatever. Very flexible!
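
    The idea can be sketched in plain Python as a function that adds one column to every row. This is an illustration based only on the parameter list above, not the implementation of the symbol_table node; the signature, the sample rows, and the string symbols are assumptions.

```python
def symbol_table(rows, key, symbol_column, values, symbols, null_symbol=None):
    """Return new rows with an extra column holding a symbol per row.

    Rows whose `key` value is not in `values` receive `null_symbol`.
    The original rows are left untouched.
    """
    lookup = dict(zip(values, symbols))
    return [
        {**row, symbol_column: lookup.get(row[key], null_symbol)}
        for row in rows
    ]


# Made-up rows; strings stand in for the shapes used as symbols.
rows = [
    {"C_DESI": "RUE", "LENGTH": "120"},
    {"C_DESI": "AV", "LENGTH": "900"},
    {"C_DESI": "PL", "LENGTH": "60"},
]

table = symbol_table(
    rows,
    key="C_DESI",
    symbol_column="SYMBOL",
    values=["RUE", "AV", "BD"],
    symbols=["triangle", "circle", "square"],
    null_symbol="none",
)
```

    Because the result is still a list of full rows, it can be filtered and sorted before the symbols are pulled out for drawing.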

    In this case I used a stripped down version of your actual Paris street data with the three street types and shapes from the previous example. If you render the symbol_table node and put it in data view you will see the original data table with a new column at the end, SYMBOL, which holds a path for each type of street: triangles for RUE, circles for AV, squares for BD, and the null symbol, an invisible zero-sided square, for every other type.

    For this demo I then take this table, filter it down to those three types, sort by street name, and plot the symbols around a circle. Notice that I didn't have to filter the table; if you connect symbol_table directly to the null1 node you will see a visualization of all 6542 streets. In that case other street types have rays showing length but no visible symbol.

    Aside: In the filtered viz shown in the screenshot you can see a cluster of blue circles at about the ten o'clock position. I was curious so looked in the data: these are a series of forty avenues that begin with the word gate (Porte): Porte Brancion, Porte Brunet, Porte Chaumont, etc. Cool!

    You can visualize a different subset of street types by simply changing the list in the make_strings1 node. Of course you don't have to use triangles and circles; you could use the letters A to Z either as strings or as text paths already pre-rendered in the font of your choice, or wingding font symbols, or whatever you want. And it's pretty fast. On my laptop it renders the full viz for all 6542 rows in one second.

    Perhaps I should have given you this demo from the outset. But I think we both learned more with the back and forth we've had so far.

    And now on to your aesthetic objections to the lack of IF/THEN nodes...

    Over the years I've coded in dozens of different languages and almost all of them shared the fundamental constructs of loops and branches. So for me too it was a shock when I first encountered Nodebox. At first I couldn't even see how to implement a looping algorithm without loops, and using switches instead of IF/THEN branches felt like coding with one hand tied behind my back.

    But over time I saw the simplicity and beauty of Nodebox and came to appreciate the advantages of its pure functional approach.

    Every Nodebox node is a function which takes any number of inputs and always produces a single output. As a result the final structure is always a perfect tree with inputs coming in as leaves at the top, then a consolidating cascade of branches narrowing to a final trunk solution at the bottom. Each network has a beautiful structure which is fully visible (once you tidy things up).

    If Nodebox allowed loops and branches the resulting structures would quickly devolve into a rat's nest. This is one reason why other attempts at a visual language have failed: they're just too messy. But Nodebox, with its consistent downward flow, is so clean and predictable that you can construct even vast networks with thousands of nodes and still find your way around months later.

    Thinking in this pure list-based functional way does take a little getting used to, but once you make the adjustment it starts to feel natural. Debugging in particular becomes much easier than it is in "normal" languages like Processing. I also find it easier and more intuitive than Excel for pure data-munging tasks like making the simplified Paris Street data CSV from your original file.

    So I hope you'll stick with it. You seem to have the knack.

    John

  17. 17 Posted by julien on 13 Sep, 2022 09:41 AM

    A little message to thank you for your reply. Unfortunately my real job kicked in over the last few days and I could not reopen Nodebox; I will test it as soon as I can.

    For this point :
    > I made that hard-wired node with the mathematical trickery because I wanted to show you that technique

    Yes it was clever and a good idea.
