Maximum Length String Trimmer (Plus Bug Report)

john

07 Mar, 2016 07:44 AM

This is a simple subnetwork that may come in handy on many projects. Just feed it a string and a maximum length. If the string is less than or equal to that length it will pass through unchanged. If it is longer, it will be trimmed to that length with three dots appended.

The reason I am sharing such a trivial subnetwork is that, as I learned the hard way, doing it yourself may produce unpleasant surprises.

The normal way of doing this would be to just use the substring node. But it turns out substring has two "undocumented features". If you feed it an empty string it will trigger a division by zero error. And if you set the max length to a number greater than the length of the string, the length will wrap. That is, if you use substring to get the first 17 chars of "Hello World" the result will be "Hell".

So my trim subnetwork tests the length of the string and compensates for empty strings so that you don't have to. A screenshot of trim is attached along with the network itself.
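
For readers without NodeBox at hand, the behavior described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the attached subnetwork itself; the names `trim`, `max_length` and `suffix` are made up for the example, and it assumes the three dots are added on top of the maximum length rather than counted inside it.

```python
def trim(text: str, max_length: int, suffix: str = "...") -> str:
    """Return text unchanged if it fits, else its first max_length characters plus suffix."""
    if max_length < 0:
        raise ValueError("max_length must be non-negative")
    # Empty strings and strings that already fit pass straight through,
    # so no substring operation is ever applied to them.
    if len(text) <= max_length:
        return text
    # Only strings that are really too long get cut (assumption: the
    # suffix is appended after the cut, so the result is max_length + 3 long).
    return text[:max_length] + suffix


print(trim("Hello World", 17))  # fits, passes through unchanged
print(trim("Hello World", 4))   # too long, cut and marked with dots
```

The length test comes first on purpose: it is what keeps both the empty string and the too-large maximum away from the substring step.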

NOTE TO NODEBOX STAFF

I've used substring functions in many languages over the years, and none have behaved this way. The wrapping behavior is especially insidious because it's easy to overlook. The output looks innocent enough at first glance but will trim some strings beyond recognition, seemingly at random.

So my bug report / feature request is that you improve substring so that:

1. Empty strings pass through without triggering an error
2. If the end value exceeds the string length, substring returns characters only up to the end of the string
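
As a rough sketch of what those two requests amount to, here is a hypothetical Python helper. The name `clamped_substring` and the 0-based, end-exclusive offsets are assumptions made for the example; they are not how the NodeBox node is defined.

```python
def clamped_substring(text: str, start: int, end: int) -> str:
    """Characters from start (inclusive) to end (exclusive), 0-based, clamped to the string."""
    length = len(text)
    # Clamp rather than wrap: offsets outside the string are pulled
    # back to the nearest valid position.
    start = min(max(start, 0), length)
    end = min(max(end, start), length)
    # For an empty string both offsets become 0 and the result is "",
    # with no division or modulo involved.
    return text[start:end]


print(clamped_substring("Hello World", 0, 17))  # whole string, no wrapping
print(clamped_substring("", 0, 5))              # empty string, no error
```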

Thanks!

John

  1. Support Staff 1 Posted by Frederik De Bleser on 07 Mar, 2016 07:39 PM

    OK that substring behavior is absolutely horrible!

    I've checked and the code was actually provided by an outside contributor (who shall remain unnamed :-). I think it is better if we simplify this code so it doesn't try to do any "magic".

    What I really don't like is that indexing starts at 1. However, I'm afraid it's too late to change that...

  2. Support Staff 2 Posted by Frederik De Bleser on 07 Mar, 2016 09:05 PM

    John, do you actually use the subString node often? And do you use the end offset parameter? I didn't find any projects that use this, so I wouldn't mind changing the behavior.

    My idea to clean up this mess:

    • Make indexing start at 0 (same for characterAt node if possible)
    • Get rid of wrapping
    • If the end offset is too big, just clamp
    • If start and end offset are negative, indexing starts at the end (like Python) (this should work in characterAt as well)
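
The four points above could look roughly like this in Python. This is a sketch of the proposal as read from the list, not the actual NodeBox implementation: the function names are illustrative, a negative value is accepted for either offset independently, and returning an empty string for an out-of-range `character_at` index is an assumption.

```python
def substring(text: str, start: int, end: int) -> str:
    """0-based, end-exclusive substring with Python-style negative offsets."""
    length = len(text)
    # Negative offsets count back from the end of the string.
    if start < 0:
        start += length
    if end < 0:
        end += length
    # Clamp into [0, length] instead of wrapping around.
    start = min(max(start, 0), length)
    end = min(max(end, 0), length)
    # If start ends up at or past end, the slice is simply empty.
    return text[start:end]


def character_at(text: str, index: int) -> str:
    """0-based character lookup; negative indexes count from the end."""
    if index < 0:
        index += len(text)
    if 0 <= index < len(text):
        return text[index]
    # Assumption: out-of-range lookups yield "" rather than an error.
    return ""


print(substring("Hello World", 0, 17))   # end offset clamped to the string length
print(substring("Hello World", -5, 11))  # negative start counts from the end
print(character_at("Hello", -1))         # last character
```
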

  3. Support Staff 3 Posted by john on 08 Mar, 2016 05:21 AM

    Frederik,

    I don't use it very often. I occasionally use it to truncate long strings, or to reverse names (from "last, first" to "first last"). I found 12 occurrences in one large network, but that was the exception, not the rule.

    So personally I would be willing to adjust my past code for the sake of 0-index purity.

    As for the end offset parameter, I say toss it. I can see the temptation, but it seems inelegant to add an extra parameter just to add 1 to the endpoint. I prefer simple, consistent nodes to nodes that sometimes do one thing, sometimes another based on an almost hidden setting - even if that means I have to insert an extra "add 1" or "subtract 1" node now and then.

    I like your plan (though I would appreciate documentation, especially for the negative offsets).

    Full speed ahead!

    John

Already uploaded files

  • trim_subnetwork.png 40.1 KB
  • max_length_string_trim.ndbx 3.28 KB
