pleroma.debian.social

"$(printf '%s' "$1" | sed -E 's/^0+//')"

death to 0XX octals, stupidest idea ever

@navi 😭 I love them. But you can also do this directly in shell, without external commands, if performance matters.

@dalias i dislike the inconsistency (and this kind of issue), we have 0x, so 0o would've been more natural (and it's what c is going for now iirc)

i do accept suggestions of a posix shell-native way to do that!

for context i'm working with a file path structure that goes `$subdir/%03d` and often need to hold the inner id in a variable

@navi "${1##+(0)}" in ksh, or in bash with shopt -s extglob

in sh I generally would use a loop rather than forking thrice and spawning not so portable things, have more ideas tho

@dalias @navi will gladly offer many suggestions, but preferably in an hour or so, not from the phone, k?

@navi ${x#0} should peel one zero. Repeating this N times even for decently large N is cheaper than a fork+exec. There's probably a better way too.

If you know it's 3 digits for example, $((1$x-1000)) works.
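(A minimal sketch of the arithmetic trick above, assuming the value is exactly three decimal digits. The variable names are illustrative, not from the thread. Prefixing a 1 means the literal no longer starts with 0, so the shell reads it as decimal; subtracting 1000 then removes the extra digit.)

```shell
x=089                  # as plain $((x)) this is typically rejected as invalid octal
n=$((1$x - 1000))      # expands to $((1089 - 1000))
printf '%s\n' "$n"
```

This only holds for exactly three digits; a two- or four-digit input would need a different constant.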

@dalias @navi oh yes, done the latter too

got more, later

@dalias from the structured data it’ll always be 3 digits, but user input not always

looking around, i’ve just found out "${x#${x%%[!0]*}}" which by what i can tell should be portable
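(One way to read the nested expansion above, split into two steps. The intermediate variable names `zeros` and `rest` are hypothetical and only there for illustration.)

```shell
x=00042
# %% removes the longest suffix matching [!0]*, i.e. everything from the
# first non-zero character onwards, which leaves only the leading zeros.
zeros=${x%%[!0]*}
# # then removes that run of zeros from the front of the original value.
rest=${x#"$zeros"}
printf 'zeros=%s rest=%s\n' "$zeros" "$rest"
```

Quoting `"$zeros"` makes the shell treat it as a literal string rather than a pattern, which matters once the input may contain characters other than digits.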

@mirabilos posix shell here, though nothing there is supposed to be unportable (note that posix.2024 does specify -E for sed)

however, https://social.vlhl.dev/notice/B5pmpbtpl2X7xTZylM

@navi @dalias sh is such an elegant and readable language

@ska @dalias i consider this a bit of a hack mostly

for readability, ksh/bash’s extglob fares a lot better, ${x##+(0)} but extglobs are not posix and i don’t have the brain capacity rn to think if they should be or not :V

@navi @ska They shouldn't be. They're a very different language strength and one that doesn't admit efficient matching without precompiling like regex.

@navi @dalias if it’s numeric, otherwise you probably want "${x#"${x%%[!0]*}"}". I’ve used this without nesting (i.e. with an intermediate variable) in the past but cannot tell you why… perhaps I needed the cut-off part in most cases anyway?

Anyway, here are more:

If you’re doing arithmetic on it anyway, i.e. know it fits the range (2³² in mksh, C long in lksh and other POSIX shells), and you know it’s a number, you can probably get the best performance on any kind of ksh-ish by doing:

if test -n "$KSH_VERSION"; then
        x=$((10#$x))
else
        # whatever you do for sh
fi

(Or, actually, define separate functions, if this is called often.)

So far, all the stripping examples (the sed one, the sh parameter expansions and the extglob one) are unsafe if the value is literally 0: they reduce it to the empty string. I assume you want to keep 0 as a valid value.

The loop I mentioned and what dalias likely meant could be:

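# assumes $x is a non-empty string of digits; anything else never
# matches the first branch and the loop would not terminate
while :; do
        case $x in
        ([1-9]*|0) break ;;     # no leading zero left, or exactly "0"
        (*) x=${x#0} ;;         # peel off one leading zero
        esac
done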

Forking thrice and likely execing twice (printf is often not a builtin) and being unportable (printf only arrived recently in many OSes, and the addition of -E to sed is even more insanely recent) isn’t nice. I can understand you’d not want ksh extglobs. That leaves us with… several variants of this, one somewhat tuned one could be (optionally remove the second line):

case $x in
([1-9]*) ;;
(*[1-9]*) x=${x#"${x%%[!0]*}"} ;;
(*) x=0 ;;
esac

This one has the assumption you know it’s a number. If not, skip the ksh part above (it’s also only safe for known numbers) and test for that first, which herein can be done elegantly: (again, optionally remove the now-third line)

case $x in
(*[!0-9]*) die NaN ;;
([1-9]*) ;;
(*[1-9]*) x=${x#${x%%[!0]*}} ;;
(*) x=0 ;;
esac

(Since we test for weird chars, can drop the double quotes again.)
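(A minimal sketch of how the validating variant above might be exercised. `die` is not defined anywhere in the thread, so the helper below is an assumption, as is the wrapper name `check_number`. Running it inside a command substitution keeps the `exit` from ending the calling shell.)

```shell
# die: hypothetical helper, prints its arguments to stderr and exits
die() {
        printf '%s\n' "$*" >&2
        exit 1
}

# check_number: hypothetical wrapper around the case statement above
check_number() {
        x=$1
        case $x in
        (*[!0-9]*) die NaN ;;
        ([1-9]*) ;;
        (*[1-9]*) x=${x#${x%%[!0]*}} ;;
        (*) x=0 ;;
        esac
        printf '%s\n' "$x"
}

a=$(check_number 0042)
b=$(check_number 000)
c=$(check_number '')
if d=$(check_number 12a 2>/dev/null); then verdict=accepted; else verdict=rejected; fi
printf '%s %s %s %s\n' "$a" "$b" "$c" "$verdict"
```

Note that an empty input contains no non-digit, so it falls through to the last branch and becomes 0; whether that is wanted depends on the caller.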

A variation on this (safe for not-numbers but passing them through):

x=${x#"${x%%[!0]*}"}
x=${x:-0}
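(A quick sketch running those two lines over a handful of sample inputs, including the all-zero, non-numeric and empty cases. The sample values and the `results` accumulator are illustrative only.)

```shell
results=
for x in 007 000 0 42 0a7 ''; do
        x=${x#"${x%%[!0]*}"}    # strip the leading run of zeros
        x=${x:-0}               # an all-zero or empty input becomes 0
        results="${results}${results:+ }$x"
done
printf '%s\n' "$results"
```

The non-number `0a7` comes out as `a7`: it is passed through with its leading zero stripped rather than rejected.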

If you’re going to put that into a function, however, do make it…

if test -n "$KSH_VERSION$BASH_VERSION"; then
        test -z "$BASH_VERSION" || shopt -s extglob
        unfuck_number() {
                x=${x##+(0)}
                x=${x:-0}
        }
else
        unfuck_number() {
                x=${x#"${x%%[!0]*}"}
                x=${x:-0}
        }
fi

… instead, though; the extglob is that much faster internally. (And, yes, you’ll normally want to just enable it for GNU bash; it’ll be a nop if you stick to POSIX constructs anyway, and it allows you to provide ksh-bash-subset optimised functions where you want.)

In this case, you’ll likely want something closer to…

unfuck_number() {
        local unfuck_number_tmp
        eval "unfuck_number_tmp=\$$1"
        unfuck_number_tmp=${unfuck_number_tmp#"${unfuck_number_tmp%%[!0]*}"}
        eval "$1=\${unfuck_number_tmp:-0}"
}

… (takes an inout variable name) or…

unfuck_number() {
        local unfuck_number_tmp=${2#"${2%%[!0]*}"}
        eval "$1=\${unfuck_number_tmp:-0}"
}

… (takes an out variable name and an in value). Equivalent for the extglob variant, of course, though note ksh93 has no local but adding…

        [[ $KSH_VERSION = Version* ]] && alias local=typeset

… in the ksh/bash block will do; this excludes ksh86 and ksh88, for which the tests are a bit more complex (see mircvs://contrib/code/Snippets/getshver if you want to add one for at least ksh88, which is an sh on many old systems).
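(A usage sketch for the out/in shape above, tied to the `$subdir/%03d` layout mentioned earlier in the thread. The function name `strip_zeros`, the temporary variable and the example path are assumptions for illustration. This version avoids `local`, which POSIX does not specify, at the cost of a global temporary, and it trusts that the first argument is a valid variable name, since it goes through `eval`.)

```shell
# strip_zeros OUTVAR VALUE (hypothetical name)
strip_zeros() {
        strip_zeros_tmp=${2#"${2%%[!0]*}"}
        eval "$1=\${strip_zeros_tmp:-0}"
}

subdir=/tmp/example             # assumed directory, never created here
strip_zeros id 007              # id now holds the plain decimal value
path=$(printf '%s/%03d' "$subdir" "$id")
printf '%s -> %s\n' "$id" "$path"
```

Stripping first matters because printf's `%d` may itself read a leading 0 as octal, so feeding it something like 010 directly would not round-trip.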

IFS hacks likely aren’t useful here.

Hmmh, already getting tired (loooooong day, bomb defusing, train chaos, choir, extreme train chaos…) so this is all I can think of for now.

@mirabilos
Sometimes I forget that you're the author of a shell, and then you post one of these and then I go 👀😶 and I'm reminded again.

No I didn't read the full thing. Yes I did appreciate it 😂
@navi @dalias