Tuesday, August 18, 2009

Chopping with BASH

We can use a form of variable expansion to select a specific substring, based on a specific character offset and length. Try typing in the following lines under bash:

$ EXCLAIM=cowabunga
$ echo ${EXCLAIM:0:3}
$ echo ${EXCLAIM:3:7}

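This form of string chopping can come in quite handy; simply specify the offset of the character to start from (counting from zero) and the length of the substring, all separated by colons. If the requested length runs past the end of the string, as in the second example, bash simply returns whatever is left.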
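
As a small sketch of the same expansion inside a script (the variable names first, rest and len are chosen here purely for illustration), leaving out the length returns everything from the offset to the end of the string, and ${#EXCLAIM} gives the string's length:

```shell
#!/bin/bash
# Substring expansion: ${VAR:offset:length}; offsets are counted from zero.
EXCLAIM=cowabunga

first=${EXCLAIM:0:3}    # 3 characters starting at offset 0
rest=${EXCLAIM:3}       # no length given: everything from offset 3 onward
len=${#EXCLAIM}         # ${#VAR} expands to the length of the string

echo "$first"   # cow
echo "$rest"    # abunga
echo "$len"     # 9
```

Note that this offset-and-length form is a bash feature; a plain POSIX sh may reject it.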

$ MYVAR=foodforthought.jpg
$ echo ${MYVAR##*fo}
$ echo ${MYVAR#*fo}

In the first example, we typed ${MYVAR##*fo}. What exactly does this mean? Basically, inside the ${ }, we typed the name of the variable, two #s, and a wildcard ("*fo"). Then, bash took MYVAR, found the longest substring from the beginning of the string "foodforthought.jpg" that matched the wildcard "*fo" (here, "foodfo"), chopped it off the beginning of the string, and returned "rthought.jpg".

The second form of variable expansion shown above appears identical to the first, except it uses only one "#" -- and bash performs an almost identical process. It checks the same set of substrings as our first example did, except that bash removes the shortest match from the beginning of our original string, and returns the result. So, as soon as it finds that the "fo" substring matches, it removes "fo" from our string and returns "odforthought.jpg".
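
A common use of these two forms is picking apart a path. The sketch below is only illustrative; the path and the variable names are made up for the example:

```shell
#!/bin/bash
# Hypothetical path, used only for illustration.
FILEPATH=/usr/local/share/notes.txt

# ## removes the longest match of "*/" from the front, leaving the file name.
name=${FILEPATH##*/}
# A single # removes the shortest match of "*/": just the leading "/".
tail=${FILEPATH#*/}

echo "$name"   # notes.txt
echo "$tail"   # usr/local/share/notes.txt
```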

$ MYFOO="chickensoup.tar.gz"
$ echo ${MYFOO%%.*}
chickensoup
$ echo ${MYFOO%.*}
chickensoup.tar

As you can see, the % and %% variable expansion options work identically to # and ##, except they remove the text matching the wildcard from the end of the string. Note that you don't have to use the "*" character if you wish to remove a specific substring from the end:

$ MYFOOD="chickensoup"
$ echo ${MYFOOD%%soup}

In this example, it doesn't matter whether we use "%%" or "%", since only one match is possible.
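
To tie the four operators together, here is a rough sketch that splits a file name into pieces. The archive name and the variable names are assumptions made up for the example:

```shell
#!/bin/bash
# Hypothetical archive name, used only for illustration.
ARCHIVE=backup.2009.tar.gz

base=${ARCHIVE%%.*}   # longest match of ".*" from the end  -> backup
stem=${ARCHIVE%.*}    # shortest match of ".*" from the end -> backup.2009.tar
ext=${ARCHIVE##*.}    # longest match of "*." from the front -> gz

echo "$base" "$stem" "$ext"
```

A handy way to remember which is which: on a US keyboard "#" sits to the left of "$" and "%" to the right, so "#" trims from the left (the beginning) and "%" trims from the right (the end).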
