Bash Snippet: HTML &#code; decoder

A short and simple way to decode HTML decimal (and hex) character codes in bash.

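html_decode () {
    local html_encoded="$1" html_dec html_hex
    # Turn each "&#" into a space so the codes can be split into words.
    html_encoded=${html_encoded//&#/ }
    # Intentionally unquoted: word splitting yields one array element per code.
    html_encoded=($html_encoded)
    for html_dec in "${html_encoded[@]}"
    do
        html_dec="${html_dec//X/x}"   # accept &#X41; as well as &#x41;
        html_dec="${html_dec//;/}"    # drop the trailing semicolon
        if [ "${html_dec:0:1}" == "x" ]; then
            # Hex code: strip the leading "x" and use the digits as-is.
            html_hex=${html_dec:1}
        else
            # Decimal code: convert to two-digit hex.
            html_hex=$(printf "%02X" "$html_dec")
        fi
        echo -en "\x$html_hex"
    done
    echo ""
}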

html_decode "&#73;&#100;&#97;&#104;&#111;"
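
The function above expects a string made up entirely of numeric codes. For text where codes are mixed in with ordinary characters, something along these lines might work. This is only a rough sketch: the name html_decode_mixed is made up for illustration, it only touches code points 1 through 255 (emitting them as single bytes, like the \x trick above), and it leaves everything else, including named entities such as &amp;, untouched.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the original snippet.
# Decodes &#NNN; and &#xHH; codes embedded in ordinary text.
html_decode_mixed () {
    local input="$1" output="" match code hex char
    local re='&#([xX][0-9A-Fa-f]+|[0-9]+);'
    while [[ $input =~ $re ]]; do
        match=${BASH_REMATCH[0]}
        code=${BASH_REMATCH[1]}
        # Copy the text before the leftmost code through unchanged.
        output+=${input%%"$match"*}
        input=${input#*"$match"}
        if [[ $code == [xX]* ]]; then
            code=$((16#${code:1}))     # hex code
        else
            code=$((10#$code))         # decimal code (10# avoids octal parsing)
        fi
        if (( code >= 1 && code <= 255 )); then
            printf -v hex '%02X' "$code"
            printf -v char '%b' "\\x$hex"
            output+=$char
        else
            # Assumption: anything outside 1..255 is left as written.
            output+=$match
        fi
    done
    printf '%s\n' "$output$input"
}
```

Calling html_decode_mixed 'Hello &#73;daho &amp; &#x42;ash' prints "Hello Idaho &amp; Bash".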

On a side note, I’m totally bummed that the builtin ‘printf’ already does decimal-to-hexadecimal conversion, negating the need for my much uglier solution:

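dec2hex () {
    # Keep the variables local so repeated calls do not build on an old $hex.
    local num="$1" hex=""
    local base16=(0 1 2 3 4 5 6 7 8 9 A B C D E F)
    # Peel off the lowest hex digit each pass and prepend it to the result.
    while [ "$num" -gt 0 ]
    do
        hex=${base16[$((num % 16))]}$hex
        num=$((num / 16))
    done
    # The loop never runs for 0, so print "0" rather than an empty line.
    echo "${hex:-0}"
}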
user@host:~$ dec2hex 48879
BEEF
user@host:~$ echo $((16#BEEF))
48879
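
For comparison, here is roughly what the builtin route looks like. The wrapper names dec_to_hex and hex_to_dec are made up for this sketch; printf's %X conversion and the base#number arithmetic syntax do the actual work.

```shell
#!/usr/bin/env bash
# Hypothetical wrappers around Bash builtins, for illustration only.
dec_to_hex () {
    # %X formats an integer as uppercase hexadecimal.
    printf '%X\n' "$1"
}

hex_to_dec () {
    # 16#... tells arithmetic expansion to read the digits as base 16.
    echo "$((16#$1))"
}

dec_to_hex 48879   # prints BEEF
hex_to_dec BEEF    # prints 48879
```

Unlike the hand-rolled loop, printf also handles 0 correctly without a special case.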