I'm having trouble understanding string pattern matching with =~ in bash.
I wrote the following function (don't be alarmed: it's just an experiment, not a security approach with md5sum):
md5 () {
[[ "$(md5sum $1)" =~ $2* ]] && echo fine || echo baarr;
}
and tested it with some input. Here is the reference output:
md5sum wp.laenderliste
b1eb0d822e8d841249e3d68eeb3068d3  wp.laenderliste
Comparing is unnecessarily hard if the source of the checksum does not already contain the two blanks and the filename. That is where the experiment originated, but more interesting than the many ways to solve that problem was the following observation:
I define a control variable and test my function with strings that are too short but still match:
ok=b1eb0d822e8d841249e3d68eeb3068d3
for i in {29..32}; do md5 wp.laenderliste ${ok:1:$i} ;done
fine
fine
fine
fine
That's expected and fine, since the purpose of the function is to ignore the missing "  wp.laenderliste" suffix, and therefore even longer missing tails.
Now, if I append random characters that do not match, I of course expect failures, and get them:
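For reference, here is a small sketch that prints the arguments this loop actually passes to the function. It only uses the variable ok from above and bash's substring expansion ${var:offset:length}. Note that the offset 1 drops the first character of the checksum, and that a length of 32 is longer than what remains, so the last two iterations pass the same string.

```bash
#!/usr/bin/env bash
# Print the second argument that each loop iteration hands to md5 ().
ok=b1eb0d822e8d841249e3d68eeb3068d3

for i in {29..32}; do
    # ${ok:1:$i}: start at offset 1 (skipping the leading "b"), take up to $i characters
    printf '%2d chars: %s\n' "${#ok}" "${ok:1:$i}" | sed "s/^32/$i/"
done
```

So none of the strings that print "fine" is a true prefix of the md5sum output; each one is a substring that starts at the second character.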
for i in {29..32}; do md5 wp.laenderliste ${ok:1:$i}GU ;done
baarr
baarr
baarr
baarr
As expected. But when only a single trailing character mismatches, see what happens:
for i in {29..32}; do md5 wp.laenderliste ${ok:1:$i}G ;done
fine
fine
fine
fine
Is this me not realizing how this is supposed to work ("select is broken"), or is there really an off-by-one error in bash's pattern matching?
Mismatches in the middle of the string matter from the first one on:
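The behaviour can be reproduced without md5sum. The following is a reduced sketch; the helper name check and the sample string are made up for illustration and are not part of my real setup. It uses the same test as md5 () on a fixed string. If the right-hand side of =~ is read as a regular expression rather than as a glob, the trailing * would apply only to the last character of $2, and the results below are at least consistent with that reading.

```bash
#!/usr/bin/env bash
# Reduced test case: the same [[ ... =~ $2* ]] test as in md5 (), on a fixed string.
# "check" and the sample string are hypothetical, for illustration only.
check () {
    [[ "$1" =~ $2* ]] && echo fine || echo baarr
}

s="abcdef  file"
check "$s" abcdef      # exact prefix                      -> fine
check "$s" abcdefG     # one trailing mismatch             -> fine
check "$s" abcdefGU    # two trailing mismatches           -> baarr
check "$s" bcd         # substring that is not a prefix    -> fine
check "$s" ab_d        # mismatch in the middle            -> baarr
```

The pattern is the same as with the real checksum: a single extra trailing character is tolerated, two are not, and a substring that does not start at the beginning still matches.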
for i in 5 9 e ; do echo md5 wp.laenderliste ${ok//$i/_} ;done
md5 wp.laenderliste b1eb0d822e8d841249e3d68eeb3068d3
md5 wp.laenderliste b1eb0d822e8d84124_e3d68eeb3068d3
md5 wp.laenderliste b1_b0d822_8d841249_3d68__b3068d3
for i in 5 9 e ; do md5 wp.laenderliste ${ok//$i/_} ;done
fine
baarr
baarr
The bash version:
bash -version
GNU bash, version 4.3.48(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
Disclaimer: md5sum is only useful against unintentional mistakes, not against attacks. I don't encourage using it.
And this question is not a search for better solutions or workarounds. It's about the =~ operator: whether it should act as it does and, if so, why.