Timeline for sed's greedy match shouldn't match this string, but does
Current License: CC BY-SA 4.0
Post Revisions
14 events
| when | what | by | license | comment | |
|---|---|---|---|---|---|
| Mar 8, 2023 at 0:21 | history | became hot network question | |||
| Mar 7, 2023 at 20:00 | vote | accept | user2153235 | ||
| Mar 7, 2023 at 19:59 | comment | added | user2153235 | I read steeldriver's two links, and Ed Morton's link. There's a lot more to regex processing than I imagined. I thought that it tries to fulfill each group from left to right, and was not aware of backtracking. Thank you for that. I may never fully get all the ways that a regex engine may try to satisfy the expression, but I know it's a lot more capable than I was assuming. | |
| Mar 7, 2023 at 17:46 | comment | added | Ed Morton | For the OP - you may find this article useful as it describes at a high level how BRE and ERE (i.e. non-greedy) leftmost-longest matching works: boost.org/doc/libs/1_34_0/libs/regex/doc/…. | |
| Mar 7, 2023 at 17:22 | comment | added | terdon♦ | @EdMorton yes, that's how I answered, explaining that it's the entire regex that needs to match, not just a part of it. | |
| Mar 7, 2023 at 17:21 | history | edited | Ed Morton | CC BY-SA 4.0 | deleted 3 characters in body |
| Mar 7, 2023 at 17:15 | comment | added | Ed Morton | @terdon that's what I thought at first too, but now I think the OP is confused that the regexp `.+\t` can match `anything\t`, because they think that `.+` should match the whole line, so how could it match `\t` in the input when in their mind the `\t` should be consumed by the `.+`? All the stuff about "alphanumeric" in the question is IMHO just a red herring and it's just "how can `.+X` match `fooX` when `.+` alone matches `fooX`?" | |
| Mar 7, 2023 at 17:00 | comment | added | terdon♦ | I tried to explain in my answer but I think I might not quite understand what is confusing you. Could you clarify why "alphanumeric" is relevant here? Did you think that `.` only matches alphanumeric characters? | |
| Mar 7, 2023 at 16:58 | answer | added | terdon♦ | timeline score: 4 | |
| Mar 7, 2023 at 16:54 | comment | added | terdon♦ | No, it is entirely intended behavior. | |
| Mar 7, 2023 at 16:46 | comment | added | user2153235 | So it's not intended behaviour according to the documentation? That means it's hard to use the documented intended behaviour to craft regular expressions. | |
| Mar 7, 2023 at 16:42 | comment | added | steeldriver | It does this by repeated backtracking I think - see for example In regular expressions, what is a backtracking / back referencing? and also Runaway Regular Expressions: Catastrophic Backtracking | |
| Mar 7, 2023 at 16:33 | history | edited | terdon♦ | CC BY-SA 4.0 | added 1 character in body; edited tags |
| Mar 7, 2023 at 16:19 | history | asked | user2153235 | CC BY-SA 4.0 |
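
The point raised in the comments (how `.+\t` can match when `.+` alone would swallow the tab) can be illustrated with a minimal sketch. It uses Python's `re` module, which is a backtracking engine rather than sed's POSIX BRE/ERE matcher, so it is only an analogy; the sample line `foo\tbar\tbaz` is a made-up input, not the one from the question. For this simple pattern the overall match should be the same in both kinds of engine.

```python
import re

# Hypothetical sample input: three fields separated by tabs.
line = "foo\tbar\tbaz"

# On its own, greedy .+ consumes the whole line, tabs included,
# because . matches any character except a newline.
whole = re.match(r".+", line)

# With a trailing \t, the entire pattern has to match. The engine lets .+
# take as much as it can, then gives characters back until \t can match,
# so the overall match ends at the last tab on the line.
up_to_tab = re.match(r".+\t", line)

print(repr(whole.group(0)))
print(repr(up_to_tab.group(0)))
```

The takeaway matches terdon's comment: it is the entire regex that needs to match, not each part in isolation, so `.+` ends up covering less than it would on its own.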