3

Hi, I'm not very comfortable with regular expressions. What I would like to achieve is to extract a numeric substring (digits 0-9 only) from the input string.

  • The numeric string that is searched should be preceded only by a semicolon (;), space ( ) or should be placed exactly at the beginning of the input (not line).
  • The numeric string that is searched should be followed only by a semicolon (;), the end of line or the end of the input string.

Example input:

;x; ;SrvId=3993;ad257c823; 435223;

Output:

435223

I tried [ \A|;|[ ]]\d*[\r|;|\Z], but it did not work; it did not even compile.

5 Answers

2

Try this one:

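using System;
using System.Text.RegularExpressions;

string resultString = null;
try {
    // Group 1 holds the digits; its Value is an empty string when nothing matches.
    resultString = Regex.Match(subjectString, @"(?<=\A|\s+|;)(\d+)(?=$|;|\Z)").Groups[1].Value;
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}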

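Breakdown:

(?<=\A|\s+|;)

Positive lookbehind: start of input, or at least one whitespace character, or a semicolon.

(\d+) at least one digit, captured in group 1

(?=$|;|\Z)

Positive lookahead: end of line, or a semicolon, or end of input. Without RegexOptions.Multiline, $ only matches at the end of the input or before a final newline.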

Input : ;x; ;SrvId=3993;ad257c823; 435223;

Output of group 1 : 435223
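
For anyone who wants to try this outside a larger program, here is a minimal sketch that wraps the same lookarounds in a helper. The names NumberExtractor and FirstNumber are made up for illustration, and the capturing group is dropped because the lookarounds consume nothing, so the whole match is already the digit run.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NumberExtractor
{
    // Same lookbehind and lookahead as in the answer, without the capturing group.
    private static readonly Regex Pattern =
        new Regex(@"(?<=\A|\s+|;)\d+(?=$|;|\Z)");

    // Returns the first matching digit run, or null when there is none.
    public static string FirstNumber(string input)
    {
        Match m = Pattern.Match(input);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        // Example input from the question.
        Console.WriteLine(FirstNumber(";x; ;SrvId=3993;ad257c823; 435223;")); // 435223
    }
}
```

Returning null instead of an empty string is only a design choice here; it makes "no match" easy to tell apart from a successful match.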

5
  • @HasanKhan Can you provide an input which makes it to match where it shouldn't match?
    – FailedDev
    Commented Oct 12, 2011 at 11:41
  • @FailedDev It doesn't match the number at the end of a line when there is more text on a new line.
    Commented Oct 13, 2011 at 4:31
  • @HasanKhan I don't see any new line in OP examples. Do you?
    – FailedDev
    Commented Oct 13, 2011 at 6:44
  • @FailedDev "The numeric string that is searched should be followed only by a semicolon (;), the end of line or the end of the input string."
    Commented Oct 13, 2011 at 6:47
  • @HasanKhan You can write whatever bold letters you want; however, this still doesn't change the fact that there is no new line in the OP's post :) Also, if my English doesn't deceive me, the end of line should be the last part of the string. OP doesn't mention the possibility of multiple new lines. So you are wrong - again :D
    – FailedDev
    Commented Oct 13, 2011 at 6:59
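
If multi-line input does turn out to matter, the case discussed in these comments can be reproduced with a short sketch. The LineAware pattern below is a hypothetical variant, not something posted in this thread: it assumes RegexOptions.Multiline (so that $ matches before every '\n') and adds \r? to tolerate Windows line endings, while \A still means the start of the whole input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineAwareExtractor
{
    // Pattern from the answer, with default options.
    public static readonly Regex Original =
        new Regex(@"(?<=\A|\s+|;)(\d+)(?=$|;|\Z)");

    // Hypothetical variant: Multiline lets $ match before every '\n',
    // and \r? allows for "\r\n" line endings.
    public static readonly Regex LineAware =
        new Regex(@"(?<=\A|[ ;])(\d+)(?=;|\r?$)", RegexOptions.Multiline);

    // Returns group 1 of the first match, or "" when nothing matches.
    public static string First(Regex regex, string input)
    {
        return regex.Match(input).Groups[1].Value;
    }

    public static void Main()
    {
        string input = "id 435223\nmore text";
        Console.WriteLine("[" + First(Original, input) + "]");  // []
        Console.WriteLine("[" + First(LineAware, input) + "]"); // [435223]
    }
}
```

Whether this variant is needed depends on whether the real input can contain more than one line, which the question does not say.
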
1

Try this regex:

^(?:[; ]?)(?:.*?)([0-9]+);$
1

Using ^.*[ ;](\d+)[;\n]?$ will capture the numbers you're interested in although you may have to change the \n to \r\n depending on the line endings of your input file.

0

The regular expression should be this:

"[; ]{1}[0-9]+($|[^0-9]+)"
1
  • This returns 435223; which includes the semicolon.
    Commented Oct 12, 2011 at 9:05
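
One possible way around the extra delimiters, sketched under the assumption that only the digits are wanted: put parentheses around [0-9]+ and read group 1 instead of the whole match. GroupDemo and Find are illustrative names only. Note that this expression still does not accept a number at the very start of the input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Same expression as in this answer, with a capturing group around the digits.
    public static Match Find(string input)
    {
        return Regex.Match(input, @"[; ]([0-9]+)($|[^0-9]+)");
    }

    public static void Main()
    {
        Match m = Find(";x; ;SrvId=3993;ad257c823; 435223;");
        Console.WriteLine("[" + m.Value + "]");           // [ 435223;]
        Console.WriteLine("[" + m.Groups[1].Value + "]"); // [435223]
    }
}
```
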
0

Try using this expression

(\d+\.?\d*|\.\d+)
