17

I need to parse a decimal integer that appears at the start of a string.

There may be trailing garbage after the decimal number. It must be ignored, even if it contains other numbers.

e.g.

"1" => 1
" 42 " => 42
" 3 -.X.-" => 3
" 2 3 4 5" => 2

Is there a built-in method in the .NET framework to do this?

int.TryParse() is not suitable. It allows trailing spaces but not other trailing characters.

It would be quite easy to implement this but I would prefer to use the standard method if it exists.

5
  • I'm assuming you hate regular expressions, but I think that's the kind of problem they're meant to solve...
    – axel_c
    Commented Oct 13, 2009 at 16:17
  • Using a regular expression is fine. But if there's a built-in function that would be preferable.
    – finnw
    Commented Oct 13, 2009 at 16:32
  • Is a valid "integer" character always followed or only ever preceded by a space character?
    – ChrisBD
    Commented Oct 13, 2009 at 16:33
  • @ChrisBD, there are not necessarily any spaces at all. But the first non-space character is always a digit.
    – finnw
    Commented Oct 13, 2009 at 16:56
  • Possible duplicate of Find and extract a number from a string Commented Aug 8, 2016 at 6:42

11 Answers

29

You can use Linq to do this, no Regular Expressions needed:

public static int GetLeadingInt(string input)
{
   return Int32.Parse(new string(input.Trim().TakeWhile(c => char.IsDigit(c) || c == '.').ToArray()));
}

This works for all your provided examples:

string[] tests = new string[] {
   "1",
   " 42 ",
   " 3 -.X.-",
   " 2 3 4 5"
};

foreach (string test in tests)
{
   Console.WriteLine("Result: " + GetLeadingInt(test));
}
3
  • 3
    Why are you calling ToCharArray? String already implements IEnumerable<char>.
    – Jon Skeet
    Commented Oct 13, 2009 at 16:33
  • 1
    Nice solution. One question... is the || c == '.' actually needed? The examples don't show anything but integer results. If removed it would speed it up a bit which may be significant if there are many extractions. Commented Jun 29, 2017 at 15:03
  • 2
    This is pretty inefficient, creating at least four intermediate objects for an operation that can be done with zero. Commented Dec 21, 2018 at 18:55
17
foreach (var m in Regex.Matches(" 3 - .x. 4", @"\d+"))
{
    Console.WriteLine(m);
}

Updated per comments

Not sure why you don't like regular expressions, so I'll just post what I think is the shortest solution.

To get first int:

Match match = Regex.Match(" 3 - .x. - 4", @"\d+");
if (match.Success)
    Console.WriteLine(int.Parse(match.Value));
3
  • I only need the first number, so you could stick a 'break' in there.
    – finnw
    Commented Oct 13, 2009 at 16:40
  • @finnw: Was confused by the comment you made on another answer. To get the first value use the Regex.Match function, it can be seen in one of my Rollbacks. Commented Oct 13, 2009 at 16:42
  • @Yuriy, I was referring to multi-digit numbers (e.g. "42"), not multiple numbers in the string.
    – finnw
    Commented Oct 13, 2009 at 16:48
6

There's no standard .NET method for doing this - although I wouldn't be surprised to find that VB had something in the Microsoft.VisualBasic assembly (which is shipped with .NET, so it's not an issue to use it even from C#).

Will the result always be non-negative (which would make things easier)?

To be honest, regular expressions are the easiest option here, but...

public static string RemoveCruftFromNumber(string text)
{
    int end = 0;

    // First move past leading spaces
    while (end < text.Length && text[end] == ' ')
    {
        end++;
    }

    // Now move past digits
    while (end < text.Length && char.IsDigit(text[end]))
    {
        end++;
    }

    return text.Substring(0, end);
}

Then you just need to call int.TryParse on the result of RemoveCruftFromNumber (don't forget that the integer may be too big to store in an int).
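
As a rough sketch of the same scan combined with the overflow check, the digits can be accumulated directly instead of building a substring first. The class and method names below (LeadingIntParser, TryParseLeadingInt) are hypothetical and not part of .NET. The sketch assumes only ASCII digits count, only leading spaces are skipped, and the value is non-negative:

```csharp
using System;

public static class LeadingIntParser
{
    // Hypothetical helper (not a framework method): reads the run of ASCII
    // digits at the start of text, after optional leading spaces.
    // Returns false if there are no digits or the value exceeds int.MaxValue.
    public static bool TryParseLeadingInt(string text, out int value)
    {
        value = 0;
        if (text == null) return false;

        // Move past leading spaces, as in RemoveCruftFromNumber.
        int pos = 0;
        while (pos < text.Length && text[pos] == ' ') pos++;

        // Accumulate digits in a long so the overflow check itself cannot overflow.
        int start = pos;
        long acc = 0;
        while (pos < text.Length && text[pos] >= '0' && text[pos] <= '9')
        {
            acc = acc * 10 + (text[pos] - '0');
            if (acc > int.MaxValue) return false; // too big for an int
            pos++;
        }

        if (pos == start) return false; // no digits found
        value = (int)acc;
        return true;
    }
}
```

Whether an oversized number should be reported as a failure (as here) or handled some other way depends on the caller. This version simply returns false and leaves value at 0.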

9
  • The garbage is at the end of the string, not the start (I do not consider the leading space to be garbage, since the built-in functions like int.Parse can handle that.)
    – finnw
    Commented Oct 13, 2009 at 16:25
  • Okay, edited. (Was this the reason for the downvote? If not, I'd be interested to hear what it was for...)
    – Jon Skeet
    Commented Oct 13, 2009 at 16:34
  • "Was this the reason for the downvote? If not, I'd be interested to hear what it was for..." it's like Federer bitching about the ref telling him to be quiet. Commented Oct 13, 2009 at 16:51
  • 2
    @Yuriy: I'm afraid I don't understand your comment. I always like to hear why I'm being downvoted, so that I can improve my answer. @finnw: Yes, this answer could very easily be simplified to a regex - I didn't do so based on your expression of dislike for regexes in the question :) Let me know if you want me to put that in the answer.
    – Jon Skeet
    Commented Oct 13, 2009 at 17:07
  • 3
    Unless you edited your question, he may well have been unable to remove the downvote. The system is unfortunate that way sometimes.
    – Jon Skeet
    Commented Oct 13, 2009 at 18:24
5

I like @Donut's approach.

I'd like to add, though, that char.IsDigit and char.IsNumber also accept some Unicode characters that are digits in other languages and scripts (see here).
If you only want to check for the digits 0 to 9 you could use "0123456789".Contains(c).

Three example implementations:

To remove trailing non-digit characters:

var digits = new string(input.Trim().TakeWhile(c =>
    ("0123456789").Contains(c)
).ToArray());

To remove leading non-digit characters:

var digits = new string(input.Trim().SkipWhile(c =>
    !("0123456789").Contains(c)
).ToArray());

To remove all non-digit characters:

var digits = new string(input.Trim().Where(c =>
    ("0123456789").Contains(c)
).ToArray());

And of course: int.Parse(digits) or int.TryParse(digits, out output)
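
To make the difference concrete, here is a small sketch comparing char.IsDigit with a plain ASCII range check. AsciiDigits, IsAsciiDigit and LeadingDigits are hypothetical names chosen for illustration. The example character U+0663 (ARABIC-INDIC DIGIT THREE) is a Unicode decimal digit outside the 0 to 9 range:

```csharp
using System;
using System.Linq;

public static class AsciiDigits
{
    // Hypothetical helper: true only for the characters '0' through '9'.
    public static bool IsAsciiDigit(char c)
    {
        return c >= '0' && c <= '9';
    }

    // Same TakeWhile approach as above, restricted to ASCII digits.
    public static string LeadingDigits(string input)
    {
        return new string(input.Trim().TakeWhile(c => IsAsciiDigit(c)).ToArray());
    }
}
```

With these helpers, char.IsDigit('\u0663') is true while AsciiDigits.IsAsciiDigit('\u0663') is false, so LeadingDigits stops at the first character that is not one of 0 to 9.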

1
  • 1
    IMHO slightly more efficient to replace ("0123456789").Contains(c) with c >= '0' && c <= '9' Commented Dec 26, 2019 at 22:34
2

This doesn't really answer your question (about a built-in C# method), but you could try chopping off characters at the end of the input string one by one until int.TryParse() accepts it as a valid number:

for (int p = input.Length;  p > 0;  p--)
{
    int  num;
    if (int.TryParse(input.Substring(0, p), out num))
        return num;
}
throw new Exception("Malformed integer: " + input);

Of course, this will be slow if input is very long.

ADDENDUM (March 2016)

This could be made faster by chopping off all non-digit/non-space characters on the right before attempting each parse:

for (int p = input.Length;  p > 0;  p--)
{
    char  ch;
    do
    {
        ch = input[--p];
    } while ((ch < '0'  ||  ch > '9')  &&  ch != ' '  &&  p > 0);
    p++;

    int  num;
    if (int.TryParse(input.Substring(0, p), out num))
        return num;
}
throw new Exception("Malformed integer: " + input);
1
string s = " 3 -.X.-".Trim();
string collectedNumber = string.Empty;
int i;

for (int x = 0; x < s.Length; x++)
{
  // int.TryParse takes a string, so convert the char first
  if (int.TryParse(s[x].ToString(), out i))
     collectedNumber += s[x];
  else
     break;     // not a number - that's it - get out.
}

if (int.TryParse(collectedNumber, out i))
    Console.WriteLine(i);
else
    Console.WriteLine("no number found");
2
  • @finnw- then just throw another if statement inside the first one to iterate to the following position to check
    – TStamper
    Commented Oct 13, 2009 at 16:19
  • @finnw Ok, here is another iteration that handles multiple numbers Commented Oct 13, 2009 at 16:26
1

This is how I would have done it in Java:

int parseLeadingInt(String input)
{
    NumberFormat fmt = NumberFormat.getIntegerInstance();
    fmt.setGroupingUsed(false);
    return fmt.parse(input, new ParsePosition(0)).intValue();
}

I was hoping something similar would be possible in .NET.

This is the regex-based solution I am currently using:

int? parseLeadingInt(string input)
{
    int result = 0;
    Match match = Regex.Match(input, "^[ \t]*\\d+");
    if (match.Success && int.TryParse(match.Value, out result))
    {
        return result;
    }
    return null;
}
1

Similar to Donut's above but with a TryParse:

    private static bool TryGetLeadingInt(string input, out int output)
    {
        var trimmedString = new string(input.Trim().TakeWhile(c => char.IsDigit(c) || c == '.').ToArray());
        var canParse = int.TryParse( trimmedString, out output);
        return canParse;
    }
0

Might as well add mine too.

        string temp = " 3 .x£";
        string numbersOnly = String.Empty;
        int tempInt;
        for (int i = 0; i < temp.Length; i++)
        {
            if (Int32.TryParse(Convert.ToString(temp[i]), out tempInt))
            {
                numbersOnly += temp[i];
            }
        }

        Int32.TryParse(numbersOnly, out tempInt);
        MessageBox.Show(tempInt.ToString());

The message box is just for testing purposes, just delete it once you verify the method is working.

0

I'm not sure why you would avoid Regex in this situation.

Here's a little hackery that you can adjust to your needs.

" 3 -.X.-".ToCharArray().FindInteger().ToList().ForEach(Console.WriteLine);

public static class CharArrayExtensions
{
    public static IEnumerable<char> FindInteger(this IEnumerable<char> array)
    {
        foreach (var c in array)
        {
            if(char.IsNumber(c))
                yield return c;
        }
    }
}

EDIT: That's true about the incorrect result (and the maintenance dev :) ).

Here's a revision:

    public static int FindFirstInteger(this IEnumerable<char> array)
    {
        bool foundInteger = false;
        var ints = new List<char>();

        foreach (var c in array)
        {
            if(char.IsNumber(c))
            {
                foundInteger = true;
                ints.Add(c);
            }
            else
            {
                if(foundInteger)
                {
                    break;
                }
            }
        }

        string s = string.Empty;
        ints.ForEach(i => s += i.ToString());
        return int.Parse(s);
    }
2
  • 1
    That's pretty clever. Of course the maintenance dev will hate you. Commented Oct 13, 2009 at 16:31
  • That would give an incorrect result for numbers longer than 1 digit.
    – finnw
    Commented Oct 13, 2009 at 16:35
0
    private string GetInt(string s)
    {
        int i = 0;

        s = s.Trim();
        while (i<s.Length && char.IsDigit(s[i])) i++;

        return s.Substring(0, i);
    }
1
  • I guessed that also, but I wasn't aware it existed... anyway glad I learned something and took -1 in the figure ;)
    – manji
    Commented Oct 13, 2009 at 18:08
