added 1 character in body
Source Link
chux
  • 36.5k
  • 2
  • 43
  • 97

No need to test for both ch != 0 and isalpha(ch) in the loop: isalpha('\0') is false, so the second test covers the first. Dropping the redundant test can readily make for a faster loop.

Since char may be signed, calling isalpha(ch) is undefined behavior (UB) when ch < 0 && ch != EOF. The is...(ch) functions are specified to work only when ch is in the range [0...UCHAR_MAX] or equals EOF (C2X draft § 7.4 ¶1).
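
A minimal sketch of both points, assuming the goal is to measure the leading run of letters in a C string. The helper name leading_alpha_len is hypothetical and not from the original post. Reading the string through an unsigned char pointer keeps every value passed to isalpha() in range, and the single isalpha() test also stops at the null character.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: length of the leading run of alphabetic characters. */
size_t leading_alpha_len(const char *s) {
    /* Access the bytes as unsigned char so a negative char never reaches isalpha(). */
    const unsigned char *us = (const unsigned char *)s;
    size_t n = 0;
    /* One test per iteration: isalpha(0) is false, so the terminator ends the loop. */
    while (isalpha(us[n])) {
        n++;
    }
    return n;
}
```

For example, leading_alpha_len("abc123") stops at the first digit, and an empty string yields 0 without any separate null check.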

added 131 characters in body

@user7649's suggestion to stop once a non-alpha is found is a good idea.

Avoid undefined behavior (UB)

added 743 characters in body

A few years late to this party.


Take advantage of isalpha('\0') being false

Since char may be signed, calling isalpha(ch) is undefined behavior (UB) when ch < 0 && ch != EOF. The is...(ch) functions are specified to work only when ch is in the range [0...UCHAR_MAX] or equals EOF (C2X draft § 7.4 ¶1).

Maybe better to say: "The parameter c points to a C string."

Be aware of locale issues

In the default "C" locale there are 26 + 26 characters for which isalpha() returns non-zero (true). Other locales may have more. Depending on coding goals, this is an advantage or not.
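
One way to see the locale dependence is to count how many unsigned char values isalpha() accepts. This is only a sketch; the helper name count_alpha_chars is hypothetical. A program starts in the "C" locale, where the count is 26 + 26.

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical helper: number of unsigned char values that the
   current locale classifies as alphabetic. */
int count_alpha_chars(void) {
    int count = 0;
    /* Every value in [0...UCHAR_MAX] is a valid isalpha() argument. */
    for (int ch = 0; ch <= UCHAR_MAX; ch++) {
        if (isalpha(ch)) {
            count++;
        }
    }
    return count;
}
```

Calling it again after setlocale(LC_ALL, "") may report a larger number, depending on the environment's locale and character encoding.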

If the goal is only the common A-Z, a-z regardless of locale, consider making your own table.

#include <limits.h>  // UCHAR_MAX

static const unsigned char is_az[UCHAR_MAX + 1] = { //
  ['A'] = 1, ['B'] = 1, /* 23 more */ ['Z'] = 1,
  ['a'] = 1, ['b'] = 1, /* 23 more */ ['z'] = 1,
};  // The rest will be 0.

An unsigned char [] array is used here as it is more compact than bool [] when sizeof(bool) > 1. We could use bits of the array, but let us leave that for later.
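
A possible variation, not the approach shown above: fill an equivalent table at run time from a string of the 52 letters, which avoids writing out every designated initializer. The names is_az_rt, is_az_init and leading_az_len are hypothetical, and is_az_init() must be called once before the table is used.

```c
#include <limits.h>
#include <stddef.h>

/* Run-time table; static storage starts out all zero. */
static unsigned char is_az_rt[UCHAR_MAX + 1];

/* Hypothetical initializer: mark only A-Z and a-z, regardless of locale. */
void is_az_init(void) {
    static const char letters[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    for (size_t i = 0; letters[i] != '\0'; i++) {
        is_az_rt[(unsigned char)letters[i]] = 1;
    }
}

/* Hypothetical helper: length of the leading run of A-Z/a-z characters. */
size_t leading_az_len(const char *s) {
    /* Index with unsigned char values so the subscript is never negative. */
    const unsigned char *us = (const unsigned char *)s;
    size_t n = 0;
    /* The entry for the null character is 0, so the loop ends at the terminator. */
    while (is_az_rt[us[n]]) {
        n++;
    }
    return n;
}
```

The trade-off is that the table is no longer const and needs a one-time setup call, in exchange for a shorter definition.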
