I have a problem. I have the following code:

strCommand = "There is a 20% chance that I fail the test"

# Splits command in words
words = nltk.word_tokenize(strCommand)
    
#Add tags to words
word_tags = nltk.pos_tag(words)

if (word for (word, pos) in word_tags if(pos[:2] == 'CD')):
    value = [word for (word, pos) in word_tags if(pos[:2] == 'CD')][0]
else:
    value = ""

This code splits the sentence into words and uses NLTK's part-of-speech tagger to label each word. I then want to check whether any word is tagged CD (cardinal number) and, if so, set value to that word. When the sentence contains a number, the code works. When it doesn't, the code crashes on the line where I set value, because the list is empty and I index it with [0]. I expected the code never to reach that line when no number is found, but apparently the if statement doesn't evaluate to True or False the way I assumed.

How can I make the condition evaluate to False when no number is found, so that the if branch is skipped?

1 Comment

The expression in your if-statement is a generator expression. Generator expressions are truthy. Commented Nov 8, 2020 at 13:39
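
To illustrate the point of this comment, here is a minimal sketch. The hard-coded word_tags list is an assumption that stands in for the output of nltk.pos_tag, so the snippet runs without NLTK or its data files.

```python
# Hypothetical tagged sentence with no number in it;
# stands in for the list of (word, tag) pairs from nltk.pos_tag.
word_tags = [("I", "PRP"), ("fail", "VBP"), ("the", "DT"), ("test", "NN")]

gen = (word for word, pos in word_tags if pos[:2] == "CD")

# A generator object defines neither __bool__ nor __len__, so it is
# always truthy, even when iterating it would yield nothing.
gen_is_truthy = bool(gen)

# Materialising it into a list gives something whose truthiness
# does depend on whether there were any matches.
items = list(gen)
list_is_truthy = bool(items)

print(gen_is_truthy, items, list_is_truthy)
```

This is why the if branch in the question is always entered: the condition tests the generator object itself, not what it would produce.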

3 Answers

1

If you only need the first match, you can use next() with a generator expression and a default value:

value = next((word for (word, pos) in word_tags if(pos[:2] == 'CD')), "")

It returns the empty string if the generator yields nothing. It is also more efficient, because no list is built in memory: next() stops iterating as soon as the first value is produced, that is, as soon as the first CD is found.
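
As a rough sketch of how this behaves in both cases, the one-liner can be wrapped in a small function. The name first_cardinal and the hard-coded tag lists are assumptions for illustration; in the question the pairs would come from nltk.pos_tag.

```python
def first_cardinal(word_tags, default=""):
    """Return the first word whose tag starts with CD, or `default` if none."""
    # The second argument to next() is returned instead of raising
    # StopIteration when the generator is exhausted without a match.
    return next((word for word, pos in word_tags if pos[:2] == "CD"), default)


# Hypothetical tagger output, hard-coded so the example needs no NLTK data.
with_number = [("a", "DT"), ("20", "CD"), ("%", "NN"), ("or", "CC"), ("30", "CD")]
without_number = [("I", "PRP"), ("fail", "VBP"), ("the", "DT"), ("test", "NN")]

print(first_cardinal(with_number))
print(repr(first_cardinal(without_number)))
```

Passing a different default, for example None, makes "no number found" distinguishable from an empty-string token if that matters to the caller.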

3 Comments

But what if multiple CDs are found?
@A.Vreeswijk, then this statement returns only the first one and never consumes the rest, which is the same result as the code example you provided.
@A.Vreeswijk, btw, your generator expression contains redundant parentheses (here: (word, pos) and here: (pos[:2] == 'CD')). Also, there's str.startswith(), which does exactly what you implemented using slicing. So a "beautified" version of your generator would be: word for word, pos in word_tags if pos.startswith('CD').
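
Putting these comments together, a minimal sketch might look like the following. The word_tags list is a hard-coded assumption standing in for nltk.pos_tag output, and the names first and all_values are made up for illustration.

```python
# Hypothetical tagger output containing two cardinal numbers.
word_tags = [("20", "CD"), ("%", "NN"), ("or", "CC"), ("30", "CD")]

# First match only: next() stops at "20" and never looks at "30".
first = next((word for word, pos in word_tags if pos.startswith("CD")), "")

# Every match: a list comprehension with the same filter.
all_values = [word for word, pos in word_tags if pos.startswith("CD")]

print(first)
print(all_values)
```

If later matches are needed, the list comprehension is the simpler choice; the next() form is only appropriate when the first match is enough.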
1

You should simplify your if-check. Right now, you're doing the if-check on a generator expression:

(word for (word, pos) in word_tags if(pos[:2] == 'CD'))

That is not what you want, because a generator object is truthy whether or not it would yield any items.

The simplest way to do what you want would be something like this:

reqd_words = [word for (word, pos) in word_tags if(pos[:2] == 'CD')]
if reqd_words:
    value = reqd_words[0]
else:
    value = ""

1

The expression in your if-statement is a generator expression. Generator expressions are truthy. If you want to check if a condition holds for any item in a sequence, use any(condition for element in sequence).

https://docs.python.org/3/library/functions.html#any
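
A minimal sketch of the any() approach applied to the question's case follows. The helper name has_cardinal and the hard-coded tag lists are assumptions for illustration, standing in for real nltk.pos_tag output.

```python
def has_cardinal(word_tags):
    """Return True if at least one (word, tag) pair is tagged CD."""
    # any() consumes the generator and returns a real bool,
    # short-circuiting on the first True it sees.
    return any(pos[:2] == "CD" for _, pos in word_tags)


# Hypothetical tagger output, hard-coded so the example needs no NLTK data.
with_number = [("a", "DT"), ("20", "CD"), ("%", "NN"), ("chance", "NN")]
without_number = [("I", "PRP"), ("fail", "VBP"), ("the", "DT"), ("test", "NN")]

for tags in (with_number, without_number):
    if has_cardinal(tags):
        value = [word for word, pos in tags if pos[:2] == "CD"][0]
    else:
        value = ""
    print(repr(value))
```

Note that this walks the tags twice when a number is present, once for the check and once to fetch the value, so it fits best when only the yes/no answer is needed.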
