168

I have a list of values, one of which is 'nan':

countries= [nan, 'USA', 'UK', 'France']

I tried to remove it, but I get an error every time:

cleanedList = [x for x in countries if (math.isnan(x) == True)]
TypeError: a float is required

When I tried this instead:

cleanedList = cities[np.logical_not(np.isnan(countries))]
cleanedList = cities[~np.isnan(countries)]

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
5
  • 5
    That looks like the string "nan", not an actual NaN value.
    – BrenBarn
    Commented Jan 9, 2014 at 4:49
  • 2
    yes, it is a string. [x for x in countries if x != 'nan']
    – MarshalSHI
    Commented Jan 9, 2014 at 4:52
  • 4
    if condition == True is unnecessary, you can always just do if condition.
    – reem
    Commented Jan 9, 2014 at 5:35
  • None of the solutions provided so far is satisfying. I have the same problem. Basically, it does not work for strings. Therefore, in your case np.isnan('USA') will raise the same error. If I find a solution I will post it. Commented Jan 26, 2017 at 12:52
  • Using math.isnan like in this answer is the pythonic way.
    – user7864386
    Commented Feb 9, 2022 at 5:20

16 Answers

218

The question has changed, so too has the answer:

Strings can't be tested using math.isnan as this expects a float argument. In your countries list, you have floats and strings.

In your case the following should suffice:

cleanedList = [x for x in countries if str(x) != 'nan']
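
A minimal sketch of this approach, assuming the first element really is a float NaN (written here as float('nan'), since the question shows it only as a bare nan). Keep in mind that the str() comparison would also drop a literal string 'nan' if the list contained one.

```python
# Assumption: the first element is a real float NaN, as the printed list suggests.
countries = [float('nan'), 'USA', 'UK', 'France']

# str(float('nan')) is 'nan', so one comparison covers floats and strings alike.
cleaned_list = [x for x in countries if str(x) != 'nan']
print(cleaned_list)  # ['USA', 'UK', 'France']
```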

Old answer

In your countries list, the literal 'nan' is a string, not the Python float nan, which is equivalent to:

float('NaN')

In your case the following should suffice:

cleanedList = [x for x in countries if x != 'nan']
9
  • 1
    Logically, what you say is true. But it didn't work for me. Commented Jan 9, 2014 at 5:02
  • Then the problem is in another area; the array you gave contains strings, which math.isnan will naturally throw errors with.
    – user764357
    Commented Jan 9, 2014 at 5:06
  • Yes ! when I print the output, I got this : [nan, 'USA', 'UK', 'France'] Commented Jan 9, 2014 at 5:07
  • 1
    @user3001937 I've updated the answer based on the new information
    – user764357
    Commented Jan 9, 2014 at 5:15
  • 2
    zhangxaochen: it is not a string, it is a float. Look carefully at the updated answer; Lego Stormtroopr's converting x to a string so you can compare it. nan always returns false for ==, even when compared to nan, so that's the easiest way to compare it. Commented Jan 9, 2014 at 6:30
65

Using your example where...

countries= [nan, 'USA', 'UK', 'France']

Since nan is not equal to nan (nan != nan) and countries[0] = nan, you should observe the following:

countries[0] == countries[0]
False

However,

countries[1] == countries[1]
True
countries[2] == countries[2]
True
countries[3] == countries[3]
True

Therefore, the following should work:

cleanedList = [x for x in countries if x == x]
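
A self-contained sketch of the same idea, assuming the NaN in the list is a float (math.nan stands in for the bare nan from the question):

```python
import math

# Assumption: math.nan stands in for the float NaN shown in the question.
countries = [math.nan, 'USA', 'UK', 'France']

# NaN is the only value here that compares unequal to itself.
print(countries[0] == countries[0])  # False

cleaned_list = [x for x in countries if x == x]
print(cleaned_list)  # ['USA', 'UK', 'France']
```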
1
  • 5
    This is the only answer that works when you have a float('nan') in a list of strings
    – user2317421
    Commented Jun 15, 2019 at 21:56
24

The problem comes from the fact that np.isnan() does not handle string values correctly. For example, if you do:

np.isnan("A")
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

However the pandas version pd.isnull() works for numeric and string values:

import numpy as np
import pandas as pd
pd.isnull("A")
> False

pd.isnull(3)
> False

pd.isnull(np.nan)
> True

pd.isnull(None)
> True
17
import numpy as np

mylist = [3, 4, 5, np.nan]
l = [x for x in mylist if not np.isnan(x)]

This should remove all NaN. Of course, I assume that it is not a string here but actual NaN (np.nan).
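
If the data is purely numeric, it may be simpler to work on a NumPy array directly. A short sketch, assuming a float array (the name values is only illustrative); this will not work on a list that mixes strings and NaN:

```python
import numpy as np

# Assumption: all elements are numeric, so the array has a float dtype.
values = np.array([3.0, 4.0, 5.0, np.nan])

# np.isnan returns a boolean array; ~ inverts it element-wise to form a mask.
cleaned = values[~np.isnan(values)]
print(cleaned.tolist())  # [3.0, 4.0, 5.0]
```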

4
  • 4
    This gives me error: TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
    – Zak Keirn
    Commented Jan 9, 2019 at 23:46
  • 1
    Why not simply: x[~ np.isnan(x)] ? No list comprehension needed in numpy. Of course, I assume x is a numpy array.
    – bue
    Commented May 29, 2019 at 3:18
  • I assumed x is not going to be a numpy array as the question suggested.
    – Ajay Shah
    Commented May 29, 2019 at 5:41
  • 1
    It will expect float. Won't work on lists with strings @ZakKeirn Commented Aug 4, 2020 at 11:08
13

I like to remove missing values from a list like this:

import pandas as pd
list_no_nan = [x for x in list_with_nan if pd.notnull(x)]
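
For example, with a mixed list similar to the one in the question (the sample data below is made up for illustration; pd.notnull treats both NaN and None as missing):

```python
import numpy as np
import pandas as pd

# Illustrative data: strings mixed with a float NaN and a None.
list_with_nan = [np.nan, 'USA', None, 'UK', 'France']

# pd.notnull accepts scalars of any type, so strings do not raise a TypeError.
list_no_nan = [x for x in list_with_nan if pd.notnull(x)]
print(list_no_nan)  # ['USA', 'UK', 'France']
```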
7

use numpy fancy indexing:

In [29]: countries=np.asarray(countries)

In [30]: countries[countries!='nan']
Out[30]: 
array(['USA', 'UK', 'France'], 
      dtype='|S6')
6

If you check the type of the NaN element

type(countries[0])

the result will be <class 'float'>, so you can use the following code (note that it drops every float, not only NaN):

[i for i in countries if type(i) is not float]
5

A way to directly remove the nan value is:

import numpy as np    
countries.remove(np.nan)
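
Two caveats are worth sketching here. Since NaN never compares equal to anything, remove() can only find it by object identity, and each call removes a single match. The example below uses math.nan as a stand-in for np.nan (both are module-level float objects) and describes CPython behaviour:

```python
import math

# math.nan stands in for np.nan here; the list holds that very object twice.
countries = [math.nan, 'USA', math.nan, 'UK']

# remove() deletes one match per call, so repeat until none is left.
while math.nan in countries:
    countries.remove(math.nan)
print(countries)  # ['USA', 'UK']

# A NaN created separately is a different object and is not found.
other = [float('nan'), 'USA']
try:
    other.remove(math.nan)
    found = True
except ValueError:
    found = False
print(found)  # False
```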
1
  • 1
    Keep in mind that if the list contains more than one item matching the specified value, only the first one is removed by remove().
    – bpelhos
    Commented Oct 20, 2021 at 13:04
4

Another way to do it would include using filter like this:

countries = list(filter(lambda x: str(x) != 'nan', countries))
3

If you have a list of items of different types and you want to filter out NaN, you can do the following:

import math
lst = [1.1, 2, 'string', float('nan'), {'di':'ct'}, {'set'}, (3, 4), ['li', 5]]
filtered_lst = [x for x in lst if not (isinstance(x, float) and math.isnan(x))]

Output:

[1.1, 2, 'string', {'di': 'ct'}, {'set'}, (3, 4), ['li', 5]]
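
The same check can be wrapped in a small helper if it is needed in several places. This is only a sketch; is_nan and drop_nan are hypothetical names, not library functions. Unlike the str(x) != 'nan' approach, it keeps a literal string 'nan' and ordinary floats:

```python
import math

def is_nan(value):
    """Hypothetical helper: True only for float NaN values."""
    return isinstance(value, float) and math.isnan(value)

def drop_nan(items):
    """Hypothetical helper: return a new list without float NaN values."""
    return [x for x in items if not is_nan(x)]

print(drop_nan([float('nan'), 'USA', 1.5, None, 'nan']))  # ['USA', 1.5, None, 'nan']
```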
2

In your example, 'nan' is a string, so instead of using isnan(), just check for the string like this:

cleanedList = [x for x in countries if x != 'nan']
2

In my opinion, most of the suggested solutions do not take performance into account. For loops and list comprehensions are not good solutions if your list has many values. The solution below is more efficient in terms of computation time, and it doesn't assume your list contains only numbers or only strings.

import numpy as np
import pandas as pd
list_var = [np.nan, 4, np.nan, 20, 3, 'test']
df = pd.DataFrame({'list_values':list_var})
list_var2 = list(df['list_values'].dropna())
print("\n* list_var2 = {}".format(list_var2))
0

exclude 0 from the range list

['ret'+str(x) for x in list(range(-120,241,5)) if (x!=0) ]
0

I had a similar problem to solve, and strangely none of the suggestions above worked (Python 3.7.9), but this one did:

df['colA'] = df['colA'].apply(lambda x: [item for item in x if not pd.isna(item)])
-1

I noticed that pandas, for example, returns nan for blank values. Since it's not a string, you need to convert it to one in order to match it. For example:

ulist = df.column1.unique() #create a list from a column with Pandas which 
for loc in ulist:
    loc = str(loc)   #here 'nan' is converted to a string to compare with if
    if loc != 'nan':
        print(loc)
-3
import numpy as np
countries=[x for x in countries if x is not np.nan]
2
  • 4
    Welcome to Stack Overflow. Code is a lot more helpful when it is accompanied by an explanation. SO is about learning, not providing snippets to blindly copy and paste. This is particularly important when answering old questions with existing answers (this question is nearly 9 years old, and has 15 answers). Please edit your answer and explain how it answers the specific question being asked, and how it improves upon what is already here. See How to Answer.
    – Chris
    Commented Sep 18, 2022 at 18:04
  • Sorry, I picked the wrong reason in review queue audit, the suggested edit should be rejected as "clearly conflicts with author intent" instead. It's author's responsibility to make answer helpful, other people should write their own answers.
    – STerliakov
    Commented Jan 21, 2023 at 20:05
