Return to Answer

Post Timeline

(References to relative positions of answers are not reliable as they depend on the view, etc). Formatted according to the original intent in the Markdown source (as a result, the diff looks more extensive than it really is - use view "Side-by-side markdown" to compare)..

Source Link

edited Feb 8, 2021 at 15:41

Peter Mortensen

31.1k
22
111
134

The different approaches discussed in above responsesthe previous answers are based on the assumption that either the user knows column indices to drop or subset on, or the user wishes to subset a dataframe using a range of columns (for instance between 'C' : 'E').

pandas.DataFrame.drop() is certainly an option to subset data based on a list of columns defined by user (though you have to be cautious that you always use copy of dataframe and inplace parameters should not be set to True!!)

Another option is to use pandas.columns.difference(), which does a set difference on column names, and returns an index type of array containing desired columns. Following is the solution:

df = pd.DataFrame([[2,3,4], [3,4,5]], columns=['a','b','c'], index=[1,2])
columns_for_differencing = ['a']
df1 = df.copy()[df.columns.difference(columns_for_differencing)]
print(df1)

The output would be:b c 1 3 4 2 4 5

    b   c
1   3   4
2   4   5

The different approaches discussed in above responses are based on the assumption that either the user knows column indices to drop or subset on, or the user wishes to subset a dataframe using a range of columns (for instance between 'C' : 'E'). pandas.DataFrame.drop() is certainly an option to subset data based on a list of columns defined by user (though you have to be cautious that you always use copy of dataframe and inplace parameters should not be set to True!!)

Another option is to use pandas.columns.difference(), which does a set difference on column names, and returns an index type of array containing desired columns. Following is the solution:

df = pd.DataFrame([[2,3,4],[3,4,5]],columns=['a','b','c'],index=[1,2])
columns_for_differencing = ['a']
df1 = df.copy()[df.columns.difference(columns_for_differencing)]
print(df1)

The output would be:b c 1 3 4 2 4 5

The different approaches discussed in the previous answers are based on the assumption that either the user knows column indices to drop or subset on, or the user wishes to subset a dataframe using a range of columns (for instance between 'C' : 'E').

Another option is to use pandas.columns.difference(), which does a set difference on column names, and returns an index type of array containing desired columns. Following is the solution:

df = pd.DataFrame([[2,3,4], [3,4,5]], columns=['a','b','c'], index=[1,2])
columns_for_differencing = ['a']
df1 = df.copy()[df.columns.difference(columns_for_differencing)]
print(df1)

The output would be:

    b   c
1   3   4
2   4   5

Source Link

answered Jul 21, 2018 at 21:28

Harshit Lamba

Another option is to use pandas.columns.difference(), which does a set difference on column names, and returns an index type of array containing desired columns. Following is the solution:

df = pd.DataFrame([[2,3,4],[3,4,5]],columns=['a','b','c'],index=[1,2])
columns_for_differencing = ['a']
df1 = df.copy()[df.columns.difference(columns_for_differencing)]
print(df1)

The output would be:b c 1 3 4 2 4 5

Collectives™ on Stack Overflow