I don't understand some code from Kaggle's solution.

Here is an example of the data:

PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S

The goal is to extract an array containing only the female passengers, and they do it like this:

# data is a NumPy array containing all the passengers
women_only_stats = data[0::,4] == "female"
females_data = data[women_only_stats]
print(females_data[0]) # Prints the first row of the women-only array.

I understand that women_only_stats will be an array of True and False values, the result of evaluating the expression data[0::,4] == "female".
What I do not understand is why data[women_only_stats] is an array containing only women.


How does NumPy evaluate that?

1 Answer

Here's how it works:

women_only_stats = data[0::,4] == "female" creates a mask (an array of booleans) with one entry per row of your array.

When the mask is used to index data, only the rows where women_only_stats is True are kept, so the result contains only women.
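As a minimal sketch of that behaviour, the snippet below rebuilds the three sample rows from the question by hand. It assumes every field is stored as a string (as would happen if the CSV were read row by row into a NumPy array); the hand-written rows are an illustration, not the original loading code.

```python
import numpy as np

# Hypothetical stand-in for `data`: the three sample rows from the question,
# in the same column order, with every value kept as a string.
data = np.array([
    ["1", "0", "3", "Braund, Mr. Owen Harris", "male", "22", "1", "0",
     "A/5 21171", "7.25", "", "S"],
    ["2", "1", "1", "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
     "female", "38", "1", "0", "PC 17599", "71.2833", "C85", "C"],
    ["3", "1", "3", "Heikkinen, Miss. Laina", "female", "26", "0", "0",
     "STON/O2. 3101282", "7.925", "", "S"],
])

# Comparing column 4 (Sex) element-wise yields one boolean per row.
women_only_stats = data[0::, 4] == "female"
print(women_only_stats.tolist())  # [False, True, True]

# Indexing with the boolean array keeps the rows whose entry is True.
females_data = data[women_only_stats]
print(females_data.shape)  # (2, 12)
print(females_data[0][3])  # name of the first woman in the filtered array

# The original array is not modified; the filtered result is a new array.
print(data.shape)  # (3, 12)
```

The mask has the same length as the number of rows, which is why it can be used to select along the first axis.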

You can have a look here about mask indexing.

  • Thanks! So in the original dataframe, each 'female' will be replaced (overlaid) by the value True or False, and the resulting array will only keep the rows where the value of the 4th column is True. Am I right in saying this?
    – Mornor
    Commented Oct 6, 2016 at 15:47
  • It does not replace anything, and it does not change the dataframe. It creates an array of booleans, called a mask. Then, when you pass this mask to your dataframe (i.e. when you index by this mask), it selects the samples where mask == True, returning a dataframe whose samples are only women.
    – MMF
    Commented Oct 6, 2016 at 15:50
  • Ahhhh! Got it!! Thanks a lot for this explanation, and for the time you took for me.
    – Mornor
    Commented Oct 6, 2016 at 16:07