0

I get duplicate rows when I use this code. What am I doing wrong with the random sampling?

data = data[data["VN"] >= 1000]
data_T1 = data[data["TARGET"] == 1]
data_T0 = data[data["TARGET"] == 0]
data_T0_random = data_T0.loc[np.random.choice(data_T0.index, 10000)]
data = data_T1.append(data_T0_random)
print('q:', len(data.index))
rr = data.drop_duplicates()
print('qq:', len(rr.index))
1
  • Some more context around the data structures would help. Commented May 3, 2018 at 11:31

2 Answers

1

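Pass replace=False to np.random.choice. By default it samples with replacement, so the same index label can be drawn more than once, which is where the duplicate rows come from.

Example: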

data_T0_random=data_T0.loc[np.random.choice(data_T0.index, 10000, replace=False)]
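To make the difference visible, here is a minimal sketch that compares the two calls. The frame below is a hypothetical stand-in for data_T0 (20,000 rows, unique index), not the asker's real data, and the fixed seed is only there to make the run repeatable.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # fixed seed so the run is repeatable (an assumption for the demo)

# Hypothetical stand-in for data_T0: 20,000 rows with a unique integer index.
data_T0 = pd.DataFrame({"VN": np.arange(20000) + 1000, "TARGET": 0})

# Default behaviour: replace=True, so a label may be drawn several times.
labels_with = np.random.choice(data_T0.index, 10000)
with_replacement = data_T0.loc[labels_with]

# replace=False: every drawn label is distinct.
labels_without = np.random.choice(data_T0.index, 10000, replace=False)
without_replacement = data_T0.loc[labels_without]

# pandas also offers DataFrame.sample, which samples without replacement by default.
native = data_T0.sample(n=10000, random_state=0)

n_unique_with = with_replacement.index.nunique()
n_unique_without = without_replacement.index.nunique()
print("unique rows, with replacement:   ", n_unique_with)
print("unique rows, without replacement:", n_unique_without)
```

Both selections contain 10,000 rows, but only the second one is guaranteed to contain 10,000 distinct rows. Note that replace=False requires the requested size to be no larger than the number of available labels.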

0

Change this line:

data_T0_random=data_T0.loc[np.random.choice(data_T0.index, 10000)]

to:

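data_T0_random = data_T0.loc[random.sample(list(data_T0.index), 10000)]  # requires `import random`; sample the index labels, not the DataFrame itself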

More info:

random.choices(population, weights=None, *, cum_weights=None, k=1) Return a k sized list of elements chosen from the population with replacement. If the population is empty, raises IndexError.

random.sample(population, k) Return a k length list of unique elements chosen from the population sequence or set. Used for random sampling without replacement.
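The contrast between the two functions can be sketched with the standard library alone. The small population below is a hypothetical stand-in for something like list(data_T0.index).

```python
import random

random.seed(0)  # fixed seed so the run is repeatable (an assumption for the demo)

# Hypothetical stand-in for a list of index labels.
population = list(range(20))

# random.choices draws with replacement: repeated elements are possible.
with_replacement = random.choices(population, k=15)

# random.sample draws without replacement: all elements are distinct.
without_replacement = random.sample(population, k=15)

# random.sample refuses to draw more elements than the population holds.
try:
    random.sample(population, k=len(population) + 1)
    oversample_rejected = False
except ValueError:
    oversample_rejected = True

print("distinct values from choices:", len(set(with_replacement)))
print("distinct values from sample: ", len(set(without_replacement)))
```

random.sample expects a sequence, which is why a DataFrame has to be sampled through its index labels (or through a pandas method) rather than passed in directly.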
