I have the following two dataframes, badges and comments. From the badges dataframe I have created a list of 'gold users', i.e. the users whose Class is 1.
Here Name is the name of the badge and Class is the level of the badge (1 = Gold, 2 = Silver, 3 = Bronze).
I have already preprocessed the text in comments['Text'] and now want to find the counts of the 10 most frequent words in comments['Text'] for the gold users.
I tried the code below but get the following error:
KeyError: "None of [Index(['1532', '290', '1946', '1459', '6094', '766', '10446', '3106', '1',\n '1587',\n ...\n '35760', '45979', '113061', '35306', '104330', '40739', '4181', '58888',\n '2833', '58158'],\n dtype='object', length=1708)] are in the [index]"
How can I fix this?
Dataframe 1 (badges)
Id | UserId | Name           | Date                    | Class | TagBased
2  | 23     | Autobiographer | 2016-01-12T18:44:49.267 | 3     | False
3  | 22     | Autobiographer | 2016-01-12T18:44:49.267 | 3     | False
4  | 21     | Autobiographer | 2016-01-12T18:44:49.267 | 3     | False
5  | 20     | Autobiographer | 2016-01-12T18:44:49.267 | 3     | False
6  | 19     | Autobiographer | 2016-01-12T18:44:49.267 | 3     | False
Dataframe 2 (comments)
Id | Text                                              | UserId
6  | [2006, course, allen, knutsons, 2001, course, ... | 3
8  | [also, theo, johnsonfreyd, note, mark, haimans... | 1
Code
for index, rows in comments.iterrows():
    gold_comments = rows[comments.Text.loc[gold_users]]
    Counter(gold_comments)
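
A note on the likely cause, with a possible fix: `.loc[gold_users]` looks the values up as row labels in the index of comments.Text, not in the UserId column, so none of them are found. The `dtype='object'` in the message also suggests that gold_users holds strings, while UserId may be stored as integers. The sketch below assumes that gold_users contains UserIds and that each Text cell is a list of tokens; the toy dataframes and the names `mask`, `word_counts` and `top_10` are made up for illustration.

```python
from collections import Counter

import pandas as pd

# Toy stand-ins for the real dataframes (all values are made up for illustration).
badges = pd.DataFrame({
    "UserId": [3, 1, 7],
    "Name": ["Famous Question", "Autobiographer", "Great Answer"],
    "Class": [1, 3, 1],
})
comments = pd.DataFrame({
    "Id": [6, 8, 9],
    "Text": [["course", "allen", "course"], ["also", "note"], ["course", "note", "proof"]],
    "UserId": [3, 1, 7],
})

# Assumption: gold_users is the collection of UserIds that hold a Class 1 badge.
gold_users = badges.loc[badges["Class"] == 1, "UserId"].unique()

# Select rows by the UserId column instead of by index label.
# Casting both sides to str guards against an int/str dtype mismatch.
mask = comments["UserId"].astype(str).isin(pd.Series(gold_users).astype(str))
gold_comments = comments.loc[mask, "Text"]

# Assumption: each Text cell is a list of tokens; count words across all of them.
word_counts = Counter(word for tokens in gold_comments for word in tokens)
top_10 = word_counts.most_common(10)
print(top_10)
```

Building one boolean mask with `isin` avoids the row-by-row `iterrows()` loop, and a single Counter over all gold users' tokens gives the overall top 10 rather than a separate count per comment.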
