
Consider the following dataset:

activity <- c("play football", "basketball player", "guitar sono","cinema", "piano")
country_and_type <- c("uk", "uk", "spain", "uk", "uk")
dataset <- data.frame(activity, country_and_type)

|activity          |country_and_type   |
|play football     |uk                 |
|basketball player |uk                 |
|guitar sono       |spain              |
|cinema            |uk                 |
|piano             |uk                 |

and these lists:

sport <- c("football", "basketball", "handball", "baseball")
music <- c("guitar", "piano", "microphone")

If the initial value of dataset$country_and_type is "uk", my goal is to append to that column, in parentheses, the name of the list that contains a string matching the activity. If no list matches, the type should be "other". Rows with any other country should stay unchanged.

To be clearer, here is the expected output:

|activity          |country_and_type   |
|play football     |uk (sport)         |
|basketball player |uk (sport)         |
|guitar sono       |spain              |
|cinema            |uk (other)         |
|piano             |uk (music)         |

Do you have any idea how to do this?

1 Answer

> dataset$type=NA
> dataset$type[grepl(paste(sport,collapse = "|"),a)]="sport"
> dataset$type[grepl(paste(music,collapse = "|"),a)]="music"
> dataset
                      a  type
1         play football sport
2     basketball player sport
3           guitar sono music
4          french piano music
5           ok handball sport
6         baseball game sport
7 microphone for singer music
>

After the edit to the question (only rows whose country_and_type is "uk" get a type):

> sp=grepl(paste(sport,".*uk",collapse = "|"),do.call(paste,dataset))
> ms=grepl(paste(music,".*uk",collapse = "|"),do.call(paste,dataset))
> uk=grepl("uk",do.call(paste,dataset))
> dataset$type=""
> dataset$type[sp]="(sport)"
> dataset$type[ms]="(music)"
> dataset$type[!(ms|sp)&uk]="(other)"
> transform(dataset,country_and_type=paste(country_and_type,type))[-3]
           activity country_and_type
1     play football       uk (sport)
2 basketball player       uk (sport)
3       guitar sono           spain 
4            cinema       uk (other)
5             piano       uk (music)
  • Thank you very much for your reply, but I made a mistake when explaining the context: I left out a condition (the initial dataset$country_and_type value must be "uk"). I have edited my question.
    – Remi
    Commented Feb 22, 2018 at 19:36
  • Hello, I recently discovered a problem with this method: when a string in the dataset merely contains one of the grep strings as a substring, it is still assigned the name of that list. For example, "pianopolis" is associated with "piano", whereas the value in the output should be "other".
    – Remi
    Commented Apr 17, 2018 at 11:57
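
One possible way to handle the substring problem raised in the last comment is to wrap each keyword in word boundaries (\\b), so that "piano" no longer matches inside "pianopolis". This is only a minimal sketch and not the answerer's method: the helper name whole_word, the extra "pianopolis" row, and the choice of perl = TRUE are assumptions made for illustration.

```r
sport <- c("football", "basketball", "handball", "baseball")
music <- c("guitar", "piano", "microphone")

# Same data as in the question, plus one hypothetical "pianopolis" row
activity <- c("play football", "basketball player", "guitar sono",
              "cinema", "piano", "pianopolis")
country_and_type <- c("uk", "uk", "spain", "uk", "uk", "uk")
dataset <- data.frame(activity, country_and_type, stringsAsFactors = FALSE)

# Hypothetical helper: builds a pattern such as "\\b(guitar|piano|microphone)\\b"
whole_word <- function(words) {
  paste0("\\b(", paste(words, collapse = "|"), ")\\b")
}

# Default every row to "other", then overwrite where a whole word matches
type <- rep("other", nrow(dataset))
type[grepl(whole_word(sport), dataset$activity, perl = TRUE)] <- "sport"
type[grepl(whole_word(music), dataset$activity, perl = TRUE)] <- "music"

# Only rows whose country is exactly "uk" receive the "(type)" suffix
is_uk <- dataset$country_and_type == "uk"
dataset$country_and_type[is_uk] <- paste0("uk (", type[is_uk], ")")
dataset
```

With this pattern, "basketball player" still counts as sport because "basketball" appears as a whole word, while "pianopolis" falls through to "other". If an activity matched both lists, the later assignment (music) would win, so the order of the two assignments is a design choice.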
