
I have a data set of behavioral data. I want to assign all the different behaviors as "aggressive", "submissive", "affiliative", or leave blank in a column of the data frame.

There are multiple types of each of these behaviors. So for example "fin raise" and "fast approach" are both aggressive behaviors.

I tried this:

if (G14$Behavior == "slow approach" | "fin raise" | "fast approach" | "tail beat" | "ram" | "bite") {
    G14$`Behavioral category` <- "aggressive"
  } else if (G14$Behavior == "flee" | "avoid" | "tail quiver") {
  G14$`Behavioral category` <- "submissive"
} else if (G14$Behavior == "bump" | "join") {
  G14$`Behavioral category` <- "affiliative" 
} else {
  G14$`Behavioral category` <- ""
}

But got this error:

operations are possible only for numeric, logical or complex types

Is there any way to do this with character strings?

  • Glad you found an answer! You may want to read "What is the difference between %in% and ==?" to better understand why %in% is a better fit here than ==. Good luck and happy coding!
    – jpsmith
    Commented Feb 18 at 15:05
  • It's easier to help you if you include a simple reproducible example with sample input and desired output that can be used to test and verify possible solutions.
    – MrFlick
    Commented Feb 18 at 16:17

4 Answers


The answer you provided works, but this would work slightly better:

case_when(Behavior %in% c("slow approach", "fin raise", "fast approach",
                          "tail beat", "ram", "bite") ~ "aggressive",
          Behavior %in% c("flee", "avoid", "tail quiver") ~ "submissive",
     ...)

(%in% is base R, so the same membership test also works for people who don't want to use the tidyverse, although case_when itself comes from dplyr. Matching against exact strings is more precise and faster than matching against regular expressions.)
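
Since %in% is base R, the same idea can be written without dplyr. This is only a minimal sketch, assuming the category lists from the question; the four-row G14 is made-up sample data, and the vector names aggressive, submissive and affiliative are my own:

```r
# Hypothetical sample data; the real G14 has more rows and columns.
G14 <- data.frame(Behavior = c("fin raise", "flee", "join", "rest"))

# Category lists taken from the question.
aggressive  <- c("slow approach", "fin raise", "fast approach",
                 "tail beat", "ram", "bite")
submissive  <- c("flee", "avoid", "tail quiver")
affiliative <- c("bump", "join")

# Start with the blank default, then overwrite the rows whose
# behavior is an exact member of each category vector.
G14$Behavioral.category <- ""
G14$Behavioral.category[G14$Behavior %in% aggressive]  <- "aggressive"
G14$Behavioral.category[G14$Behavior %in% submissive]  <- "submissive"
G14$Behavioral.category[G14$Behavior %in% affiliative] <- "affiliative"

G14
```

Because %in% is vectorised and returns one TRUE/FALSE per row, it can be used directly as a row index, which is what the original if () attempt could not do.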


I was able to figure it out! For anyone who runs into the same problem: the dplyr and stringr packages provide the functions case_when and str_detect. It would look something like this:

library(dplyr)
library(stringr)

G14 <- G14 %>%
  mutate(Behavioral.category = case_when(
    str_detect(Behavior, "slow approach|fin raise|fast approach|bite") ~ "aggressive",
    str_detect(Behavior, "flee|avoid|tail quiver") ~ "submissive",
    str_detect(Behavior, "bump|join") ~ "affiliative"
  ))
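
One thing to be aware of: rows that match none of the patterns come out as NA with the code above, not blank. If a blank is wanted, as in the original question, a catch-all condition can be added as the last case. A small sketch, assuming dplyr and stringr are installed; the four-row data frame is made up for illustration:

```r
library(dplyr)
library(stringr)

# Hypothetical sample data; "rest" belongs to no category.
G14 <- data.frame(Behavior = c("bite", "avoid", "bump", "rest"))

G14 <- G14 %>%
  mutate(Behavioral.category = case_when(
    str_detect(Behavior, "slow approach|fin raise|fast approach|bite") ~ "aggressive",
    str_detect(Behavior, "flee|avoid|tail quiver") ~ "submissive",
    str_detect(Behavior, "bump|join") ~ "affiliative",
    TRUE ~ ""  # fallback for rows that matched none of the patterns above
  ))

G14
```

case_when evaluates the conditions in order and uses the first one that is TRUE, so the fallback has to come last.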

While %in% is probably the appropriate solution here, you may have been looking for grepl, which accepts patterns that include the '|' operator. I'd prefer NA for non-matches, but it's up to you to encode the remaining categories differently.

> within(G14, {
+   Behavior_cat <- NA
+   Behavior_cat[
+     grepl("slow approach|fin raise|fast approach|tail beat|ram|bite", Behavior)
+   ] <- "aggressive"
+   Behavior_cat[
+     grepl("flee|avoid|tail quiver", Behavior)
+   ] <- "submissive"
+   Behavior_cat[
+     grepl("bump|join", Behavior)
+   ] <- 'affiliative'
+ })
          Behavior Behavior_cat
1    slow approach   aggressive
2        fin raise   aggressive
3    fast approach   aggressive
4        tail beat   aggressive
5              ram   aggressive
6             bite   aggressive
7             flee   submissive
8            avoid   submissive
9      tail quiver   submissive
10            bump  affiliative
11            join  affiliative
12 random behavior         <NA>

Here's an alternative solution using stringi::stri_replace_all_regex:

> G14 |> 
+   transform(
+     Behavior_cat=stringi::stri_replace_all_regex(
+       Behavior,
+       list(c('slow approach|fin raise|fast approach|tail beat|ram|bite'),
+            c('flee|avoid|tail quiver'),
+            c('bump|join'), c('random behavior')),
+       list('aggressive', 'submissive', 'affiliative', NA_character_),
+       vectorize_all=FALSE)
+   )
          Behavior Behavior_cat
1    slow approach   aggressive
2        fin raise   aggressive
3    fast approach   aggressive
4        tail beat   aggressive
5              ram   aggressive
6             bite   aggressive
7             flee   submissive
8            avoid   submissive
9      tail quiver   submissive
10            bump  affiliative
11            join  affiliative
12 random behavior         <NA>

Note that these patterns also match parts of words. To match only whole words, include word-boundary metacharacters (\\b in an R string), or use ^ and $ to anchor the start and end of the pattern, as shown e.g. in this answer.
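
To make the difference concrete, here is a small sketch comparing an unanchored pattern with an anchored and a word-bounded one. The test strings ("program", "frostbite", "ram attack") are made up for illustration and are not in the question's data:

```r
# Hypothetical strings chosen so that "ram" and "bite" appear inside longer words.
x <- c("ram", "program", "bite", "frostbite", "ram attack")

# Unanchored: matches anywhere in the string, including inside other words.
loose <- grepl("ram|bite", x)
loose  # TRUE TRUE TRUE TRUE TRUE

# Anchored: the whole string must be exactly one of the alternatives.
# The parentheses are needed so that ^ and $ apply to every alternative.
exact <- grepl("^(ram|bite)$", x)
exact  # TRUE FALSE TRUE FALSE FALSE

# Word boundaries: the alternative must be a whole word, but may be
# surrounded by other words.
word <- grepl("\\b(ram|bite)\\b", x)
word   # TRUE FALSE TRUE FALSE TRUE
```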


Data:

> dput(G14)
structure(list(Behavior = c("slow approach", "fin raise", "fast approach", 
"tail beat", "ram", "bite", "flee", "avoid", "tail quiver", "bump", 
"join", "random behavior"), Behavior_cat = c("aggressive", "aggressive", 
"aggressive", "aggressive", "aggressive", "aggressive", "aggressive", 
"aggressive", "aggressive", "aggressive", "aggressive", "aggressive"
)), row.names = c(NA, -12L), class = "data.frame")

1) case_match We can use case_match from dplyr. Its first argument is a vector containing the codes; the remaining arguments are formulas with the possible codes on the left-hand side and the replacement on the right.

library(dplyr)

G14 %>% 
  mutate(Behavioral.category = case_match(Behavior,
    c("slow approach", "fin raise", "fast approach", "bite") ~ "aggressive",
    c("flee", "avoid", "tail quiver") ~ "submissive",
    c("bump", "join") ~ "affiliative")
  )

giving the following, using the input in the Note at the end:

       Behavior Behavioral.category
1 slow approach          aggressive
2     fin raise          aggressive
3 fast approach          aggressive
4          bite          aggressive
5          flee          submissive
6         avoid          submissive
7   tail quiver          submissive
8          bump         affiliative
9          join         affiliative
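
Behaviors that are not listed in any formula come out as NA. If a blank is preferred, as the question asks, case_match has a .default argument. A short sketch, assuming dplyr 1.1.0 or later; the "rest" row is a made-up behavior added only to show the default:

```r
library(dplyr)

# A few rows from the Note, plus one hypothetical uncategorised behavior.
G14 <- data.frame(Behavior = c("slow approach", "bite", "flee", "join", "rest"))

res <- G14 %>%
  mutate(Behavioral.category = case_match(Behavior,
    c("slow approach", "fin raise", "fast approach", "bite") ~ "aggressive",
    c("flee", "avoid", "tail quiver") ~ "submissive",
    c("bump", "join") ~ "affiliative",
    .default = "")  # value used when no left-hand side matches
  )

res
```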

2) fct_collapse First create a list L whose names are the replacement codes and whose values are vectors of existing codes, and then use that with fct_collapse. Note that the result is a factor rather than a character vector.

library(dplyr)
library(forcats)

L <- list(
  aggressive = c("slow approach", "fin raise", "fast approach", "bite"),
  submissive = c("flee", "avoid", "tail quiver"),
  affiliative = c("bump", "join")
)

G14 %>% mutate(Behavior.category = fct_collapse(Behavior, !!!L))
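
As a quick check of what fct_collapse returns, the sketch below applies it to a plain character vector. It assumes forcats is installed; the extra "rest" value is made up to show that levels not mentioned in L are kept as they are:

```r
library(forcats)

L <- list(
  aggressive = c("slow approach", "fin raise", "fast approach", "bite"),
  submissive = c("flee", "avoid", "tail quiver"),
  affiliative = c("bump", "join")
)

# Input from the Note, plus one hypothetical uncategorised behavior.
behaviors <- c("slow approach", "fin raise", "fast approach", "bite",
               "flee", "avoid", "tail quiver", "bump", "join", "rest")

cat_f <- fct_collapse(behaviors, !!!L)

is.factor(cat_f)     # the result is a factor
as.character(cat_f)  # "rest" is left unchanged
```

If a character column is needed afterwards, wrapping the result in as.character() is one option.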

3) left_join We can also use left_join with L defined above.

library(dplyr)

G14 %>%
  left_join(stack(L), join_by(Behavior == values)) %>% 
  rename(Behavior.Category = ind)

4) Base R Using match with L from above we can obtain a Base R approach.

stk <- stack(L)
G14 |> transform(Behavior.category = stk$ind[match(Behavior, stk$values)])
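
To see how the lookup works, it may help to look at the pieces separately. In this sketch stack(L) turns the list into a two-column lookup table (values, ind), and match gives the row of that table for each behavior, or NA if there is none. The behaviors vector and the blank-for-NA step are my own additions for illustration:

```r
L <- list(
  aggressive = c("slow approach", "fin raise", "fast approach", "bite"),
  submissive = c("flee", "avoid", "tail quiver"),
  affiliative = c("bump", "join")
)

# Lookup table: column `values` holds the behaviors,
# column `ind` (a factor) holds the category each one belongs to.
stk <- stack(L)

# Hypothetical input; "rest" is not in the lookup table.
behaviors <- c("bite", "join", "rest")

idx <- match(behaviors, stk$values)     # row in stk, or NA if not found
category <- as.character(stk$ind[idx])  # convert factor to character
category[is.na(category)] <- ""         # optional: blank instead of NA

category
```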

Note

Input data used

G14 <- data.frame(Behavior = c("slow approach", "fin raise", "fast approach",
  "bite", "flee", "avoid", "tail quiver", "bump", "join"))
