  • 2
    Iterating could be a tool to help during debugging, say with ipdb, especially during initial development and understanding of edge cases. Commented Sep 19, 2023 at 18:29
  • 2
    @GabrielStaples Look here: "... You can pass a list of columns to [] to select columns in that order. ..." Commented Sep 22, 2023 at 11:41
  • 1
    I used a lot of what you said and ran with it. Check out these 13 techniques I came up with and the plot that shows their speed differences. Pure vectorization is 1400x faster. List comprehension is pretty good too! Commented Oct 11, 2023 at 4:58
  • 3
    What if you want to perform some kind of IO on a dataframe, row by row? For example, if you need to send individual rows to some "thing", what approach would you use other than iterrows? E.g., sending rows to a network socket, or Kafka, as individual records. You could perhaps use something like .apply, but that is quite a messy solution. It can be done without global variables by using a functor (class), but the resulting code is hard to understand. Commented Feb 8, 2024 at 12:31
  • 1
    @FreelanceConsultant, in my answer, try techniques 10, 5, 9, and 1, in that order. All of those are faster than using iterrows(). In all of those techniques, you would replace my calculate_val() function call with your do_io() function call. I think that would work and technique 10 (10_list_comprehension_w_zip_and_direct_variable_assignment_passed_to_func) would be one of the fastest and easiest techniques possible. Commented Feb 8, 2024 at 15:12
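
To make the suggestion in the last comment concrete, here is a minimal sketch of the "list comprehension with zip" idea applied to row-by-row IO. It is not the code from the linked answer: the dataframe, its column names `A` and `B`, and the `do_io()` function are hypothetical, and an in-memory `io.StringIO` buffer stands in for a real socket or Kafka producer.

```python
import io
import json

import pandas as pd

# Hypothetical example data; the column names are assumptions.
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# In-memory stand-in for a network socket or message producer.
sink = io.StringIO()


def do_io(a, b):
    """Hypothetical IO function: write one row as a JSON line to the sink."""
    record = json.dumps({"A": int(a), "B": int(b)})
    # StringIO.write returns the number of characters written.
    return sink.write(record + "\n")


# Zip the columns together and pass each row's values directly to do_io(),
# instead of building a Series per row as iterrows() does.
chars_written = [do_io(a, b) for a, b in zip(df["A"], df["B"])]

# One JSON record per dataframe row ends up in the sink.
records = sink.getvalue().splitlines()
```

If the return values of `do_io()` are not needed, a plain `for a, b in zip(df["A"], df["B"]):` loop does the same work and reads more naturally for side effects.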