
Timeline for answer to How can I iterate over rows in a Pandas DataFrame? by coldspeed95

Current License: CC BY-SA 4.0

Post Revisions

42 events
when what by license comment
May 26, 2025 at 16:49 history edited wjandrea CC BY-SA 4.0
Improve formatting and grammar. Add example of vectorization to profile section.
Apr 15, 2024 at 21:35 comment added wjandrea It might be worth mentioning that sometimes you don't need to calculate the value for the previous row and instead you can use a cumulative function like cumsum() or cummin(). Here's one I solved with .cumprod().
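The point in the comment above can be illustrated with a minimal sketch. The data and column names here are hypothetical (they are not from the post); the sketch only shows that a loop carrying the previous row's result can sometimes be replaced by a cumulative method such as `cumsum()`.

```python
import pandas as pd

# Hypothetical data: signed transactions (assumed for illustration only).
df = pd.DataFrame({"amount": [100, -30, 50]})

# Loop version: each balance depends on the previous row's balance.
balances = []
previous = 0
for amount in df["amount"]:
    previous = previous + amount
    balances.append(previous)
df["balance_loop"] = balances

# Cumulative version: the same recurrence expressed with cumsum().
df["balance"] = df["amount"].cumsum()
```

Both columns hold the same running totals; the cumulative version avoids the explicit Python loop. This only applies when the recurrence matches one of the built-in cumulative operations.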
Feb 13, 2024 at 19:03 history edited wjandrea CC BY-SA 4.0
Add image description.
Feb 8, 2024 at 15:12 comment added Gabriel Staples Let me know if those work. Technique 1 (1_raw_for_loop_using_regular_df_indexing) is truly a for loop and is guaranteed to work, but I think any of those techniques would work for general IO, even the faster ones, like 10.
Feb 8, 2024 at 15:12 comment added Gabriel Staples @FreelanceConsultant, in my answer, try techniques 10, 5, 9, and 1, in that order. All of those are faster than using iterrows(). In all of those techniques, you would replace my calculate_val() function call with your do_io() function call. I think that would work and technique 10 (10_list_comprehension_w_zip_and_direct_variable_assignment_passed_to_func) would be one of the fastest and easiest techniques possible.
Feb 8, 2024 at 12:31 comment added user2138149 What if you want to perform some kind of IO on a dataframe, row by row? For example, if you need to send individual rows to some "thing", what approach would you use other than iterrows? E.g., sending rows to a network socket, or Kafka, as individual records. You could perhaps use something like .apply, but that is quite a messy solution. It can be done without global variables by using a functor (class), but the resulting code is hard to understand.
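A rough sketch of the row-by-row IO pattern discussed in this comment thread, using `zip` over columns instead of `iterrows()`. The `do_io()` function here is a hypothetical stand-in (it just appends to a list); a real version might write to a socket or a message queue. The column names are also assumptions.

```python
import pandas as pd

# Hypothetical records to send out one at a time.
df = pd.DataFrame({"key": ["a", "b"], "value": [1, 2]})

sent = []  # stands in for an external sink in this sketch


def do_io(key, value):
    """Hypothetical IO call: record the row instead of sending it anywhere."""
    sent.append({"key": key, "value": value})


# Iterate over the columns in lockstep; each step yields one row's values.
for key, value in zip(df["key"], df["value"]):
    do_io(key, value)
```

Because the loop body performs a side effect per row, a plain `for` loop is used here rather than a list comprehension.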
Oct 11, 2023 at 4:58 comment added Gabriel Staples I used a lot of what you said and ran with it. Check out these 13 techniques I came up with and the plot that shows their speed differences. Pure vectorization is 1400x faster. List comprehension is pretty good too!
Sep 22, 2023 at 11:41 comment added Timus @GabrielStaples Look here: "... You can pass a list of columns to [] to select columns in that order. ..."
Sep 22, 2023 at 7:55 comment added Gabriel Staples What is the double-bracket syntax here? What does it mean and where is it officially documented? df[['col1', ...,'coln']].to_numpy()
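For the double-bracket question above, a small sketch may help. The outer brackets are the indexing operator and the inner brackets are an ordinary Python list of column names. The DataFrame and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with three columns.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# Single brackets with one label select a column as a Series.
single = df["a"]

# Passing a list of labels selects those columns, in the listed order,
# and returns a DataFrame.
subset = df[["c", "a"]]

# to_numpy() converts the selected columns to a 2-D NumPy array.
arr = subset.to_numpy()
```

Here `df[["c", "a"]]` is equivalent to building the list first (`cols = ["c", "a"]`) and then writing `df[cols]`.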
Sep 20, 2023 at 4:45 history edited Gabriel Staples CC BY-SA 4.0
very minor edit: add returns between the list comprehension examples so as to make their important differences more distinguishable
Sep 19, 2023 at 18:29 comment added b_dev Iterating could be a tool to help during debugging, say with ipdb, especially during initial development and understanding of edge cases.
Apr 16, 2023 at 20:14 history edited coldspeed95 CC BY-SA 4.0
added 90 characters in body
Apr 15, 2023 at 14:57 history edited pho CC BY-SA 4.0
iteritems is deprecated
Sep 19, 2022 at 17:18 history edited Peter Mortensen CC BY-SA 4.0
Active reading [<https://en.wiktionary.org/wiki/operation#Noun> <https://en.wiktionary.org/wiki/write-up#Noun>].
Mar 19, 2022 at 15:58 history edited user17242583 CC BY-SA 4.0
double negative :)
Oct 4, 2021 at 9:51 history edited coldspeed95 CC BY-SA 4.0
adding my personal order of preference
Dec 26, 2020 at 12:35 history edited coldspeed95 CC BY-SA 4.0
added 35 characters in body
Nov 6, 2020 at 9:01 history bounty awarded jezrael
Aug 13, 2020 at 13:18 history edited Makah CC BY-SA 4.0
link update
Jun 11, 2020 at 13:34 history edited Peter Mortensen CC BY-SA 4.0
Active reading [<en.wikipedia.org/wiki/Pandas_%28software%29> <en.wikipedia.org/wiki/Cython> <en.wikipedia.org/wiki/Python_%28programming_language%29> <en.wikipedia.org/wiki/NumPy>]. Made compliant with the Jon Skeet Decree - <twitter.com/PeterMortensen/status/976400000942034944>. Expanded.
Jun 8, 2020 at 8:50 history edited coldspeed95 CC BY-SA 4.0
increased size of n in the graph and added benchmarks for numpy vectorization
Jun 2, 2020 at 5:19 history edited coldspeed95 CC BY-SA 4.0
list comprehension caveats. Might come in handy to someone having trouble getting list comps to work for their data
May 20, 2020 at 9:09 history edited Erfan CC BY-SA 4.0
added 1 character in body
Apr 19, 2020 at 3:47 history edited coldspeed95 CC BY-SA 4.0
added 158 characters in body
Apr 19, 2020 at 3:11 history edited coldspeed95 CC BY-SA 4.0
add footnotes because people seem to keep downvoting the answer based on the comment upvote war under the question
Mar 7, 2020 at 5:14 history edited coldspeed95 CC BY-SA 4.0
Footnote
Feb 28, 2020 at 9:16 history edited coldspeed95 CC BY-SA 4.0
Correct the title
S Dec 31, 2019 at 16:16 history suggested d-cubed CC BY-SA 4.0
quite the novel, fixed one little type-o, then some redundant text
Dec 31, 2019 at 15:03 review Suggested edits
S Dec 31, 2019 at 16:16
Jul 21, 2019 at 20:17 history edited coldspeed95 CC BY-SA 4.0
Typos
Jun 5, 2019 at 20:44 history edited coldspeed95 CC BY-SA 4.0
add reference links
Jun 5, 2019 at 20:34 history edited coldspeed95 CC BY-SA 4.0
fleshing out the answer a bit, remove unnecessary detail
May 30, 2019 at 5:50 history edited coldspeed95 CC BY-SA 4.0
derp
May 7, 2019 at 6:41 history edited coldspeed95 CC BY-SA 4.0
added 28 characters in body
May 6, 2019 at 8:25 audit First posts
May 6, 2019 at 8:37
May 1, 2019 at 9:52 audit First posts
May 1, 2019 at 9:53
Apr 18, 2019 at 16:21 audit Low quality answers
Apr 18, 2019 at 16:22
Apr 14, 2019 at 4:57 history edited coldspeed95 CC BY-SA 4.0
guide users to the land of "vectorization"
Apr 7, 2019 at 20:42 history edited coldspeed95 CC BY-SA 4.0
added 274 characters in body
Apr 7, 2019 at 10:30 history edited coldspeed95 CC BY-SA 4.0
added 7 characters in body
Apr 7, 2019 at 10:15 history edited coldspeed95 CC BY-SA 4.0
added 303 characters in body
Apr 7, 2019 at 10:03 history answered coldspeed95 CC BY-SA 4.0