62 events
when what by license comment
Jun 24, 2022 at 18:49 comment added ddyer If you're benchmarking a bubble sort, you're using the wrong algorithm.
S Jan 22, 2022 at 3:01 history bounty ended CommunityBot
S Jan 22, 2022 at 3:01 history notice removed user17242583
Jan 21, 2022 at 2:53 comment added Peter Cordes (Update on that last comment: for performance questions it is sometimes relevant to include the actual disassembly, so we can look for code-alignment issues with respect to 32-byte boundaries, especially on Skylake-family CPUs, where the JCC erratum mitigation can create unexpected performance potholes. You won't see that from Godbolt, and its binary output won't necessarily be linked with identical CRT code, so addresses may differ.)
S Jan 21, 2022 at 2:41 history bounty started CommunityBot
S Jan 21, 2022 at 2:41 history notice added user17242583 Reward existing answer
Nov 6, 2021 at 21:53 audit First questions
Nov 6, 2021 at 21:53
Nov 6, 2021 at 21:53 audit First questions
Nov 6, 2021 at 21:53
Nov 5, 2021 at 15:37 audit First questions
Nov 5, 2021 at 15:37
Nov 5, 2021 at 15:07 audit First questions
Nov 5, 2021 at 15:07
Nov 4, 2021 at 8:48 audit First questions
Nov 4, 2021 at 8:48
Nov 4, 2021 at 6:50 audit First questions
Nov 4, 2021 at 6:51
Nov 3, 2021 at 17:16 audit First questions
Nov 3, 2021 at 17:17
Nov 3, 2021 at 11:22 audit First questions
Nov 3, 2021 at 13:37
Nov 2, 2021 at 7:26 audit First questions
Nov 2, 2021 at 8:09
Oct 29, 2021 at 12:50 audit First questions
Oct 29, 2021 at 12:52
Oct 26, 2021 at 22:25 audit First questions
Oct 26, 2021 at 22:45
Oct 26, 2021 at 7:45 audit First questions
Oct 26, 2021 at 8:27
Oct 26, 2021 at 4:57 audit First questions
Oct 26, 2021 at 5:18
Oct 24, 2021 at 21:36 audit First questions
Oct 24, 2021 at 21:37
Oct 24, 2021 at 0:29 audit First questions
Oct 24, 2021 at 0:29
Oct 23, 2021 at 13:19 audit First questions
Oct 23, 2021 at 13:32
Oct 23, 2021 at 6:00 audit First questions
Oct 23, 2021 at 6:03
Oct 20, 2021 at 12:50 audit First questions
Oct 20, 2021 at 12:51
Oct 19, 2021 at 14:22 audit First questions
Oct 19, 2021 at 14:22
Oct 19, 2021 at 9:23 audit First questions
Oct 19, 2021 at 9:51
Oct 18, 2021 at 2:43 audit First questions
Oct 18, 2021 at 2:43
Oct 17, 2021 at 17:44 audit First questions
Oct 17, 2021 at 18:00
Oct 17, 2021 at 15:59 audit First questions
Oct 17, 2021 at 15:59
Oct 17, 2021 at 14:48 history edited Peter Mortensen CC BY-SA 4.0
Active reading [<https://en.wikipedia.org/wiki/GNU_Compiler_Collection>]. Removed the shell prompts to avoid confusion. Added some context. Expanded.
Oct 17, 2021 at 9:13 audit First questions
Oct 17, 2021 at 9:13
Oct 16, 2021 at 12:53 audit First questions
Oct 16, 2021 at 13:35
Oct 16, 2021 at 10:40 audit First questions
Oct 16, 2021 at 11:52
Oct 14, 2021 at 20:32 audit First questions
Oct 14, 2021 at 21:24
Oct 14, 2021 at 16:11 audit First questions
Oct 14, 2021 at 16:28
Oct 14, 2021 at 8:54 audit First questions
Oct 14, 2021 at 8:54
Oct 14, 2021 at 2:16 audit First questions
Oct 14, 2021 at 3:57
Oct 13, 2021 at 11:26 audit First questions
Oct 13, 2021 at 11:26
Oct 12, 2021 at 11:58 history edited anon CC BY-SA 4.0
fix godbolt link
Oct 12, 2021 at 1:54 history edited Peter Cordes
It's not [swap] in general that's relevant; it's swapping adjacent items. Would like to tag [bubble-sort] and [cpu-architecture], but tags are very limited. [auto-vectorization] would be nice, too, but let's go with [cpu-architecture] in case that helps future readers find info about store-forwarding (SF) stalls.
Oct 11, 2021 at 20:41 comment added Peter Cordes @user253751: disagree; as long as the querent picked the same GCC version on Godbolt as they have locally so the instructions are the same, Godbolt's nice filtering of directives is better. And linking the source+asm on Godbolt makes it better for anyone who wants to see what other GCC versions / options do.
Oct 11, 2021 at 12:27 history edited Wai Ha Lee CC BY-SA 4.0
Embiggened formatting
Oct 11, 2021 at 10:57 audit First questions
Oct 11, 2021 at 11:21
Oct 11, 2021 at 9:49 comment added Stack Exchange Broke The Law You should include the assembly code that your actual compiler outputs, not from godbolt.org.
Oct 11, 2021 at 6:21 audit First questions
Oct 11, 2021 at 6:36
Oct 11, 2021 at 4:08 audit First questions
Oct 11, 2021 at 4:08
Oct 10, 2021 at 18:18 audit First questions
Oct 10, 2021 at 18:34
Oct 10, 2021 at 2:31 audit First questions
Oct 10, 2021 at 3:16
Oct 9, 2021 at 22:23 comment added Peter Cordes @DavidConrad: -Os would make GCC choose not to auto-vectorize, so it would be about the same as -O2 I'd expect, not shooting itself in the foot with store-forwarding stalls and increased latency before it can detect branch mispredicts.
Oct 9, 2021 at 22:16 history edited chqrlie CC BY-SA 4.0
added 2 characters in body
Oct 9, 2021 at 21:01 audit First questions
Oct 9, 2021 at 21:25
Oct 9, 2021 at 18:47 audit First questions
Oct 9, 2021 at 19:00
Oct 9, 2021 at 18:07 comment added David Conrad At least on older versions of gcc, -Os (optimize for space) sometimes produced the fastest code because of the size of the instruction cache on x86-64. I don't know if that would matter here or if it's still applicable in current versions of gcc but it might be interesting to try it and compare.
Oct 9, 2021 at 15:01 vote accept anon
Oct 9, 2021 at 10:36 history became hot network question
Oct 9, 2021 at 6:54 history edited Peter Cordes
This is specific to the swap of adjacent elements. We don't have room in tags for [bubble-sort] and [sort] as well, or [cpu-architecture] [amd-processor] etc. :/
Oct 9, 2021 at 3:18 history edited anon CC BY-SA 4.0
added 30 characters in body
Oct 9, 2021 at 3:09 answer added Peter Cordes timeline score: 176
Oct 9, 2021 at 2:56 comment added Peter Cordes @Abel: gcc -Ofast is just a shortcut for -O3 -ffast-math, but there's no FP math here. If you're going to try anything, try -O3 -march=native to let it use AVX2 in case GCC's vectorization strategy could help with wider vectors instead of hurt, whatever it's trying to do. Although I don't think so; it's just doing a 64-bit load and shuffle, not even 128-bit with SSE2.
Oct 9, 2021 at 2:49 history edited phuclv
edited tags
S Oct 9, 2021 at 2:35 review First questions
Oct 9, 2021 at 3:19
S Oct 9, 2021 at 2:35 history asked anon CC BY-SA 4.0