If we assume that at least some level of experimentation is desired, we should be willing to accept some disturbances during experiments and some additional cleanup effort afterwards; it will not always be avoidable. Disturbances of regular operation should nevertheless be kept to a minimum and communicated in a way that makes clear why they are needed. There should be a plan, made beforehand, for how to recover and clean up afterwards, and that plan should then be executed.
Example: Recent answer bot answers are not included in the data dump (good), have unclear content licenses (bad), and seem to remain on the sites even though the experiment failed (bad). It would have been better if all impacts of this experiment had been cleaned up.
Example: For the 1-rep-vote experiment (which wasn't conducted but could have been), one could have recorded votes from 1-rep users for a short time only (say two weeks), analyzed their impact, and then tried to undo that impact as much as possible (vote reversal is available and is done fairly regularly). If some residual effects (badges, caps) remained, that would be okay in my eyes. For science.
Judging from the past, I would say that the whole community, not only selected members, should be consulted already during the planning phase of an experiment.
Example: Before the beta rollout of Collectives and Articles, selected members of the community were consulted. The feature wasn't very well received, wasn't very successful, and slowly died in the years that followed. It seems as if consulting only a few members doesn't give enough significant input; it's better to gather as much feedback as possible.
What's striking to me, though, is the half-bakedness of it all. Collectives and Discussions received very few changes after the initial rollout. It's almost as if the company didn't want the features to succeed, which looks a bit like a waste of resources.
Example: The Staging Ground worked when introduced, but took multiple rounds (although I just remembered that the company gave up on it in the middle of the first round). The trending sort order also took multiple rounds of back-and-forth with the community (although I just saw that it apparently still isn't available outside of SO; why not?). The unfriendly comments robot had at least two versions/iterations (although it's apparently not in use anymore; why not?).
I think that meta Q&As are already quite effective for communication, but maybe there are better ways to structure discussions about experiments. Maybe dedicated "folders" for experiments, where all Q&As related to a specific experiment are kept together? This could be realized with the tag system or something else. In general, though, I'm happy with the existing framework.
I have a hard time coming up with generally applicable guidelines, but I think I can tell that an experiment might work when I see it. So including the community in all stages might be a good idea, and one should try to be as clear as possible with problem and solution descriptions.
As for size, I guess experiments can come in all sizes (1-rep voting would be a very small change to the software). It makes sense to go in smaller steps and check back often, but of course sometimes you have to make a larger jump if there aren't any reasonable intermediate steps available.
And we should be prepared to see experiments fail frequently. There are probably many more ways to do things wrong than right, but that still doesn't mean that the current state is the best possible, or even close to it. Failed experiments should not cast doubt on whether system-wide change is possible, nor taint the underlying ideas. One can fail, try again, and succeed the next time.
And a final example: the dedicated thank-you feature. Supposedly 1/6 of all comments are thank-yous, and we could get rid of these somehow. The company thought that a dedicated thank-you button, which is otherwise useless, would be a good idea for that; the downside would be competition with the upvote button. After one recent experiment (which did a couple of things at the same time, which would typically be bad practice), the company concluded that there is a 10% reduction in thank-you comments (which isn't much) and will create the button. The metric is fine; the feature set is questionable (a button that otherwise does nothing but still competes with the upvote button); yet the problem is not solved, while there would be much better approaches: e.g. AI-assisted thank-you comment detection with a note to upvote instead, plus automatic scheduled comment deletion after X days, which would surely find more than 10% of thank-you comments and would not compete with upvotes but support them, especially considering the technology is quite simple nowadays. The community advocated for this. If a far suboptimal solution also counts as a failure, then this experiment failed very early, when possible solutions were sought; everything afterwards couldn't correct for the initial conceptual problem. Some might say that this experiment wasn't really necessary in this form.
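To make the alternative concrete, here's a minimal sketch of what such a pipeline could look like. Everything in it is hypothetical: the keyword heuristic merely stands in for a real AI classifier, and `grace_days` corresponds to the "X days" above:

```python
from datetime import datetime, timedelta, timezone

# Crude stand-in for an AI classifier; a real implementation would call a
# trained model or an LLM. This heuristic only flags very short comments
# that consist essentially of a thank-you phrase.
THANK_PHRASES = ("thank you", "thanks", "thx", "this worked")

def looks_like_thank_you(text: str) -> bool:
    text = text.strip().lower()
    return len(text) < 40 and any(p in text for p in THANK_PHRASES)

def process_comment(comment: dict, grace_days: int = 14) -> dict | None:
    """If a comment is detected as a pure thank-you, return an action that
    notes the author should upvote instead and schedules the comment for
    deletion after a grace period."""
    if not looks_like_thank_you(comment["text"]):
        return None
    return {
        "comment_id": comment["id"],
        "note_to_author": "Please upvote instead of commenting thanks.",
        "delete_after": datetime.now(timezone.utc) + timedelta(days=grace_days),
    }

# Example usage: a pure thank-you comment gets flagged and scheduled.
print(process_comment({"id": 42, "text": "Thanks, this worked!"}))
```

The point of this shape is that detection plus a nudge channels existing gratitude into upvotes instead of adding a second button that competes with them.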