59

Stack Exchange has recently been working on improvements to the process by which you escalate issues to our attention, and to how transparent we are about what we work on in turn. Our transparency reporting process is described in this post, but it is long overdue for critical changes. Most importantly, we are rethinking which metrics we report out.

You’ve probably already seen some of the efforts meant to address the growing backlog that have taken place this past year. These include the Community Asks Sprint initiative, Slate’s post clarifying tag usage, and our recent rekindling of a bug duty rotation for Public Q&A (ok, that last one you’ve maybe only noticed if you’re looking very closely, ‘cause there is no accompanying Meta communication). In tandem with those, we’ve been working on a new way to report how we’re faring in responding to your requests. So, without further ado...

Introducing a new metric for measuring backlog health: P80s

The new metric we’ll be using internally to measure how we’re performing and to help surface issues that need attention is the “P80 metric.” P80 expresses the idea that “80% of all work items should be handled within X weeks,” and that most items should not sit indefinitely without a resolution. That X will vary for each of the three process tags (as defined in Slate’s post clarifying tag usage). For instance, to reflect the belief that we should triage issues in a timely manner even if it then takes us some time to reach an outcome for them, [status-review] has a shorter P80 target than the other two process tags.

Given the age of some of the issues currently in our backlog, it’ll take a few quarters before we meet what we consider to be reasonable thresholds for each of these tags. This means our initial goal won’t be to meet the long-term targets; rather, that time will be spent getting the backlog under control. Our primary aim is to make sure it doesn’t keep aging and that the P80 doesn’t keep increasing, and only after that will we be in a better position to start improving our response times. In the long run, we’d like to be able to target 8 weeks (4 sprints, our first response time) for [status-review], 16 weeks (8 sprints) for [status-planned], and 40 weeks (20 sprints) for [status-deferred]. Please note that these are tentative goals, and that once we’re closer to meeting them, they might still be subject to revision depending on internal resource allocation, in the same way that the previous reporting was subject to adjustment depending on those same factors.

As an example of how this works in practice, using an 8-week target for [status-review] (a code sketch follows the list below):

  • Assume there are 100 posts with the [status-review] tag.
  • Each post has had the tag for some number of days.
  • Order all the posts by the number of days they’ve been in [status-review].
  • The 80th post on that list determines the P80 metric.
  • For example, if the 80th post has been in [status-review] for 7 weeks, we are meeting our target. But if the 80th post has been in [status-review] for 9 weeks, we are not.
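
To make the computation concrete, here’s a minimal sketch in Python of how a P80 value could be derived; the function name and the exact percentile-index convention are illustrative assumptions, not our actual internal tooling:

    from datetime import date

    def p80_weeks(entered_tag_on: list[date], today: date) -> float:
        """Age, in weeks, of the post at the 80th percentile when posts
        are ordered by how long they've carried the status tag.
        Illustrative sketch only -- not Stack Exchange's real tooling."""
        ages_in_days = sorted((today - d).days for d in entered_tag_on)
        # With 100 posts, this picks the 80th post on the ordered list.
        index = max(0, int(len(ages_in_days) * 0.8) - 1)
        return ages_in_days[index] / 7

    # Meeting an 8-week target would then look like:
    # p80_weeks(review_entry_dates, date.today()) <= 8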

This metric acknowledges that some issues will simply take longer (possibly much longer) than the target time to finish. This is a normal and natural part of doing business. However, 80% of the issues pending in any given week should be less than 8 weeks old.

Note that transitioning an issue to a state it’s previously been in doesn’t reset its time in that status. As an example, suppose a post spends 2 weeks in [status-review], then 6 weeks in [status-planned]. If the post is then moved back to [status-review], it will begin counting up from 2 weeks, not from 0 again, reflecting the total amount of time the issue has spent in [status-review]. This prevents us from accidentally deferring an issue indefinitely by moving it between statuses. We expect that folks will sometimes make small prioritization errors they later correct, and this should be accounted for seamlessly.
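
As a rough illustration of that accumulation rule, the sketch below sums a post’s total time per status across a chronological list of transitions; the data shapes are assumptions made for the example, not the real tracking system:

    from collections import defaultdict
    from datetime import date, timedelta

    def time_in_status(transitions: list[tuple[date, str]],
                       today: date) -> dict[str, timedelta]:
        """Total time spent in each status, summed across re-entries,
        so moving back into a status resumes the old count."""
        totals: dict[str, timedelta] = defaultdict(timedelta)
        bounds = transitions + [(today, "")]  # close the open interval
        for (start, status), (end, _) in zip(transitions, bounds[1:]):
            totals[status] += end - start
        return totals

    # The example above: 2 weeks in review, 6 in planned, back to review.
    history = [(date(2025, 1, 1), "status-review"),
               (date(2025, 1, 15), "status-planned"),
               (date(2025, 2, 26), "status-review")]
    # One week later, total review time is 3 weeks (2 + 1), not 1:
    # time_in_status(history, date(2025, 3, 5))["status-review"]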

Reporting P80s

External reporting will take place on Meta, in the same post where it’s taken place historically. It will still happen once a quarter, and will consist of an aggregate report. A report will look something like this (a small formatting sketch follows the table):

Status tag | Posts in tag | P80 last quarter | P80 current quarter | Goal (if applicable)
Review     | xxx          | xxx days         | xxx days            | xxx days by yyy / decreasing trend
Planned    | xxx          | xxx days         | xxx days            | xxx days by yyy / decreasing trend
Deferred   | xxx          | xxx days         | xxx days            | xxx days by yyy / decreasing trend
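
Purely as a formatting illustration, here’s how such rows could be rendered from per-tag stats; every name and number below is hypothetical, not the real reporting pipeline:

    # Hypothetical row renderer matching the table shape above.
    def report_row(tag: str, posts: int, p80_last_days: int,
                   p80_now_days: int, goal: str) -> str:
        return (f"{tag:<8} | {posts:>4} | {p80_last_days:>4} days | "
                f"{p80_now_days:>4} days | {goal}")

    print(report_row("Review", 120, 98, 74, "56 days by yyy / decreasing trend"))
    # Review   |  120 |   98 days |   74 days | 56 days by yyy / decreasing trend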

Alongside this, we’ll also include:

  • The past two years of P80 data for each tag in chart format.
  • A list of important Meta posts we’ve addressed as a part of this process.
  • Escalation guidance (which will generally match the quarterly roadmap) for what should be escalated in that quarter.

Conclusion

We hope you see this change as an improvement, and that it revitalizes trust in how we handle your escalations internally. Of course, we don’t think changing what we measure will, in and of itself, solve the accumulating backlog of reports, but we do believe that you can’t control what you can’t measure. This metric provides us with clear, actionable steps to ensure success, and those steps broadly align with what we feel this process needs in order to meet the needs of the user base.

I should be clear that these metrics are primarily internal-facing, and that we share them publicly for transparency and because we want to hold ourselves accountable to you; the internal reporting will take the same form described above. As such, while I’m open to small corrections, requests for clarification, and such, I’m much more interested in serious objections to the plan laid out in this post as it relates to the public reporting specifically.

Leaving answers instead of comments generally makes it easier to thread discussions, so please err on the side of doing that, even if the answers are short. Splitting your concerns into one topic per answer also makes following the discussion much easier.

Some of the details might change between now and a final version (which will take shape as an edit to the overall process post), but unless y’all feel I’m going completely in the wrong direction with the public reporting, this is gonna be how the process looks in the future. Lemme know your thoughts below!

9
  • 3
    I'm not a Developer (or a Project Manager) but isn't this just a recipe for your dev team to cherry-pick the top 80 easiest tasks every month, and leave the hardest ones to rot?
    – Richard
    Commented Apr 1 at 19:40
  • 5
@Richard Top 80%. While it still allows the most difficult tasks to "rot", most of the tasks should be handled comparatively quickly :) Commented Apr 1 at 20:16
  • is the sprint to calendar-time conversion rate 2 sprints = 1 year?
    – cfr
    Commented Apr 6 at 5:15
  • 1
Sprints are 2 weeks long, @cfr. The section where I list the tentative targets should make that clear, but let me know if I can clarify further.
    – JNat StaffMod
    Commented Apr 7 at 8:16
  • What about interdepartmental issues such as this?
    – A-Tech
    Commented Apr 9 at 9:46
  • 1
    What about them, @A-Tech?
    – JNat StaffMod
    Commented Apr 9 at 9:49
This reads to me as though this is a metric the dev/PM team has set out as a goal. But will, say, the legal department prioritize questions and inquiries that stem from [status-review] as well, to help meet that goal?
    – A-Tech
    Commented Apr 9 at 9:56
  • 6
Issues like the one you're referencing will generally be triaged to the Trust & Safety team, who will work with the legal department. The T&S team is one of the teams responsible for meeting these goals and targets, so I'm sure they'll do their best to respond to requests in a timely manner. Is that what you mean?
    – JNat StaffMod
    Commented Apr 9 at 10:07
It has been in review for half a year, so I'm not sure we'll agree about the timely bit, @JNat
    – Mast
    Commented Apr 20 at 13:58

5 Answers

26
+50

If the company isn't meeting its P80s and other metrics, would increasing resources towards the public platform be on the table? What levels of backlog would result in a relook at resource allocations, especially in view of bug duty rotations apparently having paused at some point?

1
  • 14
There are a few approaches to solving for not meeting response targets where I might be part of the group that makes a decision, but hiring and staffing is well beyond my pay grade. Importantly, though, if the company is not meeting the targets, the metrics will signal that that's the case. The process involves several departments and teams, so if we continuously find ourselves missing the targets, that will demand answers as to why, in order to determine the solutions.
    – JNat StaffMod
    Commented Mar 31 at 13:52
7

Will the P80 target only be looked at from an "all sites" perspective, or will there be any analysis for a per-site target? For example, some sites have (many) experiments, some of which can/do need improvements (looking at you, Discussions), and changes that go community-wide can affect some sites quite differently to others, so some meta sites have many posts in [status-review] or [status-deferred]. If, for example, you are meeting P80 overall, but you have a site in the community which is failing the P80 target miserably, with a good volume of (total) posts needing work efforts, would that be considered a failure?

Admittedly, at the time of writing only Stack Exchange Meta and Stack Overflow Meta have a "significant" volume, though Code Golf Meta does have 17 posts in a status that denotes work could occur. SEDE

5
  • 4
    There are currently no plans to break the reporting down by site. If we observe a site is heavily skewing the stats, one way or the other, we'll consider whether it makes sense to slice it by site ;)
    – JNat StaffMod
    Commented Apr 7 at 15:16
  • 1
I expect per-site metas to, in general, have fewer intermediate-status-tagged posts than main-meta or SO-meta (e.g. Puzzling has only one status-review), so per-site numbers would be subject to quite a bit of noise.
    – bobble
    Commented Apr 7 at 15:18
Yeah, Code Golf has the most posts in a status tag of any site that isn't Meta Stack Exchange/[meta.so], @bobble, with 16 [status-deferred] posts. SEDE
    – Larnu
    Commented Apr 7 at 15:32
  • 2
A little bonus context: A meaningful/useful measurement of P80 requires a decent number of posts with a given status tag. Plus, if we want to measure the response time network-wide, then incorporating the many issues only reported on MSO is required. But if we want to measure P80 on a small site with just 2-10 posts in review/deferred, well... we can't (what's the 80th %ile of 3 posts?), unless we pool them together with the broader network. That's a more suitable decision anyway, since P80 is supposed to be a "big picture" metric. Response times per site may still vary.
    – Slate StaffMod
    Commented Apr 7 at 16:21
I do mention this in the post, @Slate: "community which is failing the P80 target miserably, with a good volume of (total) posts needing work efforts". I wouldn't expect a site with 2 status posts to be in scope.
    – Larnu
    Commented Apr 7 at 16:33
5

With ongoing projects, current issues and feature requests related to these are usually tracked by answers to a main question. I'm aware question tags are automatically monitored by the ticketing system; is there a process in place for issues surfaced in answers to formal requests for feedback, and is there anything on the moderator/regular-user end we need to be aware of?

2
  • 6
    There is no formal process in place to monitor such tags in answers, no. It's my understanding that it's a practice folks have taken up across departments, but not quite a standardized one. That being the case, I believe each team/individual who uses that system will have their own way of ensuring the answers the tags get added to are accounted for somewhere. I'll inquire and see if there's a need to standardize this, especially if there are a lot of such answers that end up getting stuck with a process tag for a long time.
    – JNat StaffMod
    Commented Mar 31 at 13:41
  • Speaking generally, Meta sites (network wide) could benefit from some new/improved post notices to mark updates/resolution from CMs/other staff/mods.
    – Robotnik
    Commented Apr 4 at 4:17
4

Are you planning to (try to) backfill the P80 data for the last two years, or will the graph start with just a single data point?

1
  • 7
    The reporting will include "The past two years of P80 data for each tag in chart format." ;)
    – JNat StaffMod
    Commented Mar 31 at 14:52
0

I've got certain tickets that are not status-review, submitted outside Meta (for example via a contact form), partially 'cause I'm hoping a quieter approach would avoid drama, or because in theory it's something sensitive. One of those is likely to hit about a year in a potentially open status, and is unresolved.

Are 'non meta' tickets also under P80?

3
  • Are these tickets you've sent in via the contact us form? CM escalations? Something else?
    – JNat StaffMod
    Commented Apr 22 at 7:50
Well, things with a formal ticket after escalation via the Contact form; specifically, it's the one about the full dump (from August '24, so not quite a year, but I've not seen any real movement on this; the initial "informal" request is from July or earlier). Some community requests are complicated. There's potential for others depending. So practically, I'd consider this in terms of "generic" issues that've gotten a ticket of some form, rather than my specific, very complicated example. Other examples I'd consider would be things like mod resignations, which sometimes slip through the cracks. – Commented Apr 22 at 8:01
  • None of those cases are covered by this metric, no
    – JNat StaffMod
    Commented 2 days ago
