
I just graduated with my MSc in biology; I submitted my thesis and defended in December 2025. I am now starting the off-boarding process and writing up a detailed protocol from my thesis for the next potential grad student.

One of the things I did was show how I analyzed my data. While looking over my control, I found that when I redid the analysis, it doesn't exactly match my initial analysis. The analysis involves counting mouse cells by hand from images I took on an epifluorescence microscope, probably the most frustrating and tedious thing I have ever done. I did this analysis a long time ago and made the big mistake of not checking it twice. The trend is still the same, but the numbers aren't matching up like they did the first time.

I'm not sure what happened or how (I'm assuming I was overcaffeinated and sleep-deprived), but it's been killing me for the past two days; I think I have slept only three hours. Overall the trend is still the same, but I'm just not getting the same numbers. I also redid the stats, and the p-values I got are basically the same (significant, and matching to the same number of decimal places).

Has anyone ever been in this situation? I'm not sure whether I should just leave it as is or try to correct my thesis. I was thinking of having another grad student or undergrad count the cells as an independent verification.

Also, is it normal for an advisor not to check the raw data? I feel like with another set of eyes on it, this would have been caught.

Am I just overthinking it, and should I leave it be?

A part of me wants to just let it go because I'm done, but I'm feeling a lot of anxiety and shame that a senior grad student could let something like this slip past. I'm very disappointed in myself.

  • Just because the second count is different from the first doesn't mean it is correct and the first is wrong: it could be the other way round (or, more likely, both are different to "the one true figure"). Commented Apr 26 at 10:10
  • academia.stackexchange.com/questions/225425/… is relevant. Commented Apr 27 at 9:31
  • Can you use software to count the cells for you? Specially trained image recognition software is often used for counting things in cluttered images. If there isn't existing software tailored for this purpose, you might have to train a model to do it, which might be more work than counting the cells yourself, but it's worth looking into at least. Good software would be able to output a marked-up image (e.g. with a small white cross displayed on each object that it recognizes) for verification (e.g. if the crosses are misplaced, you know the model is miscounting). Commented Apr 27 at 18:17
  • The uncertainty on counting N objects is √N. Commented 2 days ago
  • Presuming you have a camera mounted on the microscope, you might point the next student to ImageJ / Fiji for an automated cell count in your protocol (example); a minimal sketch along these lines follows these comments. Commented 2 days ago
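
As a rough illustration of the automated approach these comments suggest, here is a minimal sketch in Python with scikit-image (a different tool than the ImageJ / Fiji named above); the filename, the threshold method, and the size cutoff are made-up assumptions, not values from this thread:

    # Hypothetical example: count bright nuclei in one fluorescence image.
    # "field_01.tif", Otsu thresholding, and min_size=30 are assumptions.
    from skimage import io, filters, measure, morphology

    img = io.imread("field_01.tif", as_gray=True)

    # Otsu's threshold separates stained cells from background
    mask = img > filters.threshold_otsu(img)
    mask = morphology.remove_small_objects(mask, min_size=30)  # drop debris

    labels = measure.label(mask)  # connected components = candidate cells
    print(f"Detected {labels.max()} candidate cells")

    # Print centroids so detections can be marked up and verified by eye,
    # as the comment about misplaced crosses suggests
    for region in measure.regionprops(labels):
        y, x = region.centroid
        print(f"cell at ({x:.0f}, {y:.0f}), area = {region.area}")

Touching or overlapping cells would need watershed splitting or similar; this only illustrates the count-then-verify idea from the comments.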

4 Answers


Congratulations on finishing your thesis, and I'm sorry to hear about the stress and lack of sleep. Try not to be so hard on yourself: think of the cell-count mismatch as measurement variability. I doubt anyone could have produced a perfect count; humans make mistakes, and that is to be expected. That said, it's good to take this seriously and check that the research findings are not sensitive to that variability. This kind of situation is fairly common, especially with manual, tedious analysis like hand-counting cells, so if you redo it later and the numbers don't match exactly, that's not surprising. The important question is whether the scientific conclusion changes, and in your case it sounds like it doesn't: the trend is the same and the p-values are essentially unchanged.

Since the conclusions didn't change, I would not panic or lose sleep over this. If you still have time and access to the data, the cleanest thing to do would be:

  1. redo the counts carefully (or have someone else independently count a subset),
  2. update the numbers if they shift slightly, and
  3. document the procedure clearly in your protocol.

But I would not go down a rabbit hole trying to reconcile every single discrepancy from months ago. Manual counting is not perfectly reproducible at the single-cell level.

On your other question: yes, it is normal that your advisor did not check the raw data at that level of detail. Advisors check that things look plausible and reasonable, and often focus more on the research questions and the quality of the writing. In most fields, it is the student's responsibility to double-check the data quality.

The fact that you went back, noticed the discrepancy, and are thinking carefully about it is exactly what a good researcher does. If the conclusions stand, you did your job. For the sake of future readers, I would recommend documenting somewhere that the original cell counts don't exactly match the recount, including the corrected counts and showing that the difference doesn't change the findings. That could be an erratum, a note in a preprint version online (e.g., bioRxiv), or part of a supplement to the thesis if one can be added.

  • Hi David, thank you for your response. I really do appreciate it. I've been having some serious anxiety about this and beating myself up over it. It sucks that I don't have a senior person in my lab to ask for advice. I am going to take a break from this and then come back to finish the document, and I will make sure to make a note of it for the next generation. This has been one of my fears; it might take some time, but I think it will pass. I am going to train some undergrads next month and will have them reanalyze my data. Commented Apr 25 at 22:49
  • @Matrix_Error I'm torn between making this its own answer or a comment here, but remember: you have basically the same p-value, and the same conclusions and implications. Alert your advisor and ask how errata or their equivalent work. To answer your direct question: you are not the first person, and will not be the last (especially at the master's level), to have discovered a late-breaking error. Breathe. You're okay. Commented Apr 26 at 0:33
  • There is a reasonable chance, by the way, that this doesn't even rise to the level of an erratum or a correction. Commented Apr 26 at 0:34
  • Thank you! I really do appreciate the answer and thoughtful words. I'm still learning that this type of stuff can happen and it's not the end of the world. Commented Apr 26 at 6:24
  • Is that ChatGPT? Commented 2 days ago

Are you within the measurement error of this kind of experiment?

Every measurement, every piece of experimental data, is "wrong"; nothing is perfect. So one should, ideally before starting the experiments, think about what kind of deviation is reasonable. The recount is 502 instead of 499? That's probably within the expected range, which means that, for experimental science, they are the "same" value. The count is 10 instead of 15? Off by a third; that doesn't sound right.

Getting exactly the same values is often how fraud is detected in publications: you never measure exactly the same data twice. That does not mean that one of the measurements is wrong; it just means that all real-world data-taking has finite accuracy.

So my advice: get a feeling for how big your deviations are (expressing them as percentages puts them in perspective) and get an idea of how much deviation can be expected; a rough sketch of that check follows.
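
As an illustration, a minimal version of that sanity check in Python; the counts 499 and 502 are the hypothetical numbers from above, and √N (Poisson counting statistics, as one comment notes) is used as the expected fluctuation:

    import math

    first, second = 499, 502       # hypothetical count and recount

    diff = abs(second - first)
    rel = 100 * diff / first       # deviation as a percentage
    poisson_sd = math.sqrt(first)  # expected fluctuation ~ sqrt(N)

    print(f"difference: {diff} cells ({rel:.1f}%)")
    print(f"expected counting fluctuation: ~{poisson_sd:.0f} cells")
    print("within expected range" if diff <= poisson_sd else "worth a closer look")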

  • Also worth noting that the manual counting is just one source of variability in the data. Maybe some cells didn't stain properly, or some died in handling, or there are image artifacts, etc. The cell count might differ from the "true" number of cells for a number of reasons, of which the manual counting step is just one - even if you count exactly correctly, there might be an issue with the image. You'd get slightly different results if you redid any step of the data generation process. If you count well but stain poorly, for example, counting variability shouldn't be your biggest concern. Commented 2 days ago

I would not expect your advisor or anyone else to check the entirety of your work. It would be good to do some validation on a subset to get an idea of reliability (both of your own counts and inter-rater), but if you've already submitted your thesis and didn't do this, I would not worry at all. However, some things your advisor probably should have been aware of and should have trained you on or prepared you for are:

  1. Error is normal in manual cell counts. I would not blink at discrepancies of 5-10%, and larger discrepancies may occur when the task is harder (for example, if you're also scoring something difficult like morphology).

  2. Automated methods also have some intrinsic error rate; even if they give you exactly the same count from the same images, that just means they make the same errors repeatedly.

  3. The more important thing to concern yourself with is not measurement error but measurement bias. An example source of bias would be counting all your control samples first and then your test samples, or always starting your day with a control sample and then doing a test sample. Your counts might change either as you become more expert at the task or as you tire of it. It is much better to randomize the order (see the sketch after this answer).

If you've gotten different counts but the same statistical result, then nothing has meaningfully changed.
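
For instance, a minimal way to randomize the counting order, as point 3 recommends; the file names and group sizes here are made up for illustration:

    import random

    # Hypothetical image lists: interleave control and treatment instead of
    # counting all controls first, which confounds order with condition.
    images = ([f"control_{i:02d}.tif" for i in range(10)]
              + [f"treated_{i:02d}.tif" for i in range(10)])

    random.seed(42)         # record the seed so the order is reproducible
    random.shuffle(images)

    for name in images:
        print(name)         # count the images in this order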


I want to reinforce what other answers have said more or less directly: Counting is a measurement, and every measurement has an associated uncertainty. This is a fact of science and not a moral failing; please do not be ashamed.

In many counting measurements, the uncertainty is effectively zero, so we often ignore it. Your case is clearly one in which the counting uncertainty is nonzero. Whether this uncertainty is significant seems to have been answered indirectly, in the negative, by the fact that your statistics remain unchanged with the second count. But if you wanted to be most thorough, you could incorporate the counting uncertainty explicitly into your analysis: you have two data points, so you can estimate a variance (see the sketch below).
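
A minimal sketch of that estimate; the two counts are the hypothetical numbers used elsewhere on this page, and treating them as repeat measurements of the same image is an assumption:

    import statistics

    counts = [499, 502]            # hypothetical first count and recount

    mean = statistics.mean(counts)
    sd = statistics.stdev(counts)  # sample SD from n = 2: a crude estimate

    print(f"best estimate: {mean:.1f} +/- {sd:.1f} cells")
    # This per-image counting uncertainty can then be propagated into the
    # group comparison, e.g. by folding it into each group's standard error.

With only two data points the variance estimate is very rough, but it at least puts a number on the counting uncertainty instead of ignoring it.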
