How to find the Poisson distribution that best fits this observed data in Mathematica?

Question

I am working with the following data, originally from Rice’s Mathematical Statistics and Data Analysis (page-256), representing the number of alpha particle counts per 10-second interval:

    (* Number of clicks per 10-second interval *)
clicksPerTenSeconds = Range[0, 17];

(* Observed counts per interval *)
observedCounts = {1, 6, 11, 28, 56, 105, 126, 146, 164, 161, 
  123, 101, 74, 53, 23, 15, 9, 5};

(* Create list of points *)
points = Table[{clicksPerTenSeconds[[i]], observedCounts[[i]]}, 
   {i, Length[clicksPerTenSeconds]}];

(* Plot observed data *)
ListLinePlot[
  points,
  PlotMarkers -> {"o", Medium},
  PlotStyle -> Blue,
  PlotRange -> All,
  Frame -> True,
  FrameLabel -> {"Number of Clicks per 10-Second Interval", "Observed Counts"},
  FrameTicks -> {{Range[0, 160, 20], None}, {Range[0, 17, 1], None}},
  PlotLabel -> "Observed Alpha Particle Emission per 10-Second Interval",
  ImageSize -> Large,
  GridLines -> {Range[0, 17, 1], Range[20, Max[observedCounts], 20]},
  GridLinesStyle -> Directive[Gray, Dashed],
  LabelStyle -> {Black, 16, FontFamily -> "Times"]
]

This produces the plot of observed counts vs. the number of clicks per interval. Now I would like to find the Poisson distribution $e^{-\lambda} \lambda^{k}/k!$ that best fits this data?

Now I wish proceed by calculating the sample mean and take that as the best guess of $\lambda$ and using that value of $\lambda$ plot that fit with the actual data plot given above. How can one achieve this?

The data you present isn't exactly the data provided in the referenced text. The first frequency count is associated with 0, 1, and 2 clicks per interval and the last frequency count is associated with 17 or more clicks per interval. Why are you changing the data? — JimB
– JimB, Commented yesterday

Joshua Schrier · Accepted Answer · 2025-10-30 17:17:13Z

@ydd's answer takes advantage of the fact that the parameter (lambda) in the PoissonDistribution is the sample mean.

Another, more general way to calculate this is to perform a maximum likelihood parameter estimate using EstimatedDistribution:

(* data from original problem *)
clicksPerTenSeconds = Range[0, 17];
observedCounts = {1, 6, 11, 28, 56, 105, 126, 146, 164, 161, 123, 101, 74, 53, 23, 15, 9, 5}; 

(* perform maximum likelihood estimation *)
data = WeightedData[clicksPerTenSeconds, observedCounts];
distribution = EstimatedDistribution[data, PoissonDistribution[lambda]]

(* plot comparison of results *)
Show[
 Histogram[data, {1}, "PDF", AxesLabel -> {"Clicks / 10 seconds", "PMF"}],
 DiscretePlot[PDF[distribution, x], {x, 0, 17}, Joined -> True]]

ydd · Accepted Answer · 2025-10-30 16:23:11Z

4

Use WeightedData:

w = WeightedData[clicksPerTenSeconds, observedCounts];

λ = Mean[w] // N
(*8.36288*)

We can also calculate the mean ourselves:

clicksPerTenSeconds . observedCounts/Total[observedCounts]//N
(*8.36288*)

edited yesterday

answered yesterday

ydd

8,7441 gold badge12 silver badges28 bronze badges

$\begingroup$ Thanks @ydd, I'm a beginner to both Mathematica and Probability theory. Could you elaborate on "what actually WeightedData[clicksPerTenSeconds, observedCounts] does here?" $\endgroup$

seeker
– seeker

2025-10-30 16:04:30 +00:00
Commented yesterday
$\begingroup$ It represents the observation values in clicksPerTenSeconds (so 0 clicks, 1 click, ... 17 clicks) with weights observedCounts (so 0 clicks was observed 1 time, 1 click was observed 6 times, ..., 17 clicks was observed 5 times). It may help to read the WeightedData documentation $\endgroup$

ydd
– ydd

2025-10-30 16:07:35 +00:00
Commented yesterday

Add a comment |

JimB · Accepted Answer · 2025-10-31 03:47:56Z

The data in the referenced text has two complicating factors: (1) the first group has the count for 0, 1, and 2 clicks per interval and (2) the last group has the count for 17 or more clicks per interval. That dataset is from a censored distribution.

If there wasn't that complication (the censoring), the maximum likelihood estimator is just the sample mean as given by @ydd.

To find the maximum likelihood estimator for the data with the censoring one needs to construct the likelihood. Then one finds the value of $\lambda$ that maximizes the likelihood or the log of the likelihood (which is more numerically stable).

To construct the likelihood the first and last count requires the cmf (cumulative mass function) of the Poisson distribution and the rest requires the pmf (probability mass function).

data = {{2, 18}, {3, 28}, {4, 56}, {5, 105}, {6, 126}, {7, 146}, 
  {8, 164}, {9, 161}, {10, 123}, {11, 101}, {12, 74}, {13, 53}, 
  {14, 23}, {15, 15}, {16, 9}, {17, 5}};

likelihood = CDF[PoissonDistribution[\[Lambda]], 2]^data[[1,2]]*
    Product[PDF[PoissonDistribution[\[Lambda]], data[[i, 1]]]^data[[i, 2]], {i, 2, Length[data] - 1}]*
    (1 - CDF[PoissonDistribution[\[Lambda]], 16])^data[[Length[data],2]];
logL = Log[likelihood] //. {Log[a_ b_] -> Log[a] + Log[b], 
    Log[Exp[a_]] -> a, Log[a_^b_] -> b Log[a]};
Plot[logL, {\[Lambda], 5, 12}]

We see that the maximum occurs around $\lambda=8.5$.

mle = FindMaximum[logL, {{\[Lambda], 8.5}}]
(* {-2990.67, {\[Lambda] -> 8.36919}} *)

Using Mathematica's probability functions one can also do the following to obtain the maximum likelihood estimate:

data = {{2, 18}, {3, 28}, {4, 56}, {5, 105}, {6, 126}, {7, 146}, {8, 
    164}, {9, 161}, {10, 123}, {11, 101}, {12, 74}, {13, 53}, {14, 
    23}, {15, 15}, {16, 9}, {17, 5}};
dist = CensoredDistribution[{2, 17}, PoissonDistribution[\[Lambda]]]
mle = FindDistributionParameters[WeightedData[data[[All, 1]], data[[All, 2]]], 
  dist, {{\[Lambda], 8.5}}]
(* {\[Lambda] -> 8.36919} *)

A next step might be performing a goodness-of-fit test and if the fit for a Poisson seems reasonable followed by finding a measure of precision for the estimate of $\lambda$

azerbajdzan · Accepted Answer · 2025-10-31 10:05:48Z

This is answer to the other half of the question - how to plot best fit over OP's original plot.

Take your favorite λ estimate from other answers or from the linked PDF file. I have chosen λ -> 8.36.

(*Number of clicks per 10-second interval*)
clicksPerTenSeconds = Range[0, 17];

(*Observed counts per interval*)
observedCounts = {1, 6, 11, 28, 56, 105, 126, 146, 164, 161, 123, 101,
    74, 53, 23, 15, 9, 5};

(*Create list of points*)
points = Table[{clicksPerTenSeconds[[i]], observedCounts[[i]]}, {i, 
    Length[clicksPerTenSeconds]}];

(*Plot observed data*)
ListLinePlot[points, PlotMarkers -> {"o", Medium}, PlotStyle -> Blue, 
  PlotRange -> All, Frame -> True, 
  FrameLabel -> {"Number of Clicks per 10-Second Interval", 
    "Observed Counts"}, 
  FrameTicks -> {{Range[0, 160, 20], None}, {Range[0, 17, 1], None}}, 
  PlotLabel -> 
   "Observed Alpha Particle Emission per 10-Second Interval", 
  ImageSize -> Large, 
  GridLines -> {Range[0, 17, 1], Range[20, Max[observedCounts], 20]}, 
  GridLinesStyle -> Directive[Gray, Dashed], 
  LabelStyle -> {Black, 16, FontFamily -> "Times"}];

(* plot with estimated λ *)

Plot[Total@observedCounts E^-λ λ^k/k! /. λ -> 
    8.36, {k, 0, 17}, PlotStyle -> ColorData[97, 2]];

Show[%%, %]

Stack Exchange Network

How to find the Poisson distribution that best fits this observed data in Mathematica?

4 Answers 4

Hot Network Questions

How to find the Poisson distribution that best fits this observed data in Mathematica?

4 Answers 4

Related

Hot Network Questions