Questions tagged [descriptive-statistics]
Descriptive statistics summarize features of a sample, such as the mean and standard deviation, the median and quartiles, and the maximum and minimum. With multiple variables, they may include correlations and crosstabs. They can also include visual displays such as boxplots, histograms, and scatterplots.
1,841 questions
5
votes
3
answers
514
views
How to estimate the standard deviation of a ratio?
I’m working with summary statistics (means, standard errors, and sample sizes) for two variables, and I’ve derived:
SD of variable A from its standard error and sample size
The ratio (P = A / B ) (e....
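The excerpt is cut off, so the exact setup is not known. As a minimal sketch of one common approach when only summary statistics are available, the snippet below recovers an SD from a standard error and then applies the first-order delta method to the ratio of two means. It assumes A and B are independent and that the mean of B is well away from zero; the function names and the numbers are hypothetical, chosen only for illustration. Note that this yields the standard error of the ratio of means, not the SD of individual-level ratios.

```python
import math


def sd_from_se(se, n):
    """Recover a sample SD from the standard error of the mean: SD = SE * sqrt(n)."""
    return se * math.sqrt(n)


def ratio_se_delta(mean_a, se_a, mean_b, se_b):
    """First-order delta-method SE of mean_a / mean_b.

    Assumes the two means are estimated independently (no covariance term).
    """
    ratio = mean_a / mean_b
    relative_variance = (se_a / mean_a) ** 2 + (se_b / mean_b) ** 2
    return abs(ratio) * math.sqrt(relative_variance)


# Hypothetical summary statistics, for illustration only.
sd_a = sd_from_se(se=0.5, n=25)
ratio_se = ratio_se_delta(mean_a=10.0, se_a=0.5, mean_b=20.0, se_b=1.0)
print(f"SD of A: {sd_a:.3f}, approximate SE of A/B: {ratio_se:.4f}")
```

If A and B are measured on the same units, a covariance term would need to be subtracted inside the square root, which requires information the summary statistics alone do not provide.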
0
votes
0
answers
40
views
Problem with data cleaning
The Union of India has undergone frequent political re-organizations since independence. The problem today (for me) is that I've been unable to account for certain data values of the following states/...
1
vote
0
answers
40
views
Sample standard deviation of a group of samples
I'm in introductory statistics. I have an idea of the answer here but I am unsure.
In this problem I have four random samples from a population of 60, each with a size of $n=10$. After calculating the ...
3
votes
1
answer
81
views
How can I best describe my data with descriptive statistics?
Preface
This is a follow-up to my previous question about the same data, where there is extended discussion in the comments. The scope is different, as I have now realised that I was not asking the ...
0
votes
0
answers
26
views
What's the magic number in modified z-score? [duplicate]
I often see this formula used for the modified z-score:
$$
M_i = \frac{0.6745\,(x_i - \tilde{x})}{\mathrm{MAD}}
$$
While the rest of the formula makes sense, the inclusion of $0.6745$ (what programmers ...
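For context, the constant is approximately the 0.75 quantile of the standard normal distribution, which is what makes MAD / 0.6745 a consistent estimate of the standard deviation for normally distributed data. The sketch below checks that numerically and computes modified z-scores using only the Python standard library; the function name and the toy data are made up for illustration.

```python
from statistics import NormalDist, median

# The 0.75 quantile of the standard normal; rounds to 0.6745.
K = NormalDist().inv_cdf(0.75)


def modified_z_scores(xs):
    """Return 0.6745 * (x_i - median) / MAD for each x_i in xs."""
    med = median(xs)
    mad = median([abs(x - med) for x in xs])
    if mad == 0:
        raise ValueError("MAD is zero; modified z-scores are undefined")
    return [K * (x - med) / mad for x in xs]


# Toy data with one obvious outlier, for illustration only.
data = [1, 2, 3, 4, 100]
scores = modified_z_scores(data)
print(f"K = {K:.4f}")
print([round(s, 2) for s in scores])
```

A cutoff such as |M_i| > 3.5 is often quoted as a rule of thumb for flagging outliers, but the appropriate threshold depends on the application.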
1
vote
0
answers
38
views
What type of statistical test should I use to know if time per order decreased per person after a protocol change?
I have a large set of data from hospital providers. Many providers were unnecessarily ordering extended monitoring for patients on a remote monitoring system. A new protocol went into effect to ...
2
votes
2
answers
156
views
Bayesian and non-Bayesian way to multiply two proportions and propagate uncertainty
I am interested in estimating something like the proportion of Americans who are car owners and own red cars. Let's say I estimate this like so:
P1: The proportion of Americans who are car owners --> ...
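The excerpt is truncated, so the second quantity is an assumption here: P2 is taken to be the proportion of car owners whose car is red, estimated from a sample independent of the one used for P1. Under that assumption, a minimal sketch of two ways to propagate uncertainty into P1 × P2 is a delta-method standard error (non-Bayesian) and Monte Carlo draws from independent Beta posteriors with uniform priors (Bayesian). All counts and function names below are hypothetical.

```python
import math
import random


def product_se_delta(p1, n1, p2, n2):
    """Delta-method SE of p1 * p2 for two independent sample proportions."""
    var1 = p1 * (1 - p1) / n1
    var2 = p2 * (1 - p2) / n2
    return math.sqrt(p2 ** 2 * var1 + p1 ** 2 * var2)


def product_posterior_draws(k1, n1, k2, n2, draws=20000, seed=0):
    """Draws of p1 * p2 from independent Beta(k + 1, n - k + 1) posteriors."""
    rng = random.Random(seed)
    return [
        rng.betavariate(k1 + 1, n1 - k1 + 1) * rng.betavariate(k2 + 1, n2 - k2 + 1)
        for _ in range(draws)
    ]


# Hypothetical counts: 800 of 1000 people own a car, 80 of 800 owners have a red car.
k1, n1 = 800, 1000
k2, n2 = 80, 800
p1, p2 = k1 / n1, k2 / n2

estimate = p1 * p2
se = product_se_delta(p1, n1, p2, n2)
wald_interval = (estimate - 1.96 * se, estimate + 1.96 * se)

samples = sorted(product_posterior_draws(k1, n1, k2, n2))
credible_interval = (
    samples[int(0.025 * len(samples))],
    samples[int(0.975 * len(samples))],
)
print(f"estimate = {estimate:.4f}, delta-method SE = {se:.4f}")
print("approximate 95% Wald interval:", wald_interval)
print("approximate 95% credible interval:", credible_interval)
```

With samples this large the two intervals should be close; they can diverge when counts are small or a proportion is near 0 or 1, where the normal approximation behind the delta method is weaker.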
10
votes
1
answer
553
views
Does it make sense to average standard errors like this?
I have a collection of many experiments. In each experiment I do a simple linear regression where I get a regression parameter $\theta_i$ and I also compute the standard error of $\theta_i$ which I ...
4
votes
1
answer
291
views
Best hypothesis testing methods for non-probabilistic data
In my qualitative content analysis dataset, 17 out of 20 case studies support AI-enhanced NPV (p = 17/20 = 0.85) and 3 out of 20 use traditional NPV (p = 3/20 = 0.15). In reality, thousands of companies ...
-2
votes
1
answer
80
views
Looking for a free statistical analysis software to perform Descriptive Statistics on two interlinked Data Sets [closed]
I am looking for a free statistical analysis software (like GNU PSPP, jasp-stats, jamovi) to perform (mainly) Descriptive Statistics, on a big set of data. However, I want to stay away from Excel.
I ...
1
vote
0
answers
31
views
Volatility Estimation and forecasting for Ultra-High frequency data
I am unable to find methods for volatility estimation in ultra-high-frequency settings. I am aware of HAR-RV and its counterparts. These models seem to estimate daily volatility using high-frequency data. ...
0
votes
1
answer
65
views
Guidance for communicating insights to inform breakdown companies how to assess breakdown risk [closed]
I come from a machine learning background; however, I am trying to learn more traditional data science. I have a dataset of vehicles and the target is the Breakdown Likelihood (1 to 3, 1 being lowest), ...
3
votes
1
answer
319
views
What type of data is data on the number of shark attacks in different regions as registered in ISAF from 2004 to 2013?
I am aware that this type of data corresponds to spatial series, since it varies over different locations. But apart from this, can we also classify it as a discrete and quantitative variable, ...
9
votes
2
answers
809
views
Is neutrally skewed the correct interpretation of a box plot with equal length arms?
Consider a box-and-whisker plot in which both arms are the same length. Is it correct to say it is neutrally skewed or an even distribution?
2
votes
1
answer
119
views
Quantitatively determining unexplored parameter spaces [closed]
If we have a high-dimensional dataset (7-10 columns) of continuous variables like Time, Temperature etc. recorded from experiments (not performed by us), are there established methods to quantitatively ...