Questions tagged [statistics]
Statistics is a scientific approach to inductive inference and prediction based on probabilistic models of the data. By extension, it covers the design of experiments and surveys to gather data for this purpose.
1,113 questions
3
votes
1
answer
66
views
What is the statistical distinction between a linear model and a generalized linear model?
These concepts seem easy but are really difficult to understand. Please explain the idea in a non-technical way.
2
votes
1
answer
39
views
Is there a difference between r (the sample correlation coefficient) and the rho coefficient?
The two concepts lack a clear meaning. To me, rho appears to reflect the validity and r the sample correlation. Is this understanding valid?
2
votes
0
answers
29
views
Calendar for weekly forecasting
I am forecasting weekly data aggregated from daily data. However, weeks don't align year over year, causing a form of phase shift. Additionally, some weeks span two years. I read that I can use a 4-4-5 ...
0
votes
0
answers
16
views
Comparing self-selected populations
Suppose I am a free consultant for a single industry of 300 total companies, offering operational consultations. In 2024, I consulted for 100 companies. To expand my practice, I'd like to show that my ...
4
votes
0
answers
46
views
How to detect issues with time series data from multiple related measuring devices?
I think this is quite a detailed problem, so let me provide some context first. I have a fairly complex electrical circuit that I monitor regularly to make sure it is functioning properly. To do ...
4
votes
0
answers
46
views
Estimating Final Vehicle Counts from Pairwise Marginals Using Python
I am working with vehicle registration data from a website. The website provides counts for various combinations of vehicle attributes such as Maker, RTO, Fuel, Category, SubCategory, and Emission.
...
2
votes
0
answers
42
views
Comparing demographics for hierarchical data
A common request I get is to compare demographics for 2 businesses. However, the data is nested (hierarchical). Each business has a unique set of locations, and the customer data comes from each location.
...
5
votes
1
answer
107
views
How to evaluate a new reranker in a RAG system
I'd like to compare two rerankers in a multi-step chunk retrieval pipeline.
Here's a rough schema:
Query reformulation → semantic search → reranker.
The issue is that the reformulation and semantic ...
0
votes
0
answers
32
views
Repeated Measures Correlation Question
I am currently using repeated measures correlation to calculate the correlation between 2 variables in repeated measures data (link to paper).
In the paper, equation 4 denotes how repeated measures ...
0
votes
0
answers
33
views
Can I use the slope of a regression to establish a correlation if R-squared is less than 20%?
I fit a regression line between a variable and a target value. The coefficient of determination (R-squared) between the two is very low (< 20%). Does the calculated slope hold any significance in this ...
10
votes
3
answers
1k
views
Get function parameter for functions fitting "between" two other functions
I have two functions $f_1$ and $f_2$ evaluated (measured) at points $x_i$. They depend on a parameter $p \in [a,b]$, $a,b \in \mathbb{R}$, with $f_1=f_1(x_i;a)$ and $f_2=f_2(x_i;b)$.
The mapping $\Pi:p ...
1
vote
1
answer
325
views
Best way to generate synthetic data
I want to hear your opinions on synthetic data generation: which methods and tools you use, and why.
4
votes
2
answers
400
views
What are the best learning resources for data science and machine learning for a full-stack developer?
To introduce myself: I am a student with a full-stack background. I know SQL and how to manage data using CRUD (Insert, Delete, etc.) operations with C#, Python and Java. I have knowledge of system design ...
5
votes
1
answer
87
views
Are changes over time significant?
I am not sure if this is the right place to ask, but I have two fecundity datasets per year. One for males, the other for females:
To give an excerpt of the data:
Gender | year | number born
M | 1990 | 1
M ...
2
votes
0
answers
36
views
Poisson model with repeated measures and endogeneity?
I have to analyze whether redeeming rewards impacted visits to our business. However, you earn more rewards the more you visit.
In other words:
More visitations = more rewards
but we want to know if
...