Questions tagged [exploratory-data-analysis]

EDA stands for "exploratory data analysis". The approach was developed by Tukey to contrast with confirmatory data analysis, or CDA (the formal testing of hypotheses). EDA is typically concerned with describing data numerically and graphically to make the data easier to understand and to yield new insights.

0 votes
0 answers
26 views

It's 2025, and yes, I'm still using SAS EMiner's Decision Tree... If anyone knows a modern freeware version that replicates the Interactive mode effectively (with controlling split cutoff values, a ...
Anthony Galka's user avatar
0 votes
1 answer
51 views

I come from a machine learning background; however, I am trying to learn more traditional data science. I have a dataset of vehicles, and the target is the Breakdown Likelihood (1 to 3, 1 being lowest), ...
92carmnad's user avatar
3 votes
1 answer
32 views

I'm working on a classification problem where the goal is to maximize the F1-score, hopefully above 80%. Despite a very thorough EDA and preprocessing workflow, I've hit a hard performance ceiling ...
hijunyng's user avatar
4 votes
5 answers
708 views

It’s confusing to understand how quartile values can actually be used to give insights into a dataset. Please assist with examples. I struggle to interpret the values in the context of providing ...
Buchi's user avatar
0 votes
0 answers
36 views

So, I have a general question regarding PCA. As far as I understand, before performing PCA you are supposed to perform a correlation analysis between the features so that redundant features can be ...
Sunera Wijeratne's user avatar
1 vote
1 answer
78 views

I have done an Exploratory Factor Analysis. I want fit measures for the model. I am using JASP and Jamovi. I need the Goodness-of-Fit Index (GFI), Adjusted GFI (AGFI) and Normed Fit Index (NFI). I tried SEM ...
Dididamdumdum's user avatar
2 votes
1 answer
178 views

I am currently conducting a factor analysis on a scale that theoretically consists of five factors. However, both Principal Component Analysis (PCA) and Maximum Likelihood (ML) extraction methods in ...
hakan karatepe's user avatar
0 votes
0 answers
41 views

I have a dataset of 28 personality assessment features, which measures personality attributes like Diligence or Sociability to determine performance in the corporate workplace. I'm tasked with ...
Michael Tran's user avatar
0 votes
0 answers
89 views

For ratio scale data it is relatively simple to create and visualize a correlation matrix, e.g. as shown below. How can I do the same for a data frame that also contains nominal scale data? I would like ...
Tamas's user avatar
14 votes
3 answers
794 views

I have a fairly basic question about analyzing a dataset of measurements taken on a number of fish, which I’m doing as part of a student project. So I have measurements of four species of fish of ...
roland222x's user avatar
8 votes
1 answer
541 views

I am doing a regression analysis of environmental data, and I encounter some rather specific relationships between my predictors and the response variable. I am doubtful that a simple linear ...
Olejjio's user avatar
0 votes
0 answers
99 views

I'm trying to identify the relationship between the dependent variable and the independent variables. I've utilized linear regression, but I'm not sure if it's suitable given the distribution of my ...
Chemokine1's user avatar
1 vote
0 answers
59 views

Let's say we are given a time series sample and want to create a model to forecast future values of said time series. When trying to build a model to forecast time series data, many statistics ...
QMath's user avatar
3 votes
1 answer
152 views

I'm currently looking at three specific questions of a feedback survey and have been tasked with finding out the characteristics of the lowest scorers, to see if there are any patterns or common ...
sixfortyseven's user avatar
0 votes
0 answers
104 views

I'm trying to carry out my first EDA on a Student Performance dataset. The dataset has 395 samples and consists of 33 attributes. After drawing the boxplots and doing some tests I detected outliers in ...
Christina Kataki's user avatar
