Questions tagged [data-analysis]
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
284 questions
3 votes, 1 answer, 121 views
The term "validity coefficient" is frequently used in books on statistics and on meta-analysis (Hedges and Olkin). Please elaborate on the concept
I am confused by the terms validity coefficient and reliability as used in statistics and mathematics. Usually, psychometric data is generated while executing empirical studies in psychology, education ...
9 votes, 3 answers, 521 views
Product distribution between a set of machines
I'm working on a problem that deals with product distribution between a set of machines in a factory. Here's what I'm given (variables):
The amount of products produced daily, e.g. 19200
The amount ...
1 vote, 1 answer, 44 views
Expand or explode data in this case?
I am a student working on a data analysis project in which I have to plot the duration of treatment per infectious site (boxplot).
My infectious site variable can contain more than one infectious site (example ...
3 votes, 1 answer, 163 views
Tips for pivoting into a Data Analyst role from a different career background
I’m currently exploring a career pivot into data analytics. My background is in experience design and textile technology, and I’ve been building skills in SQL, Excel and, more recently, R.
I’d love to ...
4 votes, 1 answer, 218 views
Suggestion for data analysis with meteorological data
Not sure if this is the right place to ask for advice about my issue; if not, sorry, you can close this post.
I have a project at university where I have to analyse a dataset with meteorological ...
7 votes, 1 answer, 122 views
Wind Power Data Analysis - Python
I am seeking some help and/or perspectives in solving a problem.
I have a dataset (accessible here) with the following columns:
DATE: this is the date in dd/mm/yyyy format
HH: this is the "half-...
1 vote, 0 answers, 39 views
SHAP vs. Manual Analysis: Why Opposite Correlations for a feature?
When plotting a SHAP beeswarm plot on my binary classification model (predicting subscription renewal probability), one of the columns indicates that high feature values correlate with low SHAP values ...
1 vote, 0 answers, 34 views
Advice Needed: Statistical Analysis for Non-Normal Data with Unequal Sample Sizes
I’m working on a dataset using Python where I’m analyzing the impact of smoking status, age, and sex on response times (also height and weight). The data is highly skewed and not normally distributed, ...
2 votes, 1 answer, 76 views
How to become a data analyst
What are the differences between data science and data analytics, and what are the key skills required to become a good data analyst?
1 vote, 0 answers, 85 views
Why do many phenomena exhibit logarithmic-like growth patterns?
I've noticed that in various contexts—such as ratings, lifetime view counts, and reputation on online platforms—things often grow in a way that resembles a logarithmic curve. They experience rapid ...
0 votes, 1 answer, 47 views
Clarification of the method in topological data analysis
I want to detect stock price crashes using topological data analysis. For example, I have taken an Excel file named tesla with columns date, open, high, low, close and volume. I want the time series of ...
1 vote, 1 answer, 43 views
Getting ideas to start my learning
I plan to start my career in data analytics and I need a guideline on how and where to start. If you are ready to give some hints, I'll get some clarity from them and I'll start my ...
0 votes, 1 answer, 66 views
How to Classify Blueberries as "Crunchy", "Juicy" or "Soft" using Acoustic Signal Processing and Machine Learning?
I'm working on a project to classify blueberries based on their texture—specifically, whether they are soft, juicy, or crunchy—using the sounds they produce when crushed. I have about 1100 audio ...
-1 votes, 1 answer, 36 views
machine learning - data science - data analysis
I have a research project in the machine learning area. In this study, the dataset contains more than 4000 numbers categorized in four columns. I am going to find or predict a possible relation between ...
0 votes, 0 answers, 49 views
Is it okay to ignore missing data despite a very small percentage?
My dataset has seven columns and 63 million rows. There are two columns with missing data. The first one [Terminal Number] has 400k rows of missing data but thankfully, I can extract values from another ...