$\begingroup$

I am new to R. While working on my university assignments, I found that legends for base R plots do not show the correct information, so I switched to ggplot2 wherever legends were needed.

I observed that although base R color-codes the data (for example by CLASS, as required in our assignment), the legend fails to show the right CLASS for each color. For instance, if cyan actually represents CLASS A5 in the graph (given the position of the points), the legend may label cyan as CLASS A3. There is no way to know it is wrong until you plot the same data with ggplot2 and compare the two.

The same error never occurs with ggplot2. I have attached both results and the code for comparison.

I used the code below for base R:

#A scatter-plot of SHUCK versus VOLUME differentiated by CLASS
plot(y=mydata$SHUCK,x=mydata$VOLUME,main = "SHUCK versus VOLUME (differentiated by CLASS)",col=mydata$CLASS, xlab = 'Volume',ylab = 'Shuck', pch=16)
# Add a legend
legend("topleft", legend=levels(mydata$CLASS), pch=16, col=unique(mydata$CLASS))

[Image: base R scatter plot of SHUCK versus VOLUME with legend]

If I run similar code using ggplot2, the legend shows a different result. I used the code below for ggplot2:

x <- ggplot(mydata, aes(VOLUME, SHUCK)) + theme_bw()
x + geom_point(aes(fill = CLASS), shape = 23, alpha = 0.75)

[Image: ggplot2 scatter plot of SHUCK versus VOLUME with legend]

To clarify further: comparing the base R and ggplot2 images, the points that ggplot2 shows as Class A5 in pink are labeled Class A3 in cyan by base R, which is wrong.

I know I am doing something wrong in base R. How should I add a legend in base R so that it matches the order of the color coding in the graph and accurately represents the class of each data point for categorical data?

Has anyone experienced the same? Any guidance would be helpful. Thanks.

$\endgroup$
  • $\begingroup$ This is really a very weird error. I have never encountered anything like this. Is this happening time and again when you're re-running the code? $\endgroup$ Commented Nov 29, 2021 at 4:59
  • $\begingroup$ @Shibaprasadb yes $\endgroup$ Commented Nov 29, 2021 at 7:55
  • $\begingroup$ The color in the base plot is done in order of the data, not the grouping of your data. Try ordering your data on CLASS and create a factor of CLASS before plotting. That should help. Otherwise, add a dput of your data to the question. $\endgroup$ Commented Nov 29, 2021 at 10:54
  • $\begingroup$ @phiver I tried ordering the data by the CLASS column and got the correct legend for the plot this time! However, CLASS was already a factor; I just ordered the data. Thanks for the advice, it worked! $\endgroup$ Commented Dec 1, 2021 at 6:58
  • $\begingroup$ The code I added to order the data by the CLASS column (a factor): ordered_data <- mydata[order(mydata$CLASS), ] $\endgroup$ Commented Dec 1, 2021 at 7:00
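
For anyone reading later, here is a minimal sketch of what seems to cause the mismatch and of an alternative fix that does not depend on row order. The data frame below is a hypothetical stand-in, since the real `mydata` is not shown in the question; only the column names VOLUME, SHUCK and CLASS are taken from it. The idea is that `plot()` colors points by the factor's integer codes, while `unique(mydata$CLASS)` returns classes in order of first appearance, so the legend colors only line up with `levels(mydata$CLASS)` when the data happen to be sorted by CLASS.

```r
# Hypothetical stand-in for mydata; the real data set is not shown in the question.
mydata <- data.frame(
  VOLUME = c(10, 20, 30, 40, 50, 60),
  SHUCK  = c(1, 2, 3, 4, 5, 6),
  CLASS  = factor(c("A3", "A1", "A2", "A3", "A1", "A2"),
                  levels = c("A1", "A2", "A3"))
)

# unique() keeps the order of first appearance, not the level order.
appearance_order <- as.character(unique(mydata$CLASS))  # "A3" "A1" "A2"
level_order      <- levels(mydata$CLASS)                # "A1" "A2" "A3"

# One color index per level, in level order. This matches col = mydata$CLASS,
# which base graphics interprets through the factor's integer codes.
legend_cols <- seq_along(level_order)

plot(SHUCK ~ VOLUME, data = mydata, col = mydata$CLASS, pch = 16,
     main = "SHUCK versus VOLUME (differentiated by CLASS)",
     xlab = "Volume", ylab = "Shuck")
legend("topleft", legend = level_order, col = legend_cols, pch = 16)
```

Sorting the data first, as in the comment above, works for the same reason: once the rows are ordered by CLASS, `unique()` returns the classes in level order. Using `seq_along(levels(...))` should also hold up when the rows are unsorted.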
