$\begingroup$

I have a data frame with 34154695 observations. In the dataset, the Class variable takes the value 0 for "not purchased" and 1 for "purchased".

> str(data)
'data.frame':   34154695 obs. of  5 variables:
 $ SessionID: int  1 1 1 2 2 2 2 2 2 3 ...
 $ Timestamp: Factor w/ 34069144 levels "2014-04-01T03:00:00.124Z",..: 1452469 1452684 1453402 1501801 1501943 1502207 1502429 1502569 1502932 295601 ...
 $ ItemID   : int  214536500 214536506 214577561 214662742 214662742 214825110 214757390 214757407 214551617 214716935 ...
 $ Category : Factor w/ 339 levels "0","1","10","11",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Class    : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...

I am having trouble plotting histograms of the number of purchases per week, per day, and by time of day, counting only rows where Class = 1. I would like output like the two example plots attached.

Could someone please tell me how I should proceed? Thank you for any help and suggestions.

Kind regards

$\endgroup$

1 Answer

$\begingroup$

First of all, to get aggregated data you need to group it by the time unit you care about (day, week, or month). For example, by month:

library(dplyr)
library(lubridate)
library(ggplot2)

# Parse only the date part of the timestamp. as.Date() returns a Date, which dplyr
# accepts; strptime() returns POSIXlt, which dplyr cannot use as a column.
x <- as.Date(as.character(data$Timestamp), format = "%Y-%m-%d")

data$month <- month(x)  # month number (1-12) taken from the date

# Keep purchases only (Class == "1"), then count them per month.
month_summ <- data %>%
  filter(Class == "1") %>%
  group_by(month) %>%
  summarise(total_sales = n())

# Bar chart of the number of purchases per month.
ggplot(data = month_summ, aes(x = month, y = total_sales)) +
  geom_bar(stat = "identity")

This should do the job, or at least serve as a starting point. Here is a good reference for ggplot2 bar plots.
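For the per-week, per-day, and time-of-day counts asked about in the question, the same idea can be sketched in base R. This is only a minimal sketch under some assumptions: the toy data frame below is made up to mimic the question's columns, the timestamps are treated as UTC (because of the trailing "Z"), and "week" is taken to mean the ISO 8601 week. The timestamp is parsed to POSIXct rather than POSIXlt, since POSIXct is a plain numeric vector that can safely be stored in a data frame column.

```r
# Toy data in the same shape as the question's data frame (values are made up).
data <- data.frame(
  SessionID = c(1, 1, 2, 3, 3, 4),
  Timestamp = factor(c("2014-04-01T03:00:00.124Z", "2014-04-01T15:30:10.000Z",
                       "2014-04-02T15:05:00.500Z", "2014-04-08T09:00:00.000Z",
                       "2014-04-08T09:45:00.000Z", "2014-04-09T21:10:00.000Z")),
  Class = factor(c("1", "0", "1", "1", "1", "0"))
)

# Keep purchases only; compare against the factor label "1", not the integer code.
purchases <- data[data$Class == "1", ]

# Parse to POSIXct; the trailing "Z" is ignored by the format, so UTC is set explicitly.
ts <- as.POSIXct(as.character(purchases$Timestamp),
                 format = "%Y-%m-%dT%H:%M:%OS", tz = "UTC")

per_day  <- table(format(ts, "%Y-%m-%d"))  # purchases per calendar day
per_week <- table(format(ts, "%G-W%V"))    # purchases per ISO 8601 week
per_hour <- table(format(ts, "%H"))        # purchases per hour of day

# Bar chart of purchases by hour of day.
barplot(per_hour, xlab = "Hour of day (UTC)", ylab = "Purchases")
```

If you prefer ggplot2, as.data.frame(per_day) turns each table into a data frame with columns Var1 and Freq, which can be passed to geom_bar(stat = "identity") as above.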

Hope this helps!

$\endgroup$
  • $\begingroup$ Your code gives this error: Error in grouped_df_impl(data, unname(vars), drop) : Column date is of unsupported class POSIXlt/POSIXt $\endgroup$ Commented Dec 21, 2018 at 14:33
  • $\begingroup$ @mairakhan edited the post - can you please try it with your data? $\endgroup$ Commented Dec 21, 2018 at 14:56
  • $\begingroup$ ok sir ....... :) $\endgroup$ Commented Dec 21, 2018 at 15:37
  • $\begingroup$ I got the error again: Error in grouped_df_impl(data, unname(vars), drop) : Column date is of unsupported class POSIXlt/POSIXt $\endgroup$ Commented Dec 21, 2018 at 15:43
  • $\begingroup$ @mairakhan Are you sure you didn't add the formatted time as a column in your data frame? $\endgroup$ Commented Dec 22, 2018 at 9:22
