
Friday, 12 Aban 1396 (3 November 2017)

r - Optimising an mapply'd function over a large data frame




I have written the following function for my work:



monthsCounter <- function(date1, date2) {
  if (date2 < date1) {
    warning("Can't calculate result if second date is older than first date")
  } else {
    # Year and month components of both dates
    date1_Y <- as.numeric(format(date1, '%Y'))
    date2_Y <- as.numeric(format(date2, '%Y'))
    date1_M <- as.numeric(format(date1, '%m'))
    date2_M <- as.numeric(format(date2, '%m'))
    if (date2_Y == date1_Y) {
      # Same year: plain month difference
      date2_M - date1_M
    } else if (date2_M < date1_M) {
      # Later year, earlier month: full years in between plus the partial year
      max(0, date2_Y - date1_Y - 1)*12 + 12 - date1_M + date2_M
    } else {
      max(0, date2_Y - date1_Y)*12 + date2_M - date1_M
    }
  }
}





In a nutshell, it counts the months between two dates, ignoring the day of the month.
When I mapply it over my data frame:



allData$monthsSinceIssue <- mapply(monthsCounter, allData$start_month, allData$Date)


it takes a very long time to calculate.



Question: Do you have any suggestions on how I can optimize my function to make it run faster?
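One direction that might help, offered only as a sketch: whenever date2 is not older than date1, the three branches of monthsCounter reduce to the single expression (year difference) * 12 + (month difference), and that expression works on whole vectors, so the per-row mapply call could be dropped. The name monthsCounterVec is hypothetical, and the code assumes both arguments are Date vectors. Unlike the original, it returns NA (with one warning) for rows where the second date is older than the first.

```r
# Hypothetical vectorised counterpart of monthsCounter (the name is illustrative).
# Assumes date1 and date2 are Date vectors.
monthsCounterVec <- function(date1, date2) {
  y1 <- as.numeric(format(date1, "%Y"))
  y2 <- as.numeric(format(date2, "%Y"))
  m1 <- as.numeric(format(date1, "%m"))
  m2 <- as.numeric(format(date2, "%m"))

  # Whole calendar months between the two dates, ignoring the day of the month
  months <- (y2 - y1) * 12 + (m2 - m1)

  # Rows where the second date is older than the first are set to NA
  bad <- !is.na(date1) & !is.na(date2) & date2 < date1
  if (any(bad)) {
    warning("Can't calculate result if second date is older than first date")
    months[bad] <- NA
  }
  months
}

# Small made-up example
d1 <- as.Date(c("2016-11-30", "2017-01-15", "2017-03-01"))
d2 <- as.Date(c("2017-02-01", "2017-01-31", "2017-12-25"))
monthsCounterVec(d1, d2)  # 3 0 9
```

With the data frame from the question, the call would presumably become allData$monthsSinceIssue <- monthsCounterVec(allData$start_month, allData$Date), provided both columns are of class Date.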




