Calculate the Area under a Curve

lottogame 2020. 12. 24. 23:20

I would like to calculate the area under a curve, that is, to integrate it, without having to define a function as integrate() requires.

My data look like this:

Date          Strike     Volatility
2003-01-01    20         0.2
2003-01-01    30         0.3
2003-01-01    40         0.4
etc.

I used plot(strike, volatility) to look at the volatility smile. Is there a way to integrate this plotted "curve"?

The AUC is approximated pretty easily by summing a lot of trapeziums, each one bounded by x_i, x_{i+1}, y_i and y_{i+1}. Using rollmean from the zoo package, you can do:

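library(zoo)

x <- 1:10
y <- 3*x+25
id <- order(x)  # indices that sort x in increasing order

# width of each interval times the mean of its two y values
AUC <- sum(diff(x[id])*rollmean(y[id],2))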

Make sure you order the x values, or your outcome won't make sense. If you have negative values somewhere along the y axis, you have to decide how exactly you want to define the area under the curve, and adjust accordingly (e.g. using abs()).

Regarding your follow-up: if you don't have a formal function, how would you plot it? If you only have values, the only thing you can approximate is a definite integral. Even if you have the function in R, integrate() can only calculate definite integrals. Plotting the formal function is only possible if you can also define it.
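To see why the ordering matters and what abs() changes, here is a minimal base R sketch. The helper name trap_area is made up for this illustration and does not come from zoo or any other package; it applies the same trapezium formula without rollmean.

```r
# Hypothetical helper (not from any package): trapezium rule on (x, y) pairs.
trap_area <- function(x, y, sort_x = TRUE) {
  if (sort_x) {
    id <- order(x)
    x <- x[id]
    y <- y[id]
  }
  # width of each interval times the mean of its two y values
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Unsorted x: the sum mixes negative and positive interval widths.
x <- c(3, 1, 2)
y <- 2 * x
unsorted_area <- trap_area(x, y, sort_x = FALSE)  # -5
sorted_area   <- trap_area(x, y)                  # 8, the integral of 2x on [1, 3]

# A curve that crosses zero: the signed area cancels, abs() counts both parts.
x2 <- 0:4
y2 <- x2 - 2
signed_area   <- trap_area(x2, y2)       # 0
unsigned_area <- trap_area(x2, abs(y2))  # 4
```

In this example the zero crossing falls exactly on a data point. When the curve crosses zero between two points, taking abs() of the sampled values only approximates the unsigned area.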

Just add the following to your program and you will get the area under the curve:

require(pracma)
AUC = trapz(strike,volatility)

From ?trapz:

This approach matches exactly the approximation for integrating the function using the trapezoidal rule with basepoints x.

Three more options, including one using a spline method and one using Simpson's rule...

# get data
n <- 100
mean <- 50
sd <- 50

x <- seq(20, 80, length=n)
y <- dnorm(x, mean, sd) *100

# using sintegral in Bolstad2
require(Bolstad2)
sintegral(x,y)$int

# using auc in MESS
require(MESS)
auc(x,y, type = 'spline')

# using integrate.xy in sfsmisc
require(sfsmisc)
integrate.xy(x,y)

For smooth curves, the trapezoidal method is generally less accurate than the spline method or Simpson's rule at the same number of points, so MESS::auc (with type = 'spline') or Bolstad2::sintegral (which uses Simpson's rule) should probably be preferred. DIY versions of these (and an additional approach using the quadrature rule) are here: http://www.r-bloggers.com/one-dimensional-integrals/

OK, so I arrive a bit late at the party, but going over the answers, a plain R solution to the problem is missing. Here goes, simple and clean:
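As a rough check of that claim, the sketch below compares the three approaches on the same kind of bell-shaped test data, using only base R. It uses 101 equally spaced points rather than 100, because composite Simpson's rule needs an even number of intervals. The helper names trap_area and simpson_area are made up for this illustration, and the spline version goes through splinefun() and integrate() rather than through MESS::auc.

```r
# Test data similar to the example above, with 101 points (100 intervals).
x <- seq(20, 80, length.out = 101)
y <- dnorm(x, mean = 50, sd = 50) * 100

# Reference value from the normal CDF
exact <- 100 * (pnorm(80, 50, 50) - pnorm(20, 50, 50))

# Hypothetical helper: trapezoidal rule
trap_area <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

# Hypothetical helper: composite Simpson's rule, equally spaced x assumed
simpson_area <- function(x, y) {
  n <- length(x)
  stopifnot(n >= 3, n %% 2 == 1)
  h <- (x[n] - x[1]) / (n - 1)
  # weights 1, 4, 2, 4, ..., 2, 4, 1
  w <- c(1, rep(c(4, 2), length.out = n - 2), 1)
  h / 3 * sum(w * y)
}

# Spline approach: integrate a cubic spline interpolant of the points
spline_area <- integrate(splinefun(x, y), min(x), max(x))$value

# Absolute error of each method against the reference value
err <- abs(c(trapezoid = trap_area(x, y),
             simpson   = simpson_area(x, y),
             spline    = spline_area) - exact)
print(err)
```

On a smooth curve like this one, the trapezoidal error should come out clearly larger than the Simpson error. For noisy or sparsely sampled data the ranking is less clear-cut.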

sum(diff(x) * (head(y,-1)+tail(y,-1)))/2

The solution for OP then reads as:

sum(diff(strike) * (head(volatility,-1)+tail(volatility,-1)))/2

This effectively calculates the area using the trapezoidal method by taking the average of the "left" and "right" y-values.

NB: as @Joris already pointed out you could use abs(y) if that would make more sense.
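Since the data in the question hold one smile per Date, the same one-liner can be applied group by group. A minimal sketch, assuming a data frame with the columns Date, Strike and Volatility: the first three rows come from the question, the rows for 2003-01-02 are invented purely for illustration, and trap_area is a made-up helper name.

```r
smile <- data.frame(
  Date = rep(c("2003-01-01", "2003-01-02"), each = 3),
  Strike = rep(c(20, 30, 40), times = 2),
  Volatility = c(0.2, 0.3, 0.4,     # rows from the question
                 0.25, 0.30, 0.45)  # invented rows, for illustration only
)

# Hypothetical helper: plain R trapezoidal rule, sorting by x first
trap_area <- function(x, y) {
  id <- order(x)
  x <- x[id]
  y <- y[id]
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# One area per date: split the rows by Date, then integrate each piece
area_by_date <- sapply(split(smile, smile$Date),
                       function(d) trap_area(d$Strike, d$Volatility))
print(area_by_date)
```

The result is a named numeric vector with one entry per date, which can be merged back onto the data or plotted over time.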

In the pharmacokinetics (PK) world, calculating different types of AUC is a common and fundamental task. There are lots of different AUC calculations for pharmacokinetics, such as

AUC0-t = AUC from zero to time t
AUC0-last = AUC from zero to the last time point (may be same as above)
AUC0-inf = AUC from zero to time infinity
AUCint = AUC over a time interval
AUCall = AUC over the whole time period for which data exists

One of the best packages which does these calculations is the relatively new package PKNCA from the folks at Pfizer. Check it out.
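To make the list above concrete, here is a minimal base R sketch of three of these quantities using the linear trapezoidal rule. It does not use PKNCA and does not reproduce its API. The concentration values are synthetic (a pure exponential decay), and using the last three points to estimate the terminal slope lambda_z is an assumption made for this example. The extrapolation AUC0-inf = AUC0-last + C_last / lambda_z is the usual textbook formula.

```r
# Synthetic data: sampling times (hours) and concentrations, invented here
time_h <- c(0, 1, 2, 4, 8, 12)
conc   <- 10 * exp(-0.2 * time_h)

# Hypothetical helper: linear trapezoidal rule
trap_area <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

# AUC0-last: from zero to the last sampled time point
auc_0_last <- trap_area(time_h, conc)

# AUCint: over an interval whose bounds coincide with sampling times
in_window <- time_h >= 2 & time_h <= 8
auc_2_8 <- trap_area(time_h[in_window], conc[in_window])

# Terminal slope from a log-linear fit on the last three points (assumption)
tail_idx <- (length(time_h) - 2):length(time_h)
fit <- lm(log(conc[tail_idx]) ~ time_h[tail_idx])
lambda_z <- -unname(coef(fit)[2])

# AUC0-inf: add the extrapolated tail C_last / lambda_z
auc_0_inf <- auc_0_last + conc[length(conc)] / lambda_z
```

If the interval bounds do not fall on sampling times, the concentrations at the bounds have to be interpolated first, which this sketch leaves out.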

Joris Meys's answer was great, but I struggled to remove NAs from my samples. Here is the little function I wrote to deal with them:

library(zoo) # for the rollmean function

#' Calculate the area under the curve of y ~ x
#'
#' @param y Your y values (measurements?)
#' @param x Your x values (time?)
#' @param start The first x value to include
#' @param stop The last x value to include
#' @param na.stop If TRUE, return NA as soon as any y value is NA
#' @param ex.na.stop If TRUE, return NA when the first or the last y value is NA
#'
#' @examples
#' myX = 1:5
#' myY = c(17, 25, NA, 35, 56)
#' auc(myY, myX)
#' auc(myY, myX, na.stop=TRUE)
#' myY = c(17, 25, 28, 35, NA)
#' auc(myY, myX, ex.na.stop=FALSE)
# Base R indexing (x[1], x[length(x)]) is used for the first and last
# elements, so no extra package is needed for first()/last().
auc = function(y, x, start=x[1], stop=x[length(x)], na.stop=FALSE, ex.na.stop=TRUE){
  if(all(is.na(y))) return(NA)
  # keep only the points between start and stop
  bounds = which(x==start):which(x==stop)
  x = x[bounds]
  y = y[bounds]
  r = which(is.na(y))
  if(length(r)>0){
    if(na.stop) return(NA)
    first.na = is.na(y[1])
    last.na = is.na(y[length(y)])
    if(ex.na.stop && (first.na || last.na)) return(NA)
    if(last.na) warning("Last value is NA, so this AUC is bad and you should feel bad", call. = FALSE)
    if(first.na) warning("First value is NA, so this AUC is bad and you should feel bad", call. = FALSE)
    # drop the NA points; neighbouring points are then joined directly
    x = x[-r]
    y = y[-r]
  }
  id = order(x)
  sum(diff(x[id])*rollmean(y[id],2))
}

I then use it with apply on my data frame: myDF$auc = apply(myDF, MARGIN=1, FUN=auc, x=c(0,5,10,15,20))
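The row-wise call above relies on every column of myDF being a measurement taken at the matching x value. Here is a small sketch of that layout, with an invented data frame and a base R stand-in for the function (the names subj, t0 to t20, times and row_auc are all made up here). Because apply() converts the data frame to a matrix first, a single non-numeric column would turn every value into a character string, so the sketch selects the measurement columns explicitly.

```r
# Invented example: one row per subject, one column per time point
times <- c(0, 5, 10, 15, 20)
myDF <- data.frame(
  subj = c("a", "b"),
  t0 = c(0, 1), t5 = c(2, 1), t10 = c(4, 1), t15 = c(2, 1), t20 = c(0, 1)
)

# Hypothetical base R stand-in: trapezoidal rule that skips NA measurements
row_auc <- function(y, x) {
  keep <- !is.na(y)
  x <- x[keep]
  y <- y[keep]
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Pass only the numeric measurement columns to apply()
meas_cols <- c("t0", "t5", "t10", "t15", "t20")
myDF$auc <- apply(myDF[, meas_cols], MARGIN = 1, FUN = row_auc, x = times)
print(myDF)
```

With these made-up values the first subject gets an area of 40 and the second, a flat line at 1 over 20 time units, gets 20.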

Hope it can help noobs like me :-)

EDIT : added bounds

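If the curve is the ROC curve of a classifier, you can use the ROCR package, where the following lines will give you the AUC:

library(ROCR)
# prediction() takes the predicted scores first, then the true labels
pred <- prediction(classifier.labels, actual.labs)
performance(pred, 'auc')@y.values[[1]]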
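A ROC curve is itself a set of (false positive rate, true positive rate) points, so the trapezium sum from the earlier answers applies to it as well. A base R sketch with invented scores and labels, assuming there are no tied scores. The name roc_auc is made up, and the sketch does not call ROCR.

```r
# Invented classifier scores and true labels (1 = positive, 0 = negative)
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4)
labels <- c(1, 1, 0, 1, 0, 0)

# Hypothetical helper: area under the ROC curve by the trapezium rule.
# Assumes no ties among the scores.
roc_auc <- function(scores, labels) {
  ord <- order(scores, decreasing = TRUE)
  labels <- labels[ord]
  # cumulative rates as the decision threshold moves down through the scores
  tpr <- c(0, cumsum(labels == 1) / sum(labels == 1))
  fpr <- c(0, cumsum(labels == 0) / sum(labels == 0))
  sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
}

auc_value <- roc_auc(scores, labels)  # 8/9
```

The value 8/9 matches the fraction of (positive, negative) pairs in which the positive example has the higher score, which is 8 out of 9 pairs here.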

ReferenceURL : https://stackoverflow.com/questions/4954507/calculate-the-area-under-a-curve
