R tips and tricks – Modified Pairs Plot

I have a list of my favorite handy R tricks that I keep in my back pocket (well not literally in my pocket, I’m not that much of a geek) – here’s one of them. Need a fancier pairs plot of continuous variables that shows the actual correlation coefficients and slopes?


From this:
Ugly pairs plot

To this!:
Pretty pairs plot

Enter the functions below, then use pairs.panels(data) on your dataframe of continuous variables.


panel.cor.scale <- function(x, y, digits=2, prefix="", cex.cor)
{
usr <- par("usr"); on.exit(par(usr))
par(usr = c(0, 1, 0, 1))
r = (cor(x, y,use="pairwise"))
txt <- format(c(r, 0.123456789), digits=digits)[1]
txt <- paste(prefix, txt, sep="")
if(missing(cex.cor)) cex <- 0.8/strwidth(txt)
text(0.5, 0.5, txt, cex = cex * abs(r))
}


panel.cor <- function(x, y, digits=2, prefix="", cex.cor)
{
usr <- par("usr"); on.exit(par(usr))
par(usr = c(0, 1, 0, 1))
r = (cor(x, y,use="pairwise"))
txt <- format(c(r, 0.123456789), digits=digits)[1]
txt <- paste(prefix, txt, sep="")
if(missing(cex.cor)) cex <- 0.8/strwidth(txt)
text(0.5, 0.5, txt, cex = cex )
}


panel.hist <- function(x, ...)
{
usr <- par("usr"); on.exit(par(usr))
par(usr = c(usr[1:2], 0, 1.5) )
h <- hist(x, plot = FALSE)
breaks <- h$breaks; nB <- length(breaks)
y <- h$counts; y <- y/max(y)
rect(breaks[-nB], 0, breaks[-1], y, col="cyan", ...)
}


pairs.panels <- function (x,y,smooth=TRUE,scale=FALSE)
{if (smooth ){
if (scale) {
pairs(x,diag.panel=panel.hist,upper.panel=panel.cor.scale,lower.panel=panel.smooth)
}
else {pairs(x,diag.panel=panel.hist,upper.panel=panel.cor,lower.panel=panel.smooth)
} #else {pairs(x,diag.panel=panel.hist,upper.panel=panel.cor,lower.panel=panel.smooth)
}
else #smooth is not true
{ if (scale) {pairs(x,diag.panel=panel.hist,upper.panel=panel.cor.scale)
} else {pairs(x,diag.panel=panel.hist,upper.panel=panel.cor) }
} #end of else (smooth)
} #end of function

8 comments on “R tips and tricks – Modified Pairs Plot

  1. This is a fantastic piece of code. I’m just curious what the nature of the line plotted over the scatterplots is? It’s clearly not linear.

    Thanks for the trick!

  2. Great question, it’s just a default of Loess smoothing that the built-in R function pairs() uses.

  3. This is a great visualization. Thanks for posting. I was wondering if there is any way to add the significance of the correlation to this graph (perhaps the color of the correlation corresponds to the significance level)? Thanks!

  4. Check out the PerformanceAnalytics package.

    library(PerformanceAnalytics)
    chart.Correlation(iris[-5], bg = iris$Species, pch = 21)

  5. Pingback: R tips and tricks – Modified Pairs Plot |...

  6. If you want straight lm lines instead of curved smooth lines on the bottom plots you can do this:

    ## Hello R, I want straight linear regression lines on my scatterplot called panel.lm

    “panel.lm” <-
    function (x, y, pch = par("pch"),
    col.lm = "red", …)
    { ymin <- min(y)
    ymax <- max(y)
    xmin <- min(x)
    xmax <- max(x)
    ylim <- c(min(ymin,xmin),max(ymax,xmax))
    xlim <- ylim
    points(x, y, pch = pch,ylim = ylim, xlim= xlim,…)
    ok <- is.finite(x) & is.finite(y)
    if (any(ok))
    abline(lm(y[ok]~ x[ok]),
    col = col.lm, …)
    }

    ## Next, R please make me a series of pairs plots comparing all the varables, and list correlation vaues:

    panel.cor.scale <- function(x, y, digits=2, prefix="", cex.cor)
    {
    usr <- par("usr"); on.exit(par(usr))
    par(usr = c(0, 1, 0, 1))
    r = (cor(x, y,use="pairwise"))
    txt <- format(c(r, 0.123456789), digits=digits)[1]
    txt <- paste(prefix, txt, sep="")
    if(missing(cex.cor)) cex <- 0.8/strwidth(txt)
    text(0.5, 0.5, txt, cex = cex * abs(r))
    }

    panel.cor <- function(x, y, digits=2, prefix="", cex.cor)
    {
    usr <- par("usr"); on.exit(par(usr))
    par(usr = c(0, 1, 0, 1))
    r = (cor(x, y,use="pairwise"))
    txt <- format(c(r, 0.123456789), digits=digits)[1]
    txt <- paste(prefix, txt, sep="")
    if(missing(cex.cor)) cex <- 0.8/strwidth(txt)
    text(0.5, 0.5, txt, cex = cex )
    }

    panel.hist <- function(x, …)
    {
    usr <- par("usr"); on.exit(par(usr))
    par(usr = c(usr[1:2], 0, 1.5) )
    h <- hist(x, plot = FALSE)
    breaks <- h$breaks; nB <- length(breaks)
    y <- h$counts; y <- y/max(y)
    rect(breaks[-nB], 0, breaks[-1], y, col="cyan", …)
    }

    pairs.panels <- function (x,y,smooth=TRUE,scale=FALSE)
    {if (smooth ){
    if (scale) {
    pairs(x,diag.panel=panel.hist,upper.panel=panel.cor.scale,lower.panel=panel.lm)
    }
    else {pairs(x,diag.panel=panel.hist,upper.panel=panel.cor,lower.panel=panel.lm)
    } #else {pairs(x,diag.panel=panel.hist,upper.panel=panel.cor,lower.panel=panel.lm)
    }
    else #smooth is not true
    { if (scale) {pairs(x,diag.panel=panel.hist,upper.panel=panel.cor.scale)
    } else {pairs(x,diag.panel=panel.hist,upper.panel=panel.cor) }
    } #end of else (smooth)
    } #end of function

    ## R, show me my pairs panel plots:
    pairs.panels(data)

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>