Many lines that are added to plots are just straight lines that span the plot.
abline() is a good choice for this type of line. Say that we wished
to add a vertical line at 2.5 on the x axis to the plot to divide the women who
completed high school from those who didn't.
> abline(v=2.5,col=3,lty=3)
This produces a green, dotted, vertical line across the plot. To mark a value on the other axis, say age 33, use the h argument instead:
> abline(h=33,col=4,lty=2)
This draws a blue, dashed, horizontal line at 33 on the y axis. We can also display regression lines.
> abline(lm(infert$age~as.numeric(infert$educ)),col=2,lty=1)
This draws a solid, red line illustrating the regression of age on education.
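As a sketch of how the three calls fit together, the block below draws a plain scatterplot of age against education level and then adds each line. The plot() call and the names educ_num and fit are assumptions made for illustration; the plot discussed above may have been drawn differently. The regression line is passed to abline() as an explicit intercept (a) and slope (b), which gives the same line as passing the fitted lm object.

```r
data(infert)

# education is a factor with three levels; as.numeric() maps them to 1, 2, 3
educ_num <- as.numeric(infert$education)

# assumed base plot, for illustration only
plot(educ_num, infert$age, xlab = "Education level", ylab = "Age")

abline(v = 2.5, col = 3, lty = 3)  # green, dotted, vertical
abline(h = 33, col = 4, lty = 2)   # blue, dashed, horizontal

# regression of age on education level
fit <- lm(infert$age ~ educ_num)
abline(a = coef(fit)[[1]], b = coef(fit)[[2]], col = 2, lty = 1)  # red, solid
```

Extracting the coefficients first is handy when the intercept or slope also needs to be reported, for example in a legend.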
If the line you want is described by a function of x, use curve() to
add it to your plot. Using the airquality data, plot a histogram of
airquality$Ozone. Suppose you think that the probability of a given
concentration of ozone on any day is described by two linear functions, one
valid for the range 0 to 120, and the other for 120 and up.
> data(airquality)
> airhist<-hist(airquality$Ozone)
> curve(40-(x/3.3+1),from=0,to=120,add=TRUE)
> curve(6.6-(x/30),from=120,to=180,add=TRUE)
This might impress an uncritical audience, but it is completely fabricated. When you are at a loss for what the underlying distribution might be, it may be better to just smooth the data and plot the result.
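curve() also accepts a function by name, so the two made-up pieces can be drawn with a single call. The sketch below is one way to do that; the function name ozone_guess is hypothetical, and the coefficients are the same invented ones used above.

```r
data(airquality)
airhist <- hist(airquality$Ozone)

# hypothetical helper: the two fabricated linear pieces, switched at x = 120
ozone_guess <- function(x) {
  ifelse(x < 120, 40 - (x / 3.3 + 1), 6.6 - (x / 30))
}

# add = TRUE draws on the existing histogram instead of starting a new plot
curve(ozone_guess, from = 0, to = 180, add = TRUE)
```

Because ifelse() is vectorised, the function works on the whole vector of x values that curve() passes to it.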
> library(plotrix)  # assumed source of rescale()
> airhist<-hist(airquality$Ozone)
> airspline<-spline(airhist$counts)
> lines(rescale(airspline$x,range(airhist$mids)),airspline$y)
There are a number of smoothing algorithms available in R, including
spline(). Producing smoothed curves for hist() or barplot() is a common
problem, partly because the horizontal axis on these plots is not scaled in
an obvious way. As you can see, hist() returns a list that contains the
midpoints of the bars (in $mids), while barplot() returns the bar midpoints
directly. The function rescale() does a simple linear transformation of one
vector of values into a new scale. In this case, the scaling was by about a
factor of 20.
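To make the rescaling step concrete, here is a minimal base R sketch of the same kind of linear transformation. The helper linear_rescale() is hypothetical and written only for illustration; it is not the rescale() function used above, though it is intended to behave the same way on this input.

```r
# hypothetical helper: map x linearly so that its range becomes newrange
linear_rescale <- function(x, newrange) {
  xr <- range(x, na.rm = TRUE)
  newrange[1] + (x - xr[1]) * diff(newrange) / diff(xr)
}

data(airquality)
airhist <- hist(airquality$Ozone)
airspline <- spline(airhist$counts)  # x values are bar indices 1, 2, ...

# stretch the bar indices onto the scale of the bar midpoints
newx <- linear_rescale(airspline$x, range(airhist$mids))
lines(newx, airspline$y)

# ratio of the new span to the old span, i.e. the scaling factor
scale_factor <- diff(range(newx)) / diff(range(airspline$x))
scale_factor
```

The value of scale_factor is the factor referred to in the text: the width of the midpoint range divided by the width of the index range.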
For more information, see An Introduction to R: Examining the distribution of a set of data.