Module 9: Lecture Notes for Math 170

Some images on this page have been generated via AsciiMathML.js.
For more information see: www.chapman.edu/~jipsen/asciimath.html.

  1. This topic is included in the course because statistics calculations show up surprisingly often in programming applications. At the same time, it is not common to find built-in functions that produce these values.
  2. Finding the mode involves finding the item, or items, in a list that occur the most often.
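The notes do not specify a programming language; here is a minimal Python sketch of the mode idea, counting occurrences and keeping every item tied for the highest count (the function name `modes` is illustrative, not from the notes):

```python
from collections import Counter

def modes(data):
    """Return every item that occurs with the highest frequency."""
    counts = Counter(data)                      # tally occurrences of each item
    top = max(counts.values())                  # the highest occurrence count
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 2, 3, 3, 4]))                # → [2, 3]  (a bimodal list)
```

Returning a list, rather than a single value, handles the case where several items tie for the most occurrences.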
  3. Finding the median, and indeed, the first and third quartile points, and percentiles, requires that we sort the list of data. [Three notes on this. First, the strange definition for the median of an even number of items requires a bit of extra programming. Second, there is not a universally accepted rule for determining the first and third quartile points. Third, there is not a universally accepted rule for determining percentiles. ]
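The even-count median rule, and one of the several possible quartile rules, can be sketched in Python as follows. The quartile convention used here, taking the median of the lower and upper halves (excluding the overall median when the count is odd), is just one choice among the competing rules the notes mention:

```python
def median(data):
    s = sorted(data)                         # the list must be sorted first
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]                        # odd count: the middle value
    return (s[mid - 1] + s[mid]) / 2         # even count: average the two middle values

def quartiles(data):
    s = sorted(data)
    n = len(s)
    half = n // 2
    lower = s[:half]                         # values below the median
    upper = s[half + (n % 2):]               # skip the median itself when n is odd
    return median(lower), median(s), median(upper)

print(quartiles([1, 2, 3, 4, 5, 6, 7]))      # → (2, 4, 6)
```

A different quartile rule (for example, one based on interpolating positions) can give different first- and third-quartile values for the same list, which is exactly the point the notes make.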
  4. Finding the mean involves adding up all the values in the list.
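As a minimal sketch of this, the mean is just a running sum divided by the count:

```python
def mean(data):
    total = 0
    for x in data:
        total += x                           # add up all the values in the list
    return total / len(data)

print(mean([2, 4, 6, 8]))                    # → 5.0
```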
  5. The standard deviation (which is the square root of the variance) gives a feeling for the spread of the data away from the mean.
    1. Chebyshev's Inequality: at least `(1-1/(K^2))` of the values in a data list will be within `K` standard deviations of the mean (for any `K>1`).
    2. Normal Distribution (the empirical rule): about 68% of the values fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3.
    3. Definition and full name: root mean squared deviation from the mean.
      for a population this is: `sigma=sqrt((Sigma_(i=1)^N (x_i-mu)^2)/N)`
      for a sample this is `s_x=sqrt((Sigma_(i=1)^n (x_i-barx)^2)/(n-1))`
    4. For each, there is a more computationally friendly equivalent formula:
      for a population: `sigma=sqrt((Sigma(x^2) - (Sigmax)^2/N)/N)`
      for a sample: `s_x=sqrt((Sigma(x^2) - (Sigmax)^2/n)/(n-1))`
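As a sketch of why the second form is "computationally friendly", here are both sample formulas in Python (the function names are illustrative). The definitional form needs two passes over the data; the computational form only needs the running totals `Sigma(x)` and `Sigma(x^2)`:

```python
from math import sqrt

def stdev_definitional(data):
    """Sample standard deviation from the defining formula (two passes)."""
    n = len(data)
    xbar = sum(data) / n                              # first pass: the mean
    return sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))  # second pass

def stdev_computational(data):
    """Sample standard deviation from the running-total formula (one pass)."""
    n = len(data)
    sx = sum(data)                                    # Sigma(x)
    sx2 = sum(x * x for x in data)                    # Sigma(x^2)
    return sqrt((sx2 - sx ** 2 / n) / (n - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(stdev_definitional(data))                       # both print the same value
print(stdev_computational(data))
```

The population versions are the same except that the final division uses `N` instead of `n-1`. (One caution worth knowing: the one-pass formula can lose precision when the values are large and close together, because it subtracts two large, nearly equal totals.)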
  6. A linear regression is the equation, generally in the form `y=ax+b`, for a straight line that "best fits" the data made up of coordinate pairs of values `(x_i,y_i)`.
    1. Correlation Coefficient: a computed value that gives a measure of the "goodness of fit" between the linear regression and the data. Values for the correlation coefficient are always between `1` and `-1`. Values close to either `1` or `-1` indicate good fits; values closer to `0` indicate a weaker fit.
    2. Observed values
    3. Expected values (those predicted by the regression equation)
    4. Residuals: the (Observed - Expected) values
    5. Interpolation: using the regression equation to predict a `y` value for an `x` within the range of the observed `x_i` values.
    6. Extrapolation: using the regression equation to predict a `y` value for an `x` outside that range; this is generally less reliable.
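The items above can be pulled together in one Python sketch. These are the standard least-squares formulas for the slope `a`, intercept `b`, and correlation coefficient `r`, written in terms of the same kinds of running totals used for the standard deviation; the function name and the sample data are illustrative:

```python
from math import sqrt

def linear_regression(xs, ys):
    """Least-squares fit y = a*x + b, plus the correlation coefficient r."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)                        # Sigma(x), Sigma(y)
    sxx = sum(x * x for x in xs)                     # Sigma(x^2)
    syy = sum(y * y for y in ys)                     # Sigma(y^2)
    sxy = sum(x * y for x, y in zip(xs, ys))         # Sigma(x*y)
    a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)    # slope
    b = (sy - a * sx) / n                            # intercept
    r = (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return a, b, r

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.0, 9.9]
a, b, r = linear_regression(xs, ys)
expected = [a * x + b for x in xs]                   # values predicted by the line
residuals = [y - e for y, e in zip(ys, expected)]    # observed - expected
print(round(a, 3), round(b, 3))                      # slope and intercept
print(round(r, 4))                                   # close to 1: a good fit
```

Evaluating `a * x + b` for an `x` between 1 and 5 is interpolation; doing so for, say, `x = 20` is extrapolation, and the small residuals inside the observed range say nothing about how good that prediction will be.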


©Roger M. Palay
Saline, MI 48176
November, 2013