Links to R scripts

Return to Main page

Here is a list, in alphabetic order, of the scripts developed for the Math 160 class. The items are ordered by file name, which may be different from the function name. The table gives the script name, a link to the file that holds the script, and a brief description of the script. Almost every description includes an example giving R statements that demonstrate the use of the function. For the most part there is one function per file. However, there are instances where there is more than one function in a file.

Important note: The source( ) statements included in the detailed descriptions below assume that the sample script is being run from a directory (folder) that is a sub-directory (a "child") of the directory holding the scripts being explained. That is why each file is loaded with a path of the form "../name.R".

Script Name    File Name    Brief Description of what the script does
     
papfelton()
qapfelton()
apfelton.R Cumulative probability and quantile functions for the Apfelton Distribution (see web pages)
Examples:
source("../apfelton.R")  # load the function
papfelton(2.34) # returns area to the left of 2.34
papfelton( 1.74, lower.tail=FALSE) # returns area to
#                                  the right of 1.74
qapfelton( 0.783 ) # returns x value that has
#                   0.783 area to its left
qapfelton( 0.145, lower.tail=FALSE ) # returns x value
#                     that has 0.145 area to its right
assess_normality() assess_normality.R Produces a plot to help assess if a set of values is normally distributed.
Example:
source("../gnrnd4.R")   # be sure gnrnd4() is loaded
gnrnd4(1394565804,8300542)  # generate some values
source("../assess_normality.R") # be sure 
#          assess_normality() is loaded
assess_normality( L1 ) # do the work, create the plot
pblumenkopf()
qblumenkopf()
blumenkopf.R Cumulative probability and quantile functions for the Blumenkopf Distribution (see web pages)
Examples:
source("../blumenkopf.R")  # load the function
pblumenkopf(2.34) # returns area to the left of 2.34
pblumenkopf( 1.74, lower.tail=FALSE) # returns area to
#                                  the right of 1.74
qblumenkopf( 0.783 ) # returns x value that has
#                   0.783 area to its left
qblumenkopf( 0.145, lower.tail=FALSE ) # returns x value
#                     that has 0.145 area to its right
ci_2known() ci_2known.R Finds the confidence interval for the difference of two population means where we know both population standard deviations, sigma_1 and sigma_2, and we have samples of size n_1 and n_2 with sample means xbar_1 and xbar_2.
Example:
# set up the problem
# know the population standard deviations
sigma_1 <- 11.3  # for 1st population
sigma_2 <- 13.8  # for 2nd population
# set values for the two samples
n_1 <- 57        # size of 1st sample
xbar_1 <- 34.76  # mean of 1st sample
n_2 <- 41        # size of 2nd sample
xbar_2 <- 37.41  # mean of 2nd sample
source("../ci_2known.R")  # load function
# get a 97.5% confidence interval for mean_1 - mean_2
ci_2known( sigma_1, n_1, xbar_1,
           sigma_2, n_2, xbar_2, 0.975)
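The following is a minimal sketch of the standard computation behind such an interval, using only base R. It is not the code in ci_2known.R, and the variable names point_est, std_err, z_star, and moe are assumptions made for this illustration.

```r
# Sketch only: the usual z-based interval for mean_1 - mean_2
sigma_1 <- 11.3; n_1 <- 57; xbar_1 <- 34.76
sigma_2 <- 13.8; n_2 <- 41; xbar_2 <- 37.41
c_level <- 0.975

point_est <- xbar_1 - xbar_2                        # difference of sample means
std_err <- sqrt(sigma_1^2 / n_1 + sigma_2^2 / n_2)  # standard error of the difference
z_star <- qnorm(1 - (1 - c_level) / 2)              # z value for the confidence level
moe <- z_star * std_err                             # margin of error
ci_low <- point_est - moe
ci_high <- point_est + moe
c(low = ci_low, high = ci_high)
```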
ci_2popproportion() ci_2popproport.R Finds the confidence interval for the difference of two population proportions where we have samples of size n_1 and n_2 with sample successes called x_1 and x_2.
Example:
#            set up problem
n_1 <- 76  # size of first sample
           # number of items with characteristic in
x_1 <- 37  # first sample
n_2 <- 93  # size of second sample
           # number of items with characteristic in
x_2 <- 44  # second sample
c_level <- 0.95 # set confidence level to 0.95
#            be sure function is in environment
source("../ci_2popproport.R")
#            run the function
ci_2popproportion( n_1, x_1, n_2, x_2, c_level )
ci_2popvar() ci_2popvar.R Finds the confidence interval for the ratio of two population variances where we have samples of size n_1 and n_2 with sample standard deviations called s_1 and s_2.
Example:
#           set up problem
n_1 <- 34    # number of items in numerator sample
s_1 <- 12.4  # standard deviation of numerator sample
n_2 <- 52    # number of items in denominator sample
s_2 <- 11.3  # standard deviation of denominator sample
c_level <- 0.90 # set the confidence level
#            be sure function is in environment
source("../ci_2popvar.R")
#            run the function
ci_2popvar( n_1, s_1, n_2, s_2, c_level)
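As a rough cross-check, the textbook F-based interval for the ratio of two variances can be sketched in base R. This is an illustration under the assumption of normal populations, not the code in ci_2popvar.R; the names alpha and var_ratio are introduced here.

```r
# Sketch only: F-based interval for the ratio of population variances
n_1 <- 34; s_1 <- 12.4    # numerator sample
n_2 <- 52; s_2 <- 11.3    # denominator sample
c_level <- 0.90

alpha <- 1 - c_level
var_ratio <- s_1^2 / s_2^2    # ratio of the sample variances
ci_low <- var_ratio / qf(1 - alpha / 2, n_1 - 1, n_2 - 1)
ci_high <- var_ratio / qf(alpha / 2, n_1 - 1, n_2 - 1)
c(low = ci_low, high = ci_high)
```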
ci_2unknown() ci_2unknown.R Finds the confidence interval for the difference of two population means where we do not know the population standard deviations and we have samples of size n_1 and n_2 with sample means xbar_1 and xbar_2, and sample standard deviations s_1 and s_2. Results are given for both the simple degrees of freedom and the computed degrees of freedom.
Example:
# set up the problem
# do not know the population standard deviations
# set values for the two samples
n_1 <- 57        # size of 1st sample
xbar_1 <- 34.76  # mean of 1st sample
s_1 <- 11.3      # standard deviation of 1st sample
n_2 <- 41        # size of 2nd sample
xbar_2 <- 37.41  # mean of 2nd sample
s_2 <- 13.8      # standard deviation of 2nd sample
source("../ci_2unknown.R")  # load function
# get a 97.5% confidence interval for mean_1 - mean_2
ci_2unknown( s_1, n_1, xbar_1,
             s_2, n_2, xbar_2, 0.975)
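The two kinds of degrees of freedom can be illustrated with a short base R sketch. It assumes that "simple" means one less than the smaller sample size and that "computed" means the Welch-Satterthwaite value; the script itself may define them differently, and all names below other than the inputs are hypothetical.

```r
# Sketch only: simple versus computed degrees of freedom
n_1 <- 57; xbar_1 <- 34.76; s_1 <- 11.3
n_2 <- 41; xbar_2 <- 37.41; s_2 <- 13.8
c_level <- 0.975

v_1 <- s_1^2 / n_1
v_2 <- s_2^2 / n_2
std_err <- sqrt(v_1 + v_2)

df_simple <- min(n_1, n_2) - 1    # assumed "simple" rule
df_computed <- (v_1 + v_2)^2 /
  (v_1^2 / (n_1 - 1) + v_2^2 / (n_2 - 1))    # Welch-Satterthwaite

tail_area <- (1 - c_level) / 2
moe_simple <- qt(1 - tail_area, df_simple) * std_err
moe_computed <- qt(1 - tail_area, df_computed) * std_err

(xbar_1 - xbar_2) + c(-1, 1) * moe_simple      # interval with simple df
(xbar_1 - xbar_2) + c(-1, 1) * moe_computed    # interval with computed df
```

Under these assumptions the simple rule never gives more degrees of freedom than the computed one, so its interval is at least as wide.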
ci_known() ci_known.R Finds the confidence interval for a population mean where we know the population standard deviation sigma and we have a sample of size n_1 with a sample mean xbar_1.
Example:
#           set up problem
sigma <- 4.56   # known standard dev. of population
n_1 <- 34       # sample size
xbar_1 <- 28.4  # sample mean
c_level <-0.85  # confidence level set to 0.85
#                 be sure that the function is loaded
source("../ci_known.R")
#               run the function
ci_known( sigma, n_1, xbar_1, c_level )
ci_prop() ci_prop.R Computes a confidence interval for the proportion given the sample size, the number of items with the characteristic, and the confidence level.
Example:
#             set up problem
n_1 <- 65       # size of sample
                # number of items in sample with 
x_1 <- 24       #    the desired characteristic
c_level <- 0.95 # set confidence level to be 0.95
#             be sure function is loaded
source("../ci_prop.R")
#             run the function 
ci_prop( n_1, x_1, c_level )
ci_stddev() ci_stddev.R Finds the confidence interval for a population standard deviation based on a sample size, sample standard deviation, and confidence level desired.
Example:
#            set up problem
n_1 <- 26      # size of the sample
s_1 <- 5.34    # set standard dev. of sample
c_level <- 0.925  # set confidence level to 0.925
#              make sure function is loaded
source("../ci_stddev.R")
#              run function
ci_stddev( n_1, s_1, c_level )
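For comparison, the usual chi-square interval for a population standard deviation can be sketched in base R. This assumes a normal population and is not taken from ci_stddev.R; alpha and df are names introduced for the illustration.

```r
# Sketch only: chi-square interval for a population standard deviation
n_1 <- 26
s_1 <- 5.34
c_level <- 0.925

alpha <- 1 - c_level
df <- n_1 - 1
ci_low <- sqrt(df * s_1^2 / qchisq(1 - alpha / 2, df))
ci_high <- sqrt(df * s_1^2 / qchisq(alpha / 2, df))
c(low = ci_low, high = ci_high)
```

Note that the interval is not symmetric about s_1, because the chi-square distribution is skewed.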
ci_unknown() ci_unknown.R Finds the confidence interval for a population mean where we do not know the population standard deviation but we do have a sample of size n_1 with a sample mean xbar_1 and a sample standard deviation s_1.
Example:
#           set up problem
s_1 <- 4.56     # standard dev. of sample
n_1 <- 34       # sample size
xbar_1 <- 28.4  # sample mean
c_level <-0.85  # confidence level set to 0.85
#                 be sure that the function is loaded
source("../ci_unknown.R")
#               run the function
ci_unknown( s_1, n_1, xbar_1, c_level )
collate3() collate3.R This script produces a frequency table for non-discrete data. The script is often run twice, first with just the data specified and second, based upon the output of the first run, with the data, the low value of the first "bucket" and the bucket width. The name is left over from a program developed on the TI83/84 calculators to do the same thing.
Example:
#           set up problem
#           first generate some values
source("../gnrnd4.R")
gnrnd4(2768424504,142513276)
#           look at the data
L1
#           be sure the function is loaded
source("../collate3.R")
#            run it the first time
collate3( L1 )
#            the output of that will give us
#           a good idea of the low value for the
#           first "bucket", and the width of
#           the buckets.  Here we will use 105
#           and 5 respectively
low_val <- 105
bucket_width <- 5
collate3( L1, use_low=low_val, 
          use_width=bucket_width)
#
#         by default these are closed on the
#         right, we will run again, closed on
#         the left and we will store and then 
#         view the result
holder <- collate3( L1, use_low=low_val, 
          use_width=bucket_width,
          right=FALSE)
View( holder )  #  note the capital V
num_comb()
nCr()
num_perm()
nPr()
combinations.R Functions to do combinations of n things taken r at a time. (Note this includes the functions for permutations as well.)
Example:
#        make sure the functions are loaded
source("../combinations.R")
#        that actually loads four functions
#        num_comb(), nCr(), num_perm(), and
#        nPr()
#      run each for 8 things taken 3 at a time
num_comb( 8, 3 )
nCr( 8, 3 )
num_perm( 8, 3 )
nPr( 8, 3 )
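The same counts can be checked with base R alone. This sketch uses the built-in choose() and factorial() functions and does not depend on combinations.R; the variable names are chosen for the illustration.

```r
# Sketch only: combinations and permutations with base R
n <- 8
r <- 3
n_comb <- choose(n, r)                       # 8 things taken 3 at a time, order ignored
n_perm <- factorial(n) / factorial(n - r)    # same, but order matters
n_comb    # 56
n_perm    # 336
```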
crosstab() crosstab.R Function to provide not only the cross tabulation for a matrix but also to provide the expected values, the row, column, and total percents, and the intermediate steps to perform a χ² test for independence on the original matrix.
Example:
#       set up the problem
#       first we will generate a crosstab matrix
source("../gnrnd4.R")
gnrnd4( key1=783566808, key2=8756454753 )
#       then look at it
matrix_A
#       now be sure the function is loaded
source("../crosstab.R")
#       run the function
crosstab( matrix_A )
#       this produces a little console output and
#       numerous tabs in the editor pane
dot_plot() dot_plot.R Produces a dot plot of the data.
Example:
#       set up the problem
#       first we will generate a list of values
source("../gnrnd4.R")
gnrnd4( 507093402, 1200148 )
#       then look at it
L1
#       now be sure the function is loaded
source("../dot_plot.R")
#       run the function
dot_plot( L1 )
#       this produces a dot plot of the values
find_percentile() find_percentile.R This function, based on a given list of values and a goal value, computes the percentile for that goal value in the given list.
Example:
#       set up the problem
#   Assuming L1 holds a list of numeric values 
goal_val <- 348   # we want to know the percentile for 
                  # 348 in that list
#       now be sure the function is loaded
source("../find_percentile.R")
#       run the function
find_percentile( L1, goal_val )
find_samp_size() findsampsize.R For confidence intervals for the population mean, this function finds the required sample size for a desired margin of error value given the population standard deviation and the confidence level.
Example:
#       set up the problem
sigma <- 13.24    # the population std. dev. 
c_level <- 0.95   # set confidence level at 0.95
m_o_e <- 2.75     # the desired margin of error
#       now be sure the function is loaded
source("../findsampsize.R")
#       run the function
find_samp_size( sigma, c_level, m_o_e )
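The standard formula behind this kind of calculation is n = (z * sigma / E)^2, rounded up to a whole number. A minimal base R sketch, which is not the code in findsampsize.R and whose names z_star and n_needed are assumptions, looks like this:

```r
# Sketch only: required sample size for a desired margin of error
sigma <- 13.24
c_level <- 0.95
m_o_e <- 2.75

z_star <- qnorm(1 - (1 - c_level) / 2)           # z value for the confidence level
n_needed <- ceiling((z_star * sigma / m_o_e)^2)  # always round up
n_needed    # 90 for these values
```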
get_from_table() get_from_table.R This script takes the low and high values for the first interval of an interval-based frequency table, along with a list of the frequencies for the intervals, and produces the sum of the frequencies, the approximate mean of the data, and the approximate standard deviation of the data. These are computed by using the midpoint of each interval repeated the number of times given by that interval's frequency. The script also produces a new variable from_table_x that holds all of those midpoint values.
Example:
#           be sure the function is loaded
source("../get_from_table.R")
#           run the program. We need to give the
#           program the low and high values of
#           the first interval of the frequency table,
#           followed by the list of interval frequencies.
#           The command below assumes that the first
#           interval is from 19 to 36 and that there are
#           six intervals with the frequencies 23, 28, 15, 
#           32, 19, and 27
get_from_table( 19, 36, c(23, 28, 15, 32, 19, 27) )
#      along with the output of the function we get
#      a new variable that holds the midpoints repeated
#      the number of times given by the frequencies
from_table_x          #  then look at the list of midpoints
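The midpoint idea can be sketched in base R as follows. The sketch assumes that every interval has the same width as the first one (36 - 19 = 17) and that the intervals follow one another without gaps. Those assumptions, and the names width, midpoints, and values, are not taken from get_from_table.R.

```r
# Sketch only: approximate mean and standard deviation from a frequency table
low <- 19
high <- 36
freqs <- c(23, 28, 15, 32, 19, 27)

width <- high - low                                              # assumed common width
midpoints <- (low + high) / 2 + width * (seq_along(freqs) - 1)   # 27.5, 44.5, ...
values <- rep(midpoints, times = freqs)    # each midpoint repeated by its frequency

sum(freqs)      # total number of items
mean(values)    # approximate mean
sd(values)      # approximate sample standard deviation
```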
gnrnd4() gnrnd4.R This script generates data in the variable L1, and possibly in L2, depending upon two, or three, keys that are supplied as arguments to the function. It is typical to have these arguments supplied on test questions so that you can create tables of data that are identical to those given on the test. The use of the function is described in gnrnd4.htm.
Example:
#           be sure the function is loaded
source("../gnrnd4.R")
#           run the program, the various key 
#           values are usually given on some
#           web page, but we could read the 
#           documentation to understand them 
gnrnd4( 1607653804, 11300762)
L1          #  then look at the result
gnrnd5() gnrnd5.R This script generates data in the variable L1, and possibly in L2, or in one case, in matrix_A depending upon two, or three, keys that are supplied as arguments to the function. It is typical to have these arguments supplied on test questions so that you can create tables of data that are identical to those given on the test. The use of the function is described in gnrnd5.htm.
Example:
#           be sure the function is loaded
source("../gnrnd5.R")
#           run the program, the various key 
#           values are usually given on some
#           web page, but we could read the 
#           documentation to understand them 
gnrnd5( 160765003804, 113000762)
L1          #  then look at the result
goodfit() goodfit.R For problems where we have multiple categories and we have a null hypothesis stating the probability (proportion) for each category, and we have the frequency of each category in some sample, this will do a χ² test on the goodness of fit for that sample against the null hypothesis.
Example:
#       set up the problem
sig_level <- 0.05      # the significance level of the test
cat_names <- 1:5        # the names of the categories, here 1, 2, 3, 4, 5
H_0 <- c(3, 5, 3, 7, 2)/20  # the proportions in the null hypothesis
obs <- c(11, 38, 25, 36, 10) # the observed frequencies
auto_view <- TRUE       # forces the function to display a fancy table
#       now be sure the function is loaded
source("../goodfit.R")
#       run the function
goodfit( cat_names, H_0, obs, sig_level, auto_view )
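The built-in chisq.test() function performs the same kind of goodness of fit test and can serve as a cross-check. This sketch uses only base R; it does not reproduce the table that goodfit() displays, and the name result is chosen for the illustration.

```r
# Sketch only: goodness of fit with the built-in chisq.test()
sig_level <- 0.05
H_0 <- c(3, 5, 3, 7, 2) / 20     # proportions in the null hypothesis
obs <- c(11, 38, 25, 36, 10)     # observed frequencies

result <- chisq.test(obs, p = H_0)
result$expected     # expected frequencies: 18 30 18 42 12
result$statistic    # the chi-squared test statistic
result$p.value      # compare this with sig_level
result$p.value < sig_level    # TRUE would mean reject the null hypothesis
```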
hypoth_2test_known() hypo_2known.R This script performs a test on H0: μ1 - μ2 = 0 against one of the standard alternative hypotheses based on two samples, of size n_1 and n_2, that give us sample means xbar_1 and xbar_2, and where we know the standard deviations, sigma_1 and sigma_2, of the underlying populations. We also need to specify the desired level of significance.
Example:
#           set up problem
sigma_1 <- 4.32  # population 1 std. dev.
sigma_2 <- 5.71  # population 2 std. dev.
n_1 <- 45        # sample 1 size
xbar_1 <- 56.2   # sample 1 mean
n_2 <- 58        # sample 2 size
xbar_2 <- 57.9   # sample 2 mean
#                  set up the type of test
H_type <- -1 # neg for <, 0 for !=, pos for >
alpha <- 0.05    # level of significance
#                make sure function is loaded
source("../hypo_2known.R")
#                run function
hypoth_2test_known( sigma_1, n_1, xbar_1,
                    sigma_2, n_2, xbar_2,
                    H_type, alpha )
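The underlying z test can be sketched in base R. This is the standard textbook computation, not the code in hypo_2known.R; the names std_err, z, and p_value are assumptions, and H_type is read the same way as in the example above.

```r
# Sketch only: two-sample z test with known population standard deviations
sigma_1 <- 4.32; n_1 <- 45; xbar_1 <- 56.2
sigma_2 <- 5.71; n_2 <- 58; xbar_2 <- 57.9
H_type <- -1     # neg for <, 0 for !=, pos for >
alpha <- 0.05

std_err <- sqrt(sigma_1^2 / n_1 + sigma_2^2 / n_2)
z <- (xbar_1 - xbar_2) / std_err    # test statistic under H0: mu_1 - mu_2 = 0

p_value <- if (H_type < 0) {
  pnorm(z)                          # left-tailed
} else if (H_type > 0) {
  pnorm(z, lower.tail = FALSE)      # right-tailed
} else {
  2 * pnorm(-abs(z))                # two-tailed
}
z
p_value
p_value < alpha    # TRUE means reject H0 at this level
```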
hypoth_2test_prop() hypo_2popproport.R This script performs a test on H0: p1 - p2 = 0 against one of the standard alternative hypotheses based on two samples, of size n_1 and n_2, that give us sample success counts x_1 and x_2. We also need to specify the desired level of significance.
Example:
#           set up problem
            # number of items in sample 1
x_1 <- 39   #      with characteristic
n_1 <- 83   # size of sample 1
            # number of items in sample 2
x_2 <- 44   #      with characteristic
n_2 <- 125  # size of sample 2
H_type <- 0 # neg for <, 0 for !=, pos for > 
alpha <- 0.05  # set significance level
#            be sure function is loaded
source("../hypo_2popproport.R")
#             run function
hypoth_2test_prop( x_1, n_1, x_2, n_2,
                   H_type, alpha )
hypoth_2test_unknown() hypo_2unknown.R This script performs a test on H0: μ1 - μ2 = 0 against one of the standard alternative hypotheses based on two samples, of size n_1 and n_2, that give us sample means xbar_1 and xbar_2, and the sample standard deviations s_1 and s_2. We also need to specify the desired level of significance. Results are given for both the simple degrees of freedom and the computed degrees of freedom.
Example:
#           set up problem
n_1 <- 45        # sample 1 size
xbar_1 <- 56.2   # sample 1 mean
s_1 <- 4.32      # sample 1 std. dev.
n_2 <- 58        # sample 2 size
xbar_2 <- 57.9   # sample 2 mean
s_2 <- 5.71      # sample 2 std. dev.
#                  set up the type of test
H_type <- -1 # neg for <, 0 for !=, pos for >  
alpha <- 0.05    # level of significance
#                make sure function is loaded
source("../hypo_2unknown.R")
#                run function
hypoth_2test_unknown( s_1, n_1, xbar_1,
                    s_2, n_2, xbar_2,
                    H_type, alpha )
hypoth_2test_var() hypo_2var.R This script performs a test on the equality of two population variances. The populations need to be normal. Arguments for the function include the sample sizes, n_top and n_bot, and the two sample standard deviations, s_top and s_bot. The type of the alternative hypothesis and the level of significance are also arguments.
Example:
#           set up problem
n_top <- 41   # size of sample in numerator
s_top <- 7.8  # std. dev. of numerator sample
n_bot <- 53   # size of sample in denominator
s_bot <- 6.4  # std. dev. of denominator sample
H_type <- -1 # neg for <, 0 for !=, pos for >  
alpha <- 0.05    # level of significance
#                make sure function is loaded
source("../hypo_2var.R")
#                run function
hypoth_2test_var( n_top, s_top, n_bot, 
                  s_bot, H_type, alpha )
hypoth_test_known() hypo_known.R This script performs a test on H0: μ = a against one of the standard alternative hypotheses based on a sample of size n_1 that gives us a sample mean, xbar_1, and where we know the standard deviation, sigma_1, of the underlying population. We also need to specify the desired level of significance.
Example:
#           set up problem
sigma_1 <- 11.43  # population std. dev.
n_1 <- 35         # sample size
xbar_1 <- 74.6    # sample mean
H0 <- 77.9        # null hypothesis value
H_type <- -1 # neg for <, 0 for !=, pos for >  
alpha <- 0.05    # level of significance
#                make sure function is loaded
source("../hypo_known.R")
#                run function
hypoth_test_known( H0, sigma_1, H_type,
                   alpha, n_1, xbar_1)
hypoth_test_prop() hypo_prop.R This script performs a test on H0: p = a against one of the standard alternative hypotheses based on a sample of size n_1 that gives us a count, x_1, of the items in the sample that display the characteristic of interest. We also need to specify the desired level of significance.
Example:
#          set up problem
n_1 <- 72   # sample size
            # number of items in the sample 
x_1 <- 28   #    with the characteristic
H0 <- 0.3        # null hypothesis value
H_type <-  1 # neg for <, 0 for !=, pos for >  
alpha <- 0.05    # level of significance
#                make sure function is loaded
source("../hypo_prop.R")
#                run function
hypoth_test_prop( H0, x_1, n_1, H_type,
                  alpha )
hypoth_test_sigma() hypo_sigma.R This script performs a test on H0: σ = a against one of the standard alternative hypotheses based on a sample of size n_1 that gives us a sample standard deviation, s_1. We also need to specify the desired level of significance. The test should only be used if you are quite sure that the underlying population has a normal distribution.
Example:
#          set up problem
n_1 <- 72   # sample size
s_1 <- 2.8  # sample standard deviation
H0 <- 3.0        # null hypothesis value
H_type <- -1 # neg for <, 0 for !=, pos for >  
alpha <- 0.05    # level of significance
#                make sure function is loaded
source("../hypo_sigma.R")
#                run function
hypoth_test_sigma( H0, n_1, s_1, H_type,
                  alpha )
hypoth_test_unknown() hypo_unknown.R This script performs a test on H0: μ = a against one of the standard alternative hypotheses based on a sample of size n_1 that gives us a sample mean, xbar_1, and where we do not know the standard deviation of the underlying population, but we do know the sample standard deviation, s_1. We also need to specify the desired level of significance.
Example:
#           set up problem
s_1 <- 11.43      # sample std. dev.
n_1 <- 35         # sample size
xbar_1 <- 74.6    # sample mean
H0 <- 77.9        # null hypothesis value
H_type <- -1 # neg for <, 0 for !=, pos for >  
alpha <- 0.05    # level of significance
#                make sure function is loaded
source("../hypo_unknown.R")
#                run function
hypoth_test_unknown( H0, H_type, alpha,
                   n_1, xbar_1, s_1 )
long_summary() long_summary.R This is an extension of the R function summary. This function gives all of the basic information but it augments that with the Q1 and Q3 values that would be computed by the TI-83/84 calculator. Furthermore, the long_summary function provides the size of the data list, the sum of the x's, the sum of the x² 's, the mean, the population standard deviation (sigma), and the sample standard deviation.
Example:
#           set up problem
L1 <- c(6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49)
source("../long_summary.R")
long_summary(L1)
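The difference between the calculator quartiles and R's default quartiles can be seen with a short sketch. The helper ti_quartiles() below is hypothetical and is not part of long_summary.R; it assumes the calculator's rule of taking the median of the lower half and of the upper half of the sorted data, leaving out the middle value when the number of items is odd.

```r
# Sketch only: calculator-style quartiles versus R's default
ti_quartiles <- function(x) {
  stopifnot(length(x) >= 2)    # need at least two values to form halves
  x <- sort(x)
  n <- length(x)
  half <- n %/% 2              # size of each half, middle value excluded
  lower <- x[seq_len(half)]
  upper <- x[(n - half + 1):n]
  c(Q1 = median(lower), Q3 = median(upper))
}

L1 <- c(6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49)
ti_quartiles(L1)               # Q1 = 15, Q3 = 43
quantile(L1, c(0.25, 0.75))    # R's default gives 25.5 and 42.5
```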

make_freq_table() make_freq_table.R This script creates a frequency table for given discrete data.
Example:
#           set up problem
#           generate some values
source("../gnrnd4.R")
gnrnd4( 480933203,800025)
L1        # look at those values
#           make sure function is loaded
source("../make_freq_table.R")
#           run the function
make_freq_table( L1 )
Mode() mode.R Finds the mode value or values in a set of values. Note that the function name starts with an upper case M. The function mode() is defined in R but it has a different use.
Example:
#           set up problem
#           generate some values
source("../gnrnd4.R")
gnrnd4( 490833203,800025)
L1        # look at those values
#           make sure function is loaded
source("../mode.R")   # note lower case m
#           run the function
Mode( L1 )   # note upper case M
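To see why a separate function is needed, here is a minimal base R sketch that finds the most frequent value or values with table(). It is not the code in mode.R, and the data in x is made up for the illustration.

```r
# Sketch only: most frequent values with table(), and what mode() really does
x <- c(3, 5, 5, 7, 7, 2)    # made-up data with two modes

counts <- table(x)          # frequency of each distinct value
modes <- as.numeric(names(counts)[counts == max(counts)])
modes                       # 5 7

mode(x)                     # "numeric": the storage mode, not a statistic
```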
model model.R This is not a function at all. This is a dummy file that I gave to students on their USB drive but which some of them have managed to delete. It is listed here so that students can save it to their USB drive if need be.
pbinomeq() pbinomeq.R This is a small function to find the probability of exactly one number, k, of successes in n trials with a probability of success given as p.
Example:
#           set up problem
n <- 14     # number of trials
k <-  5     # number of successes
p <- 0.32   # probability of success
#        make sure the functions are loaded
source("../pbinomeq.R")
#        run the function
pbinomeq( k, n, p )
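The probability of exactly k successes is what the built-in dbinom() function returns, so it can be used as a cross-check. This sketch uses only base R and does not depend on pbinomeq.R; the names prob and by_formula are chosen for the illustration.

```r
# Sketch only: probability of exactly k successes with base R
n <- 14      # number of trials
k <- 5       # number of successes
p <- 0.32    # probability of success

prob <- dbinom(k, size = n, prob = p)
by_formula <- choose(n, k) * p^k * (1 - p)^(n - k)    # the binomial formula
prob
all.equal(prob, by_formula)
```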
num_perm()
nPr()
permutations.R Functions to do permutations of n things taken r at a time.
Example:
#        make sure the functions are loaded
source("../permutations.R")
#        that actually loads two functions
#         num_perm() and nPr()
#      run both for 8 things taken 3 at a time
num_perm( 8, 3 )
nPr( 8, 3 )
pop_sd() pop_sd.R Computes the population standard deviation given the raw data.
Example:
#           set up problem
#           generate some values
source("../gnrnd4.R")
gnrnd4( 492833204,800035)
L1        # look at those values
#           make sure function is loaded
source("../pop_sd.R")  
#           run the function
pop_sd( L1 )
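The built-in sd() function divides by n - 1, which is why a separate population version is useful. A minimal base R sketch of the difference, with made-up data and not taken from pop_sd.R, is:

```r
# Sketch only: population versus sample standard deviation
x <- c(2, 4, 4, 4, 5, 5, 7, 9)    # made-up data
n <- length(x)

pop_value <- sqrt(mean((x - mean(x))^2))    # divides by n
samp_value <- sd(x)                         # divides by n - 1
pop_value     # 2
samp_value    # larger than pop_value
all.equal(pop_value, samp_value * sqrt((n - 1) / n))
```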
pprop() pprop.R Computes the normal approximation for the probability of getting phat (or less) from a proportion with probability of success p and sample size n.
Example:
#           set up problem
#           set the known values
phat <- 0.37  # proportion in sample
p    <- 0.40  # proportion known in population
n    <- 32    # sample size
#           make sure function is loaded
source("../pprop.R")  
#           run the function
#                   find prob of a sample
#                   of size n with proportion
#                   phat or less, if population
pprop( phat, p, n )  # has proportion=p
#                   same, but find probability
#                   of phat or more
pprop( phat, p, n, lower.tail=FALSE)
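The normal approximation itself can be sketched with pnorm(). The sketch assumes no continuity correction, which may or may not match what pprop() does, and the names std_err, prob_less, and prob_more are introduced for the illustration.

```r
# Sketch only: normal approximation for a sample proportion
phat <- 0.37    # proportion in sample
p <- 0.40       # proportion known in population
n <- 32         # sample size

std_err <- sqrt(p * (1 - p) / n)    # standard deviation of phat
prob_less <- pnorm(phat, mean = p, sd = std_err)                      # phat or less
prob_more <- pnorm(phat, mean = p, sd = std_err, lower.tail = FALSE)  # phat or more
prob_less
prob_more
```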
  starter.R This file just contains comments. It is meant to be downloaded and then used as a starting point for an RStudio session.
shuffle() shuffle.R This script creates a shuffled version of the list that is provided as the argument.
Example:
#           set up problem
L1 <- 1:100 # create a list of values 1 to 100
L1        # look at those values
#           make sure function is loaded
source("../shuffle.R")
#           run the function
shuffle( L1 )   # note that this does not change L1
L1
#   to shuffle a list and have the list retain that shuffle
L1 <- 1:100
L1
L1 <- shuffle( L1 ) # this changes the list L1
L1
stem_leaf() stem_leaf.R This script creates a stem and leaf diagram of the data where the user specifies the data and the position of the cut between the stem and the leaf values.
Example:
#           set up problem
#           generate some values
source("../gnrnd4.R")
gnrnd4( 492833204,600031)
L1        # look at those values
#           make sure function is loaded
source("../stem_leaf.R")  
#           run the function
stem_leaf( L1 )

©Roger M. Palay     Saline, MI 48176     November, 2017