Script Name | File Names | Brief Description of what the script does |
papfelton() qapfelton() |
apfelton.R | Cumulative probability and quantile functions for the Apfelton Distribution
(see web pages)
Examples:
source("../apfelton.R")              # load the functions
papfelton(2.34)                      # returns area to the left of 2.34
papfelton( 1.74, lower.tail=FALSE)   # returns area to
                                     # the right of 1.74
qapfelton( 0.783 )                   # returns x value that has
                                     # 0.783 area to its left
qapfelton( 0.145, lower.tail=FALSE ) # returns x value
                                     # that has 0.145 area to its right |
assess_normality() | assess_normality.R | Produces a plot to help assess if a set of
values is normally distributed.
Example:
source("../gnrnd4.R")            # be sure gnrnd4() is loaded
gnrnd4(1394565804,8300542)       # generate some values
source("../assess_normality.R")  # be sure assess_normality() is loaded
assess_normality( L1 )           # do the work, create the plot |
pblumenkopf() qblumenkopf() |
blumenkopf.R | Cumulative probability and quantile functions for the Blumenkopf Distribution
(see web pages)
Examples:
source("../blumenkopf.R")              # load the functions
pblumenkopf(2.34)                      # returns area to the left of 2.34
pblumenkopf( 1.74, lower.tail=FALSE)   # returns area to
                                       # the right of 1.74
qblumenkopf( 0.783 )                   # returns x value that has
                                       # 0.783 area to its left
qblumenkopf( 0.145, lower.tail=FALSE ) # returns x value
                                       # that has 0.145 area to its right |
ci_2known() | ci_2known.R | Finds the confidence interval for
the difference of two population means
where we know both population standard deviations,
sigma_1 and sigma_2, and
we have samples of size n_1 and n_2
with sample means xbar_1 and xbar_2.
Example:
# set up the problem
# know the population standard deviations
sigma_1 <- 11.3   # for 1st population
sigma_2 <- 13.8   # for 2nd population
# set values for the two samples
n_1 <- 57         # size of 1st sample
xbar_1 <- 34.76   # mean of 1st sample
n_2 <- 41         # size of 2nd sample
xbar_2 <- 37.41   # mean of 2nd sample
source("../ci_2known.R")   # load function
# get a 97.5% confidence interval for mean_1 - mean_2
ci_2known( sigma_1, n_1, xbar_1, sigma_2, n_2, xbar_2, 0.975) |
ci_2popproportion() | ci_2popproport.R | Finds the confidence interval for
the difference of two population proportions
where
we have samples of size n_1 and n_2
with sample successes called x_1 and x_2.
Example:
# set up problem
n_1 <- 76         # size of first sample
                  # number of items with characteristic in
x_1 <- 37         #   first sample
n_2 <- 93         # size of second sample
                  # number of items with characteristic in
x_2 <- 44         #   second sample
c_level <- 0.95   # set confidence level to 0.95
# be sure function is in environment
source("../ci_2popproport.R")
# run the function
ci_2popproportion( n_1, x_1, n_2, x_2, c_level ) |
ci_2popvar() | ci_2popvar.R | Finds the confidence interval for the ratio
of two population variances
where
we have samples of size n_1 and n_2
with sample standard deviations called s_1 and s_2.
Example:
# set up problem
n_1 <- 34         # number of items in numerator sample
s_1 <- 12.4       # standard deviation of numerator sample
n_2 <- 52         # number of items in denominator sample
s_2 <- 11.3       # standard deviation of denominator sample
c_level <- 0.90   # set the confidence level
# be sure function is in environment
source("../ci_2popvar.R")
# run the function
ci_2popvar( n_1, s_1, n_2, s_2, c_level) |
ci_2unknown() | ci_2unknown.R | Finds the confidence interval for
the difference of two population means
where we do not know the population standard deviations and
we have samples of size n_1 and n_2
with sample means xbar_1 and xbar_2,
and sample standard deviations s_1
and s_2.
Results are given for both the simple degrees of freedom
and the computed degrees of freedom.
Example:
# set up the problem
# do not know the population standard deviations
# set values for the two samples
n_1 <- 57         # size of 1st sample
xbar_1 <- 34.76   # mean of 1st sample
s_1 <- 11.3       # standard deviation of 1st sample
n_2 <- 41         # size of 2nd sample
xbar_2 <- 37.41   # mean of 2nd sample
s_2 <- 13.8       # standard deviation of 2nd sample
source("../ci_2unknown.R")   # load function
# get a 97.5% confidence interval for mean_1 - mean_2
ci_2unknown( s_1, n_1, xbar_1, s_2, n_2, xbar_2, 0.975) |
ci_known() | ci_known.R | Finds the confidence interval for a population mean
where we know the population standard deviation
sigma and we have a sample of size n_1
with a sample mean xbar_1.
Example:
# set up problem
sigma <- 4.56     # known standard dev. of population
n_1 <- 34         # sample size
xbar_1 <- 28.4    # sample mean
c_level <- 0.85   # confidence level set to 0.85
# be sure that the function is loaded
source("../ci_known.R")
# run the function
ci_known( sigma, n_1, xbar_1, c_level ) |
ci_prop() | ci_prop.R | Computes a confidence interval for the
proportion given the sample size, the
number of items with the characteristic,
and the confidence level.
Example:
# set up problem
n_1 <- 65         # size of sample
                  # number of items in sample with
x_1 <- 24         #   the desired characteristic
c_level <- 0.95   # set confidence level to be 0.95
# be sure function is loaded
source("../ci_prop.R")
# run the function
ci_prop( n_1, x_1, c_level ) |
ci_stddev() | ci_stddev.R | Finds the confidence interval for a population
standard deviation based on a sample size,
sample standard deviation, and confidence level
desired.
Example:
# set up problem
n_1 <- 26          # size of the sample
s_1 <- 5.34        # set standard dev. of sample
c_level <- 0.925   # set confidence level to 0.925
# make sure function is loaded
source("../ci_stddev.R")
# run function
ci_stddev( n_1, s_1, c_level ) |
ci_unknown() | ci_unknown.R | Finds the confidence interval for a population mean
where we do not know the population standard deviation
but we do have a sample of size n_1
with a sample mean xbar_1 and
a sample standard deviation s_1.
Example:
# set up problem
s_1 <- 4.56       # standard dev. of sample
n_1 <- 34         # sample size
xbar_1 <- 28.4    # sample mean
c_level <- 0.85   # confidence level set to 0.85
# be sure that the function is loaded
source("../ci_unknown.R")
# run the function
ci_unknown( s_1, n_1, xbar_1, c_level ) |
collate3() | collate3.R | This script produces a frequency table for non-discrete data.
The script is often run twice, first with just the data specified and second,
based upon the output of the first run, with the data, the low value of the first
"bucket" and the bucket width.
The name is left over from a program developed on the TI83/84
calculators to do the same thing.
Example:
# set up problem
# first generate some values
source("../gnrnd4.R")
gnrnd4(2768424504,142513276)
# look at the data
L1
# be sure the function is loaded
source("../collate3.R")
# run it the first time
collate3( L1 )
# the output of that will give us
# a good idea of the low value for the
# first "bucket", and the width of
# the buckets. Here we will use 105
# and 5 respectively
low_val <- 105
bucket_width <- 5
collate3( L1, use_low=low_val, use_width=bucket_width)
#
# by default these are closed on the
# right, we will run again, closed on
# the left and we will store and then
# view the result
holder <- collate3( L1, use_low=low_val, use_width=bucket_width, right=FALSE)
View( holder )   # note the capital V |
num_comb() nCr() num_perm() nPr() |
combinations.R | Functions to do combinations of n things taken r at a time.
(Note this includes the functions for permutations as well.)
Example:
# make sure the functions are loaded
source("../combinations.R")
# that actually loads four functions
# num_comb(), nCr(), num_perm(), and
# nPr()
# run each for 8 things taken 3 at a time
num_comb( 8, 3 )
nCr( 8, 3 )
num_perm( 8, 3 )
nPr( 8, 3 ) |
crosstab() | crosstab.R | Function to provide not only the
cross tabulation for a matrix but also to provide
the expected values, the row, column, and total percents, and the
intermediate steps to perform a χ² test for independence
on the original matrix.
Example:
# set up the problem
# first we will generate a crosstab matrix
source("../gnrnd4.R")
gnrnd4( key1=783566808, key2=8756454753 )
# then look at it
matrix_A
# now be sure the function is loaded
source("../crosstab.R")
# run the function
crosstab( matrix_A )
# this produces a little console output and
# numerous tabs in the editor pane |
dot_plot() | dot_plot.R | Produces a dot plot of the data.
Example:
# set up the problem
# first we will generate a list of values
source("../gnrnd4.R")
gnrnd4( 507093402, 1200148 )
# then look at it
L1
# now be sure the function is loaded
source("../dot_plot.R")
# run the function
dot_plot( L1 )
# this produces a dot plot of the values |
find_percentile() | find_percentile.R |
This function, based on a given list of values and a goal value,
computes the percentile for that goal value in the given list.
Example:
# set up the problem
# Assuming L1 holds a list of numeric values
goal_val <- 348   # we want to know the percentile for
                  # 348 in that list
# now be sure the function is loaded
source("../find_percentile.R")
# run the function
find_percentile( L1, goal_val ) |
find_samp_size() | findsampsize.R |
For confidence intervals for the population mean,
this function finds the required sample size
for a desired margin of error value
given the population standard deviation
and the confidence level.
Example:
# set up the problem
sigma <- 13.24    # the population std. dev.
c_level <- 0.95   # set confidence level at 0.95
m_o_e <- 2.75     # the desired margin of error
# now be sure the function is loaded
source("../findsampsize.R")
# run the function
find_samp_size( sigma, c_level, m_o_e ) |
get_from_table() | get_from_table.R | This script takes the low and high values for
the first interval of an interval-based frequency table, along
with a list of the frequencies for the intervals, and produces
the sum of the frequencies, the approximate mean of the data,
and the standard deviation of the data, computed by
treating the midpoint of each interval as occurring the given frequency
number of times. The script also produces a new variable
from_table_x that holds all of those midpoint values.
Example:
# be sure the function is loaded
source("../get_from_table.R")
# run the program. We need to give the
# program the low and high values of
# the first interval of the frequency table,
# followed by the list of interval frequencies.
# The command below assumes that the first
# interval is from 19 to 36 and that there are
# six intervals with the frequencies 23, 28, 15,
# 32, 19, and 27
get_from_table( 19, 36, c(23, 28, 15, 32, 19, 27) )
# along with the output of the function we get
# a new variable that holds the midpoints repeated
# the number of times given by the frequencies
from_table_x   # then look at the list of midpoints |
gnrnd4() | gnrnd4.R | This script generates data in the variable L1,
and possibly in L2, depending upon two, or three, keys that are
supplied as arguments to the function. It is typical to have
these arguments supplied on test questions so that you can
create tables of data that are identical to those given on the test.
The use of the function is described in
gnrnd4.htm.
Example:
# be sure the function is loaded
source("../gnrnd4.R")
# run the program, the various key
# values are usually given on some
# web page, but we could read the
# documentation to understand them
gnrnd4( 1607653804, 11300762)
L1   # then look at the result |
gnrnd5() | gnrnd5.R | This script generates data in the variable L1,
and possibly in L2, or in one case, in matrix_A
depending upon two, or three, keys that are
supplied as arguments to the function. It is typical to have
these arguments supplied on test questions so that you can
create tables of data that are identical to those given on the test.
The use of the function is described in
gnrnd5.htm.
Example:
# be sure the function is loaded
source("../gnrnd5.R")
# run the program, the various key
# values are usually given on some
# web page, but we could read the
# documentation to understand them
gnrnd5( 160765003804, 113000762)
L1   # then look at the result |
goodfit() | goodfit.R |
For problems where we have multiple categories and we have a
null hypothesis stating the probability (proportion) for each
category, and we have the frequency of each category in some sample,
this will do a χ² test on the goodness of fit
for that sample against the null hypothesis.
Example:
# set up the problem
sig_level <- 0.05                 # the significance level of the test
cat_names <- 1:5                  # the names of the categories, here 1, 2, 3, 4, 5
H_0 <- c(3, 5, 3, 7, 2)/20        # the proportions in the null hypothesis
obs <- c(11, 38, 25, 36, 10)      # the observed frequencies
auto_view <- TRUE                 # forces the function to display a fancy table
# now be sure the function is loaded
source("../goodfit.R")
# run the function
goodfit( cat_names, H_0, obs, sig_level, auto_view ) |
hypoth_2test_known() | hypo_2known.R | This script performs a test on
H0: μ1 - μ2 = 0
against one of the standard alternative hypotheses based on
two samples, of size n_1 and n_2,
that give us sample means xbar_1 and xbar_2,
and where we know the standard deviations, sigma_1 and sigma_2,
of the underlying populations.
We also need to specify the desired level of significance.
Example:
# set up problem
sigma_1 <- 4.32   # population 1 std. dev.
sigma_2 <- 5.71   # population 2 std. dev.
n_1 <- 45         # sample 1 size
xbar_1 <- 56.2    # sample 1 mean
n_2 <- 58         # sample 2 size
xbar_2 <- 57.9    # sample 2 mean
# set up the type of test
H_type <- -1      # neg for <, 0 for !=, pos for >
alpha <- 0.05     # level of significance
# make sure function is loaded
source("../hypo_2known.R")
# run function
hypoth_2test_known( sigma_1, n_1, xbar_1, sigma_2, n_2, xbar_2, H_type, alpha ) |
hypoth_2test_prop() | hypo_2popproport.R | This script performs a test on
H0: p1 - p2 = 0
against one of the standard alternative hypotheses based on
two samples, of size n_1 and n_2,
that give us sample success counts x_1 and x_2.
We also need to specify the desired level of significance.
Example:
# set up problem
                 # number of items in sample 1
x_1 <- 39        #   with characteristic
n_1 <- 83        # size of sample 1
                 # number of items in sample 2
x_2 <- 44        #   with characteristic
n_2 <- 125       # size of sample 2
H_type <- 0      # neg for <, 0 for !=, pos for >
alpha <- 0.05    # set significance level
# be sure function is loaded
source("../hypo_2popproport.R")
# run function
hypoth_2test_prop( x_1, n_1, x_2, n_2, H_type, alpha ) |
hypoth_2test_unknown() | hypo_2unknown.R | This script performs a test on
H0: μ1 - μ2 = 0
against one of the standard alternative hypotheses based on
two samples, of size n_1 and n_2,
that give us sample means xbar_1 and xbar_2,
and the sample standard deviations s_1 and s_2.
We also need to specify the desired level of significance.
Results are given for both the simple degrees of freedom
and the computed degrees of freedom.
Example:
# set up problem
n_1 <- 45        # sample 1 size
xbar_1 <- 56.2   # sample 1 mean
s_1 <- 4.32      # sample 1 std. dev.
n_2 <- 58        # sample 2 size
xbar_2 <- 57.9   # sample 2 mean
s_2 <- 5.71      # sample 2 std. dev.
# set up the type of test
H_type <- -1     # neg for <, 0 for !=, pos for >
alpha <- 0.05    # level of significance
# make sure function is loaded
source("../hypo_2unknown.R")
# run function
hypoth_2test_unknown( s_1, n_1, xbar_1, s_2, n_2, xbar_2, H_type, alpha ) |
hypoth_2test_var() | hypo_2var.R | This script performs a test on the equality
of two population variances. The populations need to be normal.
Arguments for the function include the sample sizes, n_top and n_bot,
and the two sample standard deviations, s_top and s_bot.
The type of the alternative hypothesis and the
level of significance are also arguments.
Example:
# set up problem
n_top <- 41     # size of sample in numerator
s_top <- 7.8    # std. dev. of numerator sample
n_bot <- 53     # size of sample in denominator
s_bot <- 6.4    # std. dev. of denominator sample
H_type <- -1    # neg for <, 0 for !=, pos for >
alpha <- 0.05   # level of significance
# make sure function is loaded
source("../hypo_2var.R")
# run function
hypoth_2test_var( n_top, s_top, n_bot, s_bot, H_type, alpha ) |
hypoth_test_known() | hypo_known.R | This script performs a test on H0: μ = a
against one of the standard alternative hypotheses based on a sample of size n_1
that gives us a sample mean, xbar_1, and where we know the standard deviation,
sigma_1,
of the underlying population. We also need to specify the desired level of significance.
Example:
# set up problem
sigma_1 <- 11.43   # population std. dev.
n_1 <- 35          # sample size
xbar_1 <- 74.6     # sample mean
H0 <- 77.9         # null hypothesis value
H_type <- -1       # neg for <, 0 for !=, pos for >
alpha <- 0.05      # level of significance
# make sure function is loaded
source("../hypo_known.R")
# run function
hypoth_test_known( H0, sigma_1, H_type, alpha, n_1, xbar_1) |
hypoth_test_prop() | hypo_prop.R | This script performs a test on
H0: p = a
against one of the standard alternative hypotheses
based on a sample of size n_1
that gives us a count, x_1, of the items in the sample
that display the characteristic of interest.
We also need to specify the desired level of significance.
Example:
# set up problem
n_1 <- 72       # sample size
                # number of items in the sample
x_1 <- 28       #   with the characteristic
H0 <- 0.3       # null hypothesis value
H_type <- 1     # neg for <, 0 for !=, pos for >
alpha <- 0.05   # level of significance
# make sure function is loaded
source("../hypo_prop.R")
# run function
hypoth_test_prop( H0, x_1, n_1, H_type, alpha ) |
hypoth_test_sigma() | hypo_sigma.R | This script performs a test on
H0: σ = a
against one of the standard alternative hypotheses based on a
sample of size n_1
that gives us a sample standard deviation, s_1.
We also need to specify the desired level of significance.
The test should only be used if you are quite sure that the underlying
population has a normal distribution.
Example:
# set up problem
n_1 <- 72       # sample size
s_1 <- 2.8      # sample standard deviation
H0 <- 3.0       # null hypothesis value
H_type <- -1    # neg for <, 0 for !=, pos for >
alpha <- 0.05   # level of significance
# make sure function is loaded
source("../hypo_sigma.R")
# run function
hypoth_test_sigma( H0, n_1, s_1, H_type, alpha ) |
hypoth_test_unknown() | hypo_unknown.R | This script performs a test on H0: μ = a
against one of the standard alternative hypotheses based on a sample of size n_1
that gives us a sample mean, xbar_1, and where we do not know the standard deviation
of the underlying population,
but we do know the sample standard deviation, s_1. We also need to specify the desired level of significance.
Example:
# set up problem
s_1 <- 11.43     # sample std. dev.
n_1 <- 35        # sample size
xbar_1 <- 74.6   # sample mean
H0 <- 77.9       # null hypothesis value
H_type <- -1     # neg for <, 0 for !=, pos for >
alpha <- 0.05    # level of significance
# make sure function is loaded
source("../hypo_unknown.R")
# run function
hypoth_test_unknown( H0, H_type, alpha, n_1, xbar_1, s_1 ) |
long_summary() | long_summary.R | This is an extension of the R function summary.
This function gives all of the basic information but it augments that with
the Q1 and Q3 values
that would be computed by the TI-83/84 calculator. Furthermore, the long_summary
function provides the size of the data list, the sum of the x's, the sum of the x² 's,
the mean, the population standard deviation (sigma), and the sample standard deviation.
Example:
# set up problem
L1 <- c(6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49)
source("../long_summary.R")
long_summary(L1) |
make_freq_table() | make_freq_table.R | This script creates a frequency table for given
discrete data.
Example:
# set up problem
# generate some values
source("../gnrnd4.R")
gnrnd4( 480933203,800025)
L1   # look at those values
# make sure function is loaded
source("../make_freq_table.R")
# run the function
make_freq_table( L1 ) |
Mode() | mode.R | Find the mode value or values in a set of values.
Note that the function name starts with an upper case M. The
function mode() is defined in R but it has a different use.
Example:
# set up problem
# generate some values
source("../gnrnd4.R")
gnrnd4( 490833203,800025)
L1   # look at those values
# make sure function is loaded
source("../mode.R")   # note lower case m
# run the function
Mode( L1 )            # note upper case M |
model | model.R | This is not a function at all. This is a dummy file that I gave to students on their USB drive but which some of them have managed to delete. It is listed here so that students can save it to their USB drive if need be. |
pbinomeq() | pbinomeq.R | This is a small function to find the
probability of exactly one number, k, of successes
in n trials with a probability of success given as p.
Example:
# set up problem
n <- 14     # number of trials
k <- 5      # number of successes
p <- 0.32   # probability of success
# make sure the function is loaded
source("../pbinomeq.R")
# run the function
pbinomeq( k, n, p ) |
num_perm() nPr() |
permutations.R | Functions to do permutations of n things taken r at a time.
Example:
# make sure the functions are loaded
source("../permutations.R")
# that actually loads two functions
# num_perm() and nPr()
# run both for 8 things taken 3 at a time
num_perm( 8, 3 )
nPr( 8, 3 ) |
pop_sd() | pop_sd.R | Computes the population standard deviation given the
raw data.
Example:
# set up problem
# generate some values
source("../gnrnd4.R")
gnrnd4( 492833204,800035)
L1   # look at those values
# make sure function is loaded
source("../pop_sd.R")
# run the function
pop_sd( L1 ) |
pprop() | pprop.R | Computes the normal approximation for the
probability of getting phat (or less) from a proportion with probability of success
p and sample size n.
Example:
# set up problem
# set the values
phat <- 0.37   # proportion in sample
p <- 0.40      # proportion known in population
n <- 32        # sample size
# make sure function is loaded
source("../pprop.R")
# run the function
                     # find prob of a sample
                     # of size n with proportion
                     # phat or less, if population
pprop( phat, p, n )  # has proportion=p
# same, but find probability
# of phat or more
pprop( phat, p, n, lower.tail=FALSE) |
starter.R | This file just contains comments. It is meant to be downloaded and then used as a starting point for an RStudio session. | |
shuffle() | shuffle.R | This script creates a shuffled version of the list that is
provided as the argument.
Example:
# set up problem
L1 <- 1:100   # create a list of values 1 to 100
L1            # look at those values
# make sure function is loaded
source("../shuffle.R")
# run the function
shuffle( L1 )
# note that this does not change L1
L1
# to shuffle a list and have the list retain that shuffle
L1 <- 1:100
L1
L1 <- shuffle( L1 )   # this changes the list L1
L1 |
stem_leaf() | stem_leaf.R | This script creates a stem and leaf
diagram of the data where the user specifies the data and the position of the
cut between the stem and the leaf values.
Example:
# set up problem
# generate some values
source("../gnrnd4.R")
gnrnd4( 492833204,600031)
L1   # look at those values
# make sure function is loaded
source("../stem_leaf.R")
# run the function
stem_leaf( L1 ) |
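
For readers who want to check results from a few of the scripts listed above, the sketch below shows the usual textbook computations in base R only. It is not the code in ci_known.R, findsampsize.R, pop_sd.R, or combinations.R, and it may not match their output format. Every function name ending in _sketch is hypothetical and introduced here for illustration.

```r
# Hypothetical helpers: the *_sketch names are assumptions, not part of the scripts above.

# z-based confidence interval for a population mean when sigma is known
ci_known_sketch <- function(sigma, n, xbar, c_level) {
  z <- qnorm(1 - (1 - c_level) / 2)   # two-sided critical value
  moe <- z * sigma / sqrt(n)          # margin of error
  c(low = xbar - moe, high = xbar + moe, moe = moe)
}

# smallest sample size that keeps the margin of error at or below m_o_e
samp_size_sketch <- function(sigma, c_level, m_o_e) {
  z <- qnorm(1 - (1 - c_level) / 2)
  ceiling((z * sigma / m_o_e)^2)
}

# population standard deviation: divide by n rather than n - 1
pop_sd_sketch <- function(x) sqrt(mean((x - mean(x))^2))

# combinations and permutations of n things taken r at a time
n_comb_sketch <- function(n, r) choose(n, r)
n_perm_sketch <- function(n, r) choose(n, r) * factorial(r)

# same input values as the ci_known(), find_samp_size(), and
# combinations examples in the table
ci_known_sketch(4.56, 34, 28.4, 0.85)
samp_size_sketch(13.24, 0.95, 2.75)
pop_sd_sketch(c(6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49))
n_comb_sketch(8, 3)   # 56
n_perm_sketch(8, 3)   # 336
```

Note that the built-in sd() divides by n - 1, which is why a separate population version is sketched here.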