The task here is to come up with descriptive measures for the data in Table 1.

This page assumes that you have read through earlier pages and that you have mastered the steps that we use to set up our work. To that end we will assume that we have

- inserted our USB drive,
- created a directory called `worksheet034` on that drive,
- copied `model.R` from our root folder into our new folder,
- renamed that new copy of the file to `ws34.R`, and
- double clicked on that file to open **RStudio**.

Much of the discussion that would be given here has been included in the comments in our `ws34.R` file.

The console view of those commands:

Then we add the commands to generate many of our descriptive statistics.

In Figure 5 we can see all of those values.
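As a minimal sketch of the kind of commands involved, the lines below compute the same descriptive measures on a small, made-up vector. The values in `x` and the helper `pop_sd_sketch` are assumptions for illustration only. In particular, `pop_sd_sketch` is not the course's `pop_sd()` function; it just shows the usual divide-by-n definition of the population standard deviation.

```r
# Hypothetical sample values, used only for illustration
x <- c(484, 490, 495, 501, 503, 508, 512, 517, 521, 526)

summary(x)          # minimum, quartiles, median, mean, maximum
xbar <- mean(x)     # sample mean
s <- sd(x)          # sample standard deviation (divides by n - 1)

# Population standard deviation (divides by n).
# A stand-in sketch, not the course's pop_sd() function.
pop_sd_sketch <- function(v) sqrt(mean((v - mean(v))^2))
sigma <- pop_sd_sketch(x)

xbar
s
sigma
```

Because the population form divides by n rather than n - 1, `sigma` comes out slightly smaller than `s` for the same data.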

Now we want to generate the frequency table. We are given specific direction on the start of the first

`ft`

`View(ft)`

In Figure 7 we can read the frequency table values in the

Figure 8 shows the same values, but in a fancier display.
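To see what goes into such a table, here is a rough sketch that builds one with base R only. It is not the `collate3()` function used on this page; the data in `x` and the name `ft_sketch` are hypothetical, and only the column names `Freq` and `midpnt` are borrowed from the real table. The first interval starts at the highest multiple of 10 that is less than or equal to the smallest value, and each interval includes its left endpoint but not its right one.

```r
# Hypothetical data, used only for illustration
x <- c(484, 490, 495, 501, 503, 508, 512, 517, 521, 526)

width <- 10
# Highest multiple of 10 that is less than or equal to the minimum
start <- floor(min(x) / width) * width

# Break points that cover every value; the last break exceeds max(x)
breaks <- seq(start, max(x) + width, by = width)

# right = FALSE makes each interval closed on the left: [480,490), ...
bins <- cut(x, breaks = breaks, right = FALSE)

freq <- as.vector(table(bins))            # count in each interval
midpnt <- head(breaks, -1) + width / 2    # midpoint of each interval

ft_sketch <- data.frame(interval = levels(bins),
                        Freq = freq,
                        midpnt = midpnt)
ft_sketch
```

With `right = FALSE` a value such as 490 falls in the interval that begins at 490, not the one that ends there.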

As stated in the comment shown in Figure 9, we really could just re-enter the five values for the frequencies and the five values for the midpoints. However, those values are already in the system (we did, after all, see them in the table), so we can go look for them. The command `str(ft)` displays the structure of `ft`.

In Figure 10 we note that the frequencies are in `ft$Freq` and the midpoints are in `ft$midpnt`.

We use those two names just to prove to ourselves that they hold the values that we need.

A run of those commands gives the output shown in Figure 12.

That is good enough for us so we can use the two variables to generate our desired sum.

Figure 14 shows the resulting computed value. It checks with the value given in the problem so we feel confident that we have done the right thing.
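The same pattern can be tried on a small stand-in table. In the sketch below the data frame `ft_demo` and its values are made up for illustration; only the column names `Freq` and `midpnt` match the ones found in `ft`.

```r
# Hypothetical table with the same column names as ft
ft_demo <- data.frame(
  Freq   = c(1, 2, 3, 2, 2),
  midpnt = c(485, 495, 505, 515, 525)
)

str(ft_demo)                      # confirm the column names and types

ft_demo$Freq * ft_demo$midpnt     # element-by-element products

# Sum of the products of the frequencies and the midpoints
total <- sum(ft_demo$Freq * ft_demo$midpnt)
total
```

The multiplication is done element by element, so each frequency is paired with the midpoint in the same row before `sum()` adds the products.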

In Figure 15 we form the command to generate a

Running that command does not provide much in the

However, in the

The

`hist(L1)`

That
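For readers who want the numbers behind the two plots, the following sketch asks `boxplot()` and `hist()` for their computed values without drawing anything. The data in `x` are hypothetical; `plot = FALSE` is a standard argument of both functions.

```r
# Hypothetical data, used only for illustration
x <- c(484, 490, 495, 501, 503, 508, 512, 517, 521, 526)

# Statistics used to draw the box and whiskers
b <- boxplot(x, horizontal = TRUE, plot = FALSE)
b$stats     # lower whisker, lower hinge, median, upper hinge, upper whisker

# Bins and counts used to draw the histogram
h <- hist(x, plot = FALSE)
h$breaks    # interval boundaries chosen by hist()
h$counts    # number of values in each interval
```

Note that `hist()` chooses its own break points by default, so its intervals need not match those of a frequency table built with a chosen starting value and width.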

All that is left to do is to save our

`ws34.R`

`q()`

`y`

Here is a listing of the complete contents of the `ws34.R` file:

    #This is for worksheet034
    # first load some functions
    source("../gnrnd4.R")
    source("../pop_sd.R")
    source("../collate3.R")
    # then generate and look at the data for the table.
    gnrnd4( key1=1142946902, key2=44004840 )
    L1
    # then get the usual descriptive measures
    summary( L1 )
    xbar <- mean( L1 )
    xbar
    sd(L1)
    pop_sd(L1)
    # now build and then display the frequency table
    # Note that we know, from the summary function, the lowest
    # value to be 484, and the problem asks that we start
    # the first interval at the highest multiple of 10
    # less than or equal to that.  So, our first interval
    # starts at 480.
    ft <- collate3( L1, 480, 10, right=FALSE)
    ft
    View(ft)
    # The problem gives us the fact that the sum of the products
    # of the frequencies and the midpoints needs to be 35000.
    # We should compute this just to be sure we have the right
    # breakdown of the data.  There are not many values so we
    # could just type them into the system.  However, we know
    # that all the values are in the variable ft.  Look at
    # the structure of ft
    str( ft )
    # This means that the frequencies are in ft$Freq and the
    # midpoints are in ft$midpnt.  Just to be sure we will
    # look at those two lists
    ft$Freq
    ft$midpnt
    # Then the sum we want to find is
    sum(ft$Freq*ft$midpnt)
    # And, while we are at it, we might as well generate
    # a box and whisker plot of the original data
    boxplot(L1, horizontal=TRUE)
    # Then too, we can do a histogram of the original data
    hist( L1 )

©Roger M. Palay Saline, MI 48176 January, 2017