For both correlation and regression we assume that we are dealing with data that is given as pairs of values, an X value and a corresponding Y value. The table below identifies such pairs of data, and it gives an index value to each pair. With the assigned index value we can talk about the fifth pair and see that the fifth pair is X=45.5 and Y=30.6. The following table gives the pairs of values that we will use for this page.
| As usual, we run the GNRND4 program, giving it the correct key values, in order to generate the data on our calculator. The X list will be stored in L1. The Y list will be stored in L2. |
|
The program displays the two lists of values and terminates. We go to the Stat menu, via , and then press to paste the SetUpEditor command here. Then press to have the calculator perform the command. |
|
With this much data we might feel more comfortable knowing that the numbers in our calculator lists match the values in the table above. Use to open the StatEditor as shown in Figure 3. We compare these values to the first seven pairs of X and Y values in the table. Then we can move down the list to see and check the remaining values. |
|
Figure 4 shows the final pairs of values. These correspond to the final values in the table above. |
|
Now confident that we have the right values, we move to the STAT menu, and then to the CALC sub-menu using . There we use the key to move the highlight down to the fourth item, LinReg(ax+b). At that point we can press to paste that command onto our main screen. |
|
Once pasted here, we press to get the calculator to perform the task. |
|
Here is the result. The equation of the line that best fits the
data is given as
If the last two lines of output are not showing on your calculator it is because you do not have Diagnostics turned On. |
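For readers who want to check this kind of output away from the calculator, the following is a minimal Python sketch of the standard least-squares computation behind a report of slope a, intercept b, r, and r². The five data pairs and the function name lin_reg are made up for illustration; they are not the table used on this page.

```python
# Hypothetical sample data for illustration only (NOT the table from this page).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]


def lin_reg(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b, r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                   # slope
    b = mean_y - a * mean_x         # intercept
    r = sxy / (sxx * syy) ** 0.5    # correlation coefficient
    return a, b, r


a, b, r = lin_reg(xs, ys)
print("a =", round(a, 4))
print("b =", round(b, 4))
print("r^2 =", round(r * r, 4))
print("r =", round(r, 4))
```

The values of r and r² printed at the end correspond to the last two lines of the calculator output, the ones that appear only when Diagnostics are turned On.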
|
We want to have the calculator graph the linear regression equation. To do this we use to move to the Y= screen. Once we are there, we could just type in the equation. However, we might as well use the features of the calculator and let it paste the equation here. We just need to find where the calculator gives us access to that equation. |
|
Use to open the VARS menu, and then use to move the highlight to the Statistics option. Press to open the Statistics sub-menu. |
|
At the Statistics sub-menu, use to move to the EQ sub-menu. |
|
In Figure 11 we see that the first option, RegEQ, is the one we want. Press to use that option. |
|
The calculator then pastes the entire regression equation into the Y= screen. |
|
Just to be sure things are set up the way we want them, we press
to see the
STAT PLOT menu. Here we note that Plot1 is On, that it is set to
do a scatter plot, and that the scatter plot will be based on the
values in L1
and L2. This is just what we want.
If any of those settings were incorrect then we would have had to open the Plot1 menu and make the required changes. |
|
Now that everything is set we move to the ZOOM menu via the
key. From long experience we know that we want to use
option 9, ZoomStat, even though that option is not visible on the current screen.
Nonetheless, we press and the calculator
continues with the ZoomStat option. This causes the calculator to look at the data for the plot, to set the WINDOW values appropriately, and then to draw the actual graph. |
|
In Figure 15 we see the plot of the 30 pairs of values from the original table along with the
graph of the linear regression equation.
Unlike the example in the earlier web page on this topic, the data points plotted here are not quite so close to the graphed line. We should have expected this based on the much lower value for the correlation coefficient in this example. Remember that the closer the correlation coefficient is to 1 or –1, the smaller the spread of plotted points from the regression line. Our correlation coefficient in this example, |
|
We are interested in using the regression equation to find
expected values. One way to do this is to
move into TRACE mode via the key.
Note that in TRACE mode the calculator first moves to trace the plotted points. Furthermore, the calculator "traces" those points in the order in which they were given in the original data lists. Thus, the first point in the list, X=88.1 and Y=73.5, is the highlighted point. It is hard to see in a still image such as Figure 16, but the highlight is on that point, the one in the upper right corner. A comparison to the image in Figure 15 will demonstrate the slight difference present when this image was captured. |
|
Press to move the trace to the graph of the regression equation. The calculator chooses a point on the line midway across the screen as its initial trace point on the line. Although we did not ask for it, this shows that when X=63.05 the expected value of Y is 48.943909. |
|
We can force the calculator to highlight a specific point on the line by giving the calculator the desired X value. In Figure 18 we have entered 71.7. Once we press the key the calculator will move to that X value, position the blinking cursor on the graph of the regression equation, and give us the Y value at the bottom of the screen. |
|
Figure 19 shows us that 54.4247 is the
expected value when X=71.7.
Of course, looking at the original data values in the table above, we know that X=71.7 is one of those values. It is in index position number 12 in the list. The corresponding Y value is 49. On the graph this is the plotted point directly below the highlight. Thus, the observed value at X=71.7 is 49 and the expected value at X=71.7 is 54.4247. The difference, (observed) – (expected), is called the residual value at X=71.7. Using our numbers, the residual value at X=71.7 is 49 – 54.4247 = –5.4247. |
|
In Figure 20 we have entered a new X value, 78, and
pressed to get the calculator to "trace" that
value. The image in Figure 20 has been altered to place an arrow
showing the location of the highlighted point. From this screen
we know that 58.41649 is the expected value for X=78.
There is no observed value for X=78 because none of the
points in the original table had an X value of 78.
The safest way to leave Figure 20 is to use the key sequence. |
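The trace readouts above can be cross-checked with a little arithmetic. The Python sketch below takes two of the (X, Y) readouts quoted on this page, computes the line through them, and confirms that the same line gives the third readout. Because the readouts are rounded by the calculator, the slope and intercept recovered this way are only approximations of the calculator's stored coefficients; the names slope, intercept, and expected are chosen here for illustration.

```python
# Two (X, Y) readouts taken from the TRACE screens described on this page.
x1, y1 = 63.05, 48.943909
x2, y2 = 78.0, 58.41649

# Line through the two traced points (approximate, since the readouts are rounded).
slope = (y2 - y1) / (x2 - x1)
intercept = y1 - slope * x1


def expected(x):
    """Expected Y on the recovered line for a given X."""
    return slope * x + intercept


# The page reports 54.4247 as the expected value at X=71.7.
difference = expected(71.7) - 54.4247
print("approximate slope:", round(slope, 4))
print("approximate intercept:", round(intercept, 3))
print("expected(71.7) differs from 54.4247 by less than 0.001:", abs(difference) < 1e-3)
```

Any two points on the line would do for this check; the two used here are simply the ones that were traced with the most digits showing.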
|
Using the original table and the information in Figure 19 we were able to determine the residual value for the one point where X=71.7. When the calculator produces the linear regression equation, via the LinReg(ax+b) command in Figure 6, it also produces a corresponding list of residual values called RESID. We want to look at the values that have been placed in that list. To do this we go to the LIST menu by using the keys. Here, in Figure 21, the calculator shows a menu of all the list names that exist on this calculator. We will use the key to look for, and eventually highlight, the list we want, RESID. |
|
We found the desired RESID. It is the highlighted name. Press to paste that name onto the main screen. |
|
Once the name RESID has been pasted here, press
to complete the command. This will
make a copy of RESID in the list L3.
We do this so that when we re-open the editor the residual values
will be shown without our having to change the lists that appear in the
editor.
To perform the command we pressed . The calculator displays the assigned values of the new list, L3, but we would have a hard time following that list were we to scroll across it. |
|
Instead we use to open the StatEditor. Now we see, in each row of the display, the associated X, Y, and residual values. We can scroll down and up on the list to see all 30 of the calculated residual values. |
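The three-column view in the editor (X, Y, residual) can be imitated in a few lines of Python. This is only a sketch on made-up data, not the 30 pairs from this page; the list name resid is chosen to echo the calculator's RESID list. Each residual is the observed Y minus the Y that the fitted line gives for the same X.

```python
# Hypothetical data for illustration only (NOT the 30 pairs from this page).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares slope and intercept for y = slope*x + intercept.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

# Residual = (observed) - (expected), one entry per data pair.
resid = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Print the rows side by side, like L1, L2, and L3 in the editor.
for x, y, e in zip(xs, ys, resid):
    print(f"{x:8.1f} {y:8.1f} {e:10.4f}")
```

For a least-squares line with an intercept, the residuals add up to zero apart from rounding error, which is a quick sanity check on a list like this.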
|
In Figure 25 we have scrolled down, and then actually back up, to
display and highlight the 12th
row of values, the
one that contains the X=71.7 and Y=49 pair.
The final column shows that the associated residual
is –5.425, a more rounded
version of the –5.4247 we
computed in Figure 19. Which value should we use? That depends upon the accuracy of the original measures and the demands of our assignment. In this case, looking back at the original data, the Y values are only given to one decimal place. We really should not give an expected value that is derived from the table values any more accuracy than we had in the table. Therefore, the most appropriate expression of the expected value when X=71.7 would be Y=54.4. Had we used that expected value in determining our residual value, we would have the residual value be –5.4, and we would get that answer from just rounding either of the two values we have found so far. Having commented on the appropriate rounding for the residual values, it is clear that the calculator has no problem giving answers that imply an accuracy far more impressive than deserved. |
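The rounding argument can be checked numerically. The short Python sketch below uses the values quoted on this page (observed 49, expected 54.4247, editor display –5.425) and rounds to one decimal place to match the table; the variable names are chosen here for illustration.

```python
observed = 49.0
expected_full = 54.4247   # expected value as traced on the graph

# Round the expected value to one decimal place, matching the table's Y values.
expected_1dp = round(expected_full, 1)

# Residual computed from the rounded expected value. The outer round() removes
# the tiny floating-point error left over from the subtraction.
residual_1dp = round(observed - expected_1dp, 1)

# Rounding either of the two residuals found earlier gives the same answer.
from_trace = round(-5.4247, 1)
from_editor = round(-5.425, 1)

print(expected_1dp, residual_1dp, from_trace, from_editor)
```

All three routes to the residual agree once everything is held to one decimal place.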
|
We saw that more extensive, undeserved accuracy back in Figure 23 when the calculator showed the start of the display of the values assigned to L3, the same values that were stored in the list RESID. We can look at a specific item in that list. We could look at either L3(12) or RESID(12) to see the value stored in the 12th position of the list. In Figure 26 we formed and then executed the latter command to see the full residual value. |
|
There is one more thing to examine here, namely, we want to get a scatter plot of
the X values and the residual values.
To do this we use to return to the STAT PLOT menu. We want to change the settings of Plot1. Plot1 is the highlighted plot so we just press to move to that screen. |
|
In Figure 28 we have the Plot1 settings and we have moved the highlight down to the Ylist: field. The highlight appears as the character indicating that the calculator is willing to accept letters here. |
|
We enter RESID as . [An alternative method to get the RESID name here would be to open the LIST menu, scroll down to find the RESID name there, and press the ENTER key at that point. Either method will work.] Once we have entered the name, we move to the next screen by pressing the key. |
|
The change has been made. We are almost ready to get the graph. Before we rush to do that we realize that although the X values in the graph will be the same as those that we have been using above, the Y values are now the residual values, and those are quite different from the Y values in L2 from the original table. Therefore, we should do a ZoomStat again to have the calculator reset the WINDOW values appropriately. |
|
opens the ZOOM menu. chooses and performs the 9th option, ZoomStat, even though it is not on the display here. |
|
The graph of the X values and the associated residual values is shown in Figure 32. We note that the residuals are scattered: there is no real clumping of values, there is no pattern in these values, and there are no really extreme values. This is good. We can feel fairly confident of the appropriateness of having done a linear regression on the original data. |
©Roger M. Palay
Saline, MI 48176
September, 2012