Correlation and Linear Regression, Take 2

This page presents the keystrokes and the screen images for generating a Correlation and Linear Regression on a TI-83 (TI-83 Plus, or TI-84 Plus) calculator. After presenting the problem, we will do the analysis. This page assumes some familiarity with the calculator.

For both correlation and regression we assume that we are dealing with data that is given as pairs of values, an X value and a corresponding Y value. The table below identifies such pairs of data, and it gives an index value to each pair. With the assigned index value we can talk about the fifth pair and see that the fifth pair is X=45.5 and Y=30.6. The following table gives the pairs of values that we will use for this page.

Figure 1 As usual, we run the GNRND4 program, giving it the correct key values, in order to generate the data on our calculator. The X list will be stored in L₁. The Y list will be stored in L₂.

Figure 2 The program displays the two lists of values and terminates. We go to the Stat menu, via , and then press to paste the SetUpEditor command here. Then press to have the calculator perform the command.

Figure 3 With this much data we might feel more comfortable knowing that the numbers in our calculator lists match the values in the table above. Use to open the StatEditor as shown in Figure 3. We compare these values to the first seven pairs of X and Y values in the table. Then we can move down the list to see and check the remaining values.

Figure 4 Figure 4 shows the final pairs of values. These correspond to the final values in the table above.

Figure 5 Now confident that we have the right values, we move to the STAT menu, and then to the CALC sub-menu using . There we use the key to move the highlight down to the fourth item, LinReg(ax+b). At that point we can press to paste that command onto our main screen.

Figure 6 Once pasted here, we press to get the calculator to perform the task.

Figure 7 Here is the result. The equation of the line that best fits the data is given as
Y = 0.6336174353*X + 8.994330155
Furthermore, the correlation coefficient for this relation is
r = 0.8027371985
If the last two lines of output are not showing on your calculator, it is because you do not have Diagnostics turned On.
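For readers who want to check the calculator's output away from the calculator, the following is a minimal Python sketch of the standard least-squares computation that yields a slope a, an intercept b, and a correlation coefficient r. The function name `linreg` and the five sample pairs are illustrative assumptions; the sample pairs are not the table used on this page, and the sketch is not a description of how the calculator works internally.

```python
from math import sqrt

def linreg(xs, ys):
    """Return (a, b, r) for the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross products
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                 # slope
    b = mean_y - a * mean_x       # intercept
    r = sxy / sqrt(sxx * syy)     # correlation coefficient
    return a, b, r

# Hypothetical pairs lying exactly on y = 2x + 1 (NOT the data from this page)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
a, b, r = linreg(xs, ys)
print(a, b, r)
```

Entering the 30 pairs from the table in place of the sample lists should give values that agree with the calculator's a, b, and r, up to rounding.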

Figure 8 We want to have the calculator graph the linear regression equation. To do this we use to move to the Y= screen. Once we are there, we could just type in the equation. However, we might as well use the features of the calculator and let it paste the equation here. We just need to find where the calculator gives us access to that equation.

Figure 9 Use to open the VARS menu, and then use to move the highlight to the Statistics option. Press to open the Statistics sub-menu.

Figure 10 At the Statistics sub-menu, use to move to the EQ sub-menu.

Figure 11 In Figure 11 we see that the first option, RegEQ, is the one we want. Press to use that option.

Figure 12 The calculator then pastes the entire regression equation into the Y= screen.

Figure 13 Just to be sure things are set up the way we want them, we press to see the STAT PLOT menu. Here we note that Plot1 is On, that it is set to do a scatter plot, and that the scatter plot will be based on the values in L₁ and L₂. This is just what we want.
If any of those settings were incorrect then we would have had to open the Plot1 menu and make the required changes.

Figure 14 Now that everything is set we move to the ZOOM menu via the key. From long experience we know that we want to use option 9, ZoomStat, even though that option is not visible on the current screen. Nonetheless, we press and the calculator continues with the ZoomStat option.
This causes the calculator to look at the data for the plot, to set the WINDOW values appropriately, and then to move to do the actual graph.

Figure 15 In Figure 15 we see the plot of the 30 pairs of values from the original table along with the graph of the linear regression equation.
Unlike the example in the earlier web page on this topic, the data points plotted here are not quite so close to the graphed line. We should have expected this based on the much lower value for the correlation coefficient in this example. Remember that the closer the correlation coefficient is to 1 or –1, the smaller the spread of plotted points from the regression line. Our correlation coefficient in this example, r = 0.8027371985, is good, but not great.

Figure 16 We are interested in using the regression equation to find expected values. One way to do this is to move into TRACE mode via the key.
Note that in TRACE mode the calculator first moves to trace the plotted points. Furthermore, the calculator "traces" those points in the order in which they were given in the original data lists. Thus, the first point in the list, X=88.1 and Y=73.5, is the highlighted point. It is hard to see in a still image such as Figure 16, but the highlight is on that point, the one in the upper right corner. A comparison to the image in Figure 15 will demonstrate the slight difference present when this image was captured.

Figure 17 Press to move the trace to the graph of the regression equation. The calculator chooses a point on the line midway across the screen as its initial trace point on the line. Although we did not ask for it, this shows that when X=63.05 the expected value of Y is 48.943909.

Figure 18 We can force the calculator to highlight a specific point on the line by giving the calculator the desired X value. In Figure 18 we have entered 71.7. Once we press the key the calculator will move to that X value, position the blinking cursor on the graph of the regression equation, and give us the Y value at the bottom of the screen.

Figure 19 Figure 19 shows us that 54.4247 is the expected value when X=71.7.
Of course, looking at the original data values in the table above, we know that X=71.7 is one of those values. It is in index position number 12 in the list. The corresponding Y value is 49. On the graph this is the plotted point directly below the highlight. Thus, the observed value of Y at X=71.7 is 49 and the expected value of Y at X=71.7 is 54.4247. The difference, (observed) – (expected), is called the residual value at X=71.7. Using our numbers, the residual value at X=71.7 is 49 – 54.4247 = –5.4247.
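The same arithmetic can be sketched in a few lines of Python. The coefficients are the ones reported in Figure 7; the function names `expected` and `residual` are illustrative assumptions, not calculator commands.

```python
# Coefficients reported by LinReg(ax+b) in Figure 7
A = 0.6336174353   # slope
B = 8.994330155    # intercept

def expected(x):
    """Expected Y value on the regression line at the given X."""
    return A * x + B

def residual(x, observed_y):
    """Residual = (observed) - (expected)."""
    return observed_y - expected(x)

# The 12th pair in the table: X = 71.7, observed Y = 49
print(round(expected(71.7), 4))
print(round(residual(71.7, 49), 4))
```

The two printed values should agree with the 54.4247 and –5.4247 found above.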

Figure 20 In Figure 20 we have entered a new X value, 78, and pressed to get the calculator to "trace" that value. The image in Figure 20 has been altered to place an arrow showing the location of the highlighted point. From this screen we know that 58.41649 is the expected value for X=78. There is no observed value for X=78 because none of the points in the original table had an X value of 78.
The safest way to leave Figure 20 is to use the key sequence.

Figure 21 Using the original table and the information on Figure 19 we were able to determine the residual value for the one point where X=71.7. When the calculator produces the linear regression equation, via the LinReg(ax+b) command in Figure 6, it also produces a corresponding list of residual values called RESID. We want to look at the values that have been placed in that list. To do this we go to the LIST menu by using the keys. Here, in Figure 21, the calculator shows a menu of all the list names that exist on this calculator. We will use the key to go looking for and eventually highlighting the list we want, RESID.

Figure 22 We found the desired RESID. It is the highlighted name. Press to paste that name onto the main screen.

Figure 23 Once the name RESID has been pasted here, press to complete the command. This will make a copy of RESID in the list L₃. We do this so that when we re-open the editor the residual values will be shown without our having to change the lists that appear in the editor.
To perform the command we pressed . The calculator displays the assigned values of the new list, L₃, but we would have a hard time following that list were we to scroll across it.

Figure 24 Instead we use to open the StatEditor. Now we see, in each row of the display, the associated X, Y, and residual values. We can scroll down and up on the list to see all 30 of the calculated residual values.

Figure 25 In Figure 25 we have scrolled down, and then actually back up, to display and highlight the 12th row of values, the one that contains the X=71.7 and Y=49 pair. The final column shows that the associated residual is –5.425, a more rounded version of the –5.4247 we computed in Figure 19.
Which value should we use? That depends upon the accuracy of the original measures and the demands of our assignment. In this case, looking back at the original data, the Y values are only given to one decimal place. We really should not give an expected value that is derived from the table values any more accuracy than we had in the table. Therefore, the most appropriate expression of the expected value when X=71.7 would be Y=54.4. Had we used that expected value in determining our residual value, we would have the residual value be –5.4, and we would get that answer from just rounding either of the two values we have found so far.
Having commented on the appropriate rounding for the residual values, it is clear that the calculator has no problem giving answers that imply an accuracy far more impressive than deserved.
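The rounding argument can be checked with a short Python sketch. The variable names are illustrative assumptions; the numbers are the ones quoted above.

```python
observed_y = 49.0
expected_y = 54.4247        # expected value read from the TRACE screen
editor_residual = -5.425    # residual as displayed in the StatEditor

# Round the expected value to one decimal place, matching the table's accuracy
rounded_expected = round(expected_y, 1)

# Residual computed from the rounded expected value
residual_from_rounded = round(observed_y - rounded_expected, 1)

# Rounding either of the two more precise residuals to one decimal place
residual_a = round(observed_y - expected_y, 1)
residual_b = round(editor_residual, 1)

print(rounded_expected, residual_from_rounded, residual_a, residual_b)
```

All three residual computations should land on the same one-decimal value, which is the point of the paragraph above.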

Figure 26 We saw that more extensive, undeserved accuracy back in Figure 23 when the calculator showed the start of the display of the values assigned to L₃, the same values that were stored in the list RESID. We can look at a specific item in that list. We could look at either L₃(12) or RESID(12) to see the value stored in the 12th position of the list. In Figure 26 we formed and then executed the latter command to see the full residual value.

Figure 27 There is one more thing to examine here, namely, we want to get a scatter plot of the X values and the residual values.
To do this we use to return to the STAT PLOT menu. We want to change the settings of Plot1. Plot1 is the highlighted plot so we just press to move to that screen.

Figure 28 In Figure 28 we have the Plot1 settings and we have moved the highlight down to the Ylist: field. The highlight appears as the character indicating that the calculator is willing to accept letters here.

Figure 29 We enter RESID as . [An alternative method to get the RESID name here would be to open the LIST menu, scroll down to find the RESID name there, and press the ENTER key at that point. Either method will work.] Once we have entered the name, we move to the next screen by pressing the key.

Figure 30 The change has been made. We are almost ready to get the graph. Before we rush to do that we realize that although the X values in the graph will be the same as those that we have been using above, the Y values are now the residual values, and those are quite different from the Y values in L₂ from the original table. Therefore, we should do a ZoomStat again to have the calculator reset the WINDOW values appropriately.

Figure 31 opens the ZOOM menu. chooses and performs the 9th option, ZoomStat, even though it is not on the display here.

Figure 32 The graph of the X values and the associated residual values is shown in Figure 32. We note that the residuals are scattered, there is no real clumping of values, there is no pattern of these values, and there are no really extreme values. This is good. We can feel fairly confident of the appropriateness of having done a linear regression on the original data.
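A rough numeric companion to the residual plot can be sketched in Python. The sample pairs below are hypothetical, not the table from this page, and the helper names `fit_line` and `residuals` are assumptions for illustration. The sketch fits a least-squares line, builds the list of residuals (the analogue of RESID), and reports their sum and the largest absolute residual. A sum near zero is expected for any least-squares line with an intercept; judging scatter and pattern still calls for looking at the plot.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def residuals(xs, ys):
    """Observed minus expected for each pair, in list order."""
    a, b = fit_line(xs, ys)
    return [y - (a * x + b) for x, y in zip(xs, ys)]

# Hypothetical pairs (NOT the data from this page)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

res = residuals(xs, ys)
print(sum(res))                      # should be very close to zero
print(max(abs(e) for e in res))      # size of the most extreme residual
```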
