Problem 11_20, pg.590

 

            The manager of a certain commuter rail system wants to determine which factors have a significant impact on the demand for rides in the large city served by the transportation network.  The variables that he wants to relate to the number of weekly riders on the city’s rail system are price per ride, population of the city, disposable income per capita, and the parking rate in the city parking lots.

            Some of these variables can be expected, by their nature, to be positively related to the number of weekly riders; others seem to be unrelated to weekly ridership.  First, for price per ride compared with the number of weekly riders, I expected a positive linear relationship and therefore a positive coefficient.  The second variable, population of the city, is one I would expect to have a negative relationship in one sense and a positive relationship in another; at first glance at the data, the expected sign of the coefficient seems to be negative.  In 1966, over half of the population rode the rail system on a weekly basis; as the years go by, the share falls to just under half and then rises to a little more than half.  This change could be attributable more to the increased popularity and affordability of cars than to any direct relationship between population and the number of weekly riders.  Next, the data on income suggest a negative linear relationship between the variables: as disposable income per capita rises, the number of weekly rail riders decreases, again implying that, with the introduction of affordable cars or other forms of transportation, fewer people rode the city rail system.  The last variable is the parking rate in city parking lots.  As the price of parking goes up, fewer citizens ride the rail, a pattern that does not seem directly related to parking rates and can again be attributed to other, more affordable modes of transportation.  I expected the sign of this coefficient to be positive.

 

Results of multiple regression for Weekly_Riders

Summary measures
Multiple R         0.9776
R-Square           0.9557
Adj R-Square       0.9477
StErr of Est      21.4867

ANOVA Table
Source            df           SS          MS         F  p-value
Explained          4  219260.4797  54815.1199  118.7301   0.0000
Unexplained       22   10156.9277    461.6785

Regression coefficients
                  Coefficient   Std Err  t-value  p-value  Lower limit  Upper limit
Constant             124.4269  516.7803   0.2408   0.8120    -947.3109    1196.1648
Price_per_Ride      -166.9641   52.0106  -3.2102   0.0040    -274.8275     -59.1006
Population             0.6210    0.2751   2.2570   0.0343       0.0504       1.1915
Income                -0.0472    0.0129  -3.6572   0.0014      -0.0740      -0.0204
Parking_Rate         194.6798   36.6143   5.3170   0.0000     118.7463     270.6133

Results of multiple regression for Weekly_Riders

Summary measures
Multiple R         0.9724
R-Square           0.9455
Adj R-Square       0.9384
StErr of Est      23.3208

ANOVA Table
Source            df           SS          MS         F  p-value
Explained          3  216908.6379  72302.8793  132.9440   0.0000
Unexplained       23   12508.7695    543.8595

Regression coefficients
                  Coefficient   Std Err  t-value  p-value  Lower limit  Upper limit
Constant            1289.6821   24.6300  52.3622   0.0000    1238.7311    1340.6331
Price_per_Ride      -203.7555   53.6060  -3.8010   0.0009    -314.6477     -92.8632
Income                -0.0691    0.0093  -7.4576   0.0000      -0.0882      -0.0499
Parking_Rate         212.0409   38.8528   5.4575   0.0000     131.6679     292.4140
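
The summary measures in the two regression outputs above can be cross-checked from the ANOVA rows alone. What follows is a minimal sketch in Python using the standard textbook formulas (mean square = SS / df, F = ratio of the two mean squares, R-Square = explained SS / total SS). The function name summary_from_anova and the labels "with Population" and "without Population" are my own and are not part of the software that produced the tables.

```python
import math

def summary_from_anova(ss_explained, df_explained, ss_unexplained, df_unexplained):
    """Recompute regression summary measures from ANOVA sums of squares."""
    ms_explained = ss_explained / df_explained
    ms_unexplained = ss_unexplained / df_unexplained
    ss_total = ss_explained + ss_unexplained
    df_total = df_explained + df_unexplained
    r_square = ss_explained / ss_total
    return {
        "F": ms_explained / ms_unexplained,
        "R-Square": r_square,
        "Multiple R": math.sqrt(r_square),
        # Adjusted R-Square scales the unexplained share by df_total / df_unexplained.
        "Adj R-Square": 1 - (1 - r_square) * df_total / df_unexplained,
        # The standard error of estimate is the square root of the unexplained mean square.
        "StErr of Est": math.sqrt(ms_unexplained),
    }

# Sums of squares and degrees of freedom copied from the two ANOVA tables.
with_population = summary_from_anova(219260.4797, 4, 10156.9277, 22)
without_population = summary_from_anova(216908.6379, 3, 12508.7695, 23)

for label, result in [("with Population", with_population),
                      ("without Population", without_population)]:
    print(label)
    for name, value in result.items():
        print(f"  {name:<13}{value:10.4f}")
```

Both outputs share the same total sum of squares (explained plus unexplained), as they should, because the dependent variable is the same in each regression.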

 


 

 

For the first regression table: Weekly_Riders = 124.43 - 166.96(Price_per_Ride) + 0.621(Population) - 0.047(Income) + 194.68(Parking_Rate)

 

For the second regression table: Weekly_Riders = 1289.68 - 203.76(Price_per_Ride) - 0.069(Income) + 212.04(Parking_Rate)
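
To show how the two fitted equations are used, here is a minimal Python sketch that evaluates each one for a single set of inputs. The coefficients are copied from the regression tables; the input values, the dictionary names, and the function name predict_weekly_riders are hypothetical and chosen only for illustration. The inputs are not rows from the data set, and they are assumed to be in the same units as the original data.

```python
# Coefficients copied from the two regression tables (as reported, four decimals).
MODEL_WITH_POPULATION = {
    "Constant": 124.4269,
    "Price_per_Ride": -166.9641,
    "Population": 0.6210,
    "Income": -0.0472,
    "Parking_Rate": 194.6798,
}
MODEL_WITHOUT_POPULATION = {
    "Constant": 1289.6821,
    "Price_per_Ride": -203.7555,
    "Income": -0.0691,
    "Parking_Rate": 212.0409,
}

def predict_weekly_riders(model, inputs):
    """Evaluate a fitted linear equation: constant plus coefficient times value."""
    return model["Constant"] + sum(
        coefficient * inputs[name]
        for name, coefficient in model.items()
        if name != "Constant"
    )

# HYPOTHETICAL inputs for illustration only; not taken from the data set.
example_inputs = {
    "Price_per_Ride": 0.25,
    "Population": 1700,
    "Income": 4000,
    "Parking_Rate": 0.75,
}

print(predict_weekly_riders(MODEL_WITH_POPULATION, example_inputs))
print(predict_weekly_riders(MODEL_WITHOUT_POPULATION, example_inputs))
```

Each coefficient is the change in predicted weekly riders for a one-unit change in that variable with the others held fixed, so the second model simply ignores the Population entry in the inputs.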

 

 

Table of correlations

                  Weekly_Riders  Price_per_Ride  Population    Income  Parking_Rate
Weekly_Riders             1.000
Price_per_Ride           -0.896           1.000
Population                0.946          -0.936       1.000
Income                   -0.934           0.944      -0.971     1.000
Parking_Rate             -0.825           0.955      -0.919     0.949         1.000

 

           

 

After getting the regression coefficients and the correlations, I graphed the relationships to see how they presented themselves.

 

           

 

 

 

After examining the variables in graph form, I found that the coefficients did not resemble what I thought the data suggested.  I ran the multiple regression both with and without the population variable.  With population included, population and parking rate are the only variables with positive coefficients; the other two, price per ride and income, have negative coefficients.  When I excluded the population variable (because I thought it would have no relationship with the dependent variable, number of weekly riders), price per ride and income still had negative coefficients.  The p-values of these coefficients are all less than 0.05, however, which suggests that the variables should still be included in the analysis.
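
The t-values and confidence limits behind this p-value argument can be reproduced from the coefficients and standard errors. The Python sketch below uses the first regression table and assumes the reported limits are 95% limits; the critical value 2.0739 (Student's t, 22 degrees of freedom, two-sided) comes from a standard t-table and is not stated in the output. The names ESTIMATES, t_value, and confidence_limits are my own.

```python
# (coefficient, standard error) pairs copied from the first regression table.
ESTIMATES = {
    "Constant": (124.4269, 516.7803),
    "Price_per_Ride": (-166.9641, 52.0106),
    "Population": (0.6210, 0.2751),
    "Income": (-0.0472, 0.0129),
    "Parking_Rate": (194.6798, 36.6143),
}

# ASSUMPTION: two-sided 95% critical value of Student's t with 22 degrees of
# freedom (the "Unexplained" df of the first model), from a standard t-table.
T_CRITICAL_22_DF = 2.0739

def t_value(coefficient, std_err):
    """t-value is the coefficient measured in units of its standard error."""
    return coefficient / std_err

def confidence_limits(coefficient, std_err, t_critical=T_CRITICAL_22_DF):
    """Lower and upper limits: coefficient minus/plus t_critical * std_err."""
    margin = t_critical * std_err
    return coefficient - margin, coefficient + margin

for name, (coefficient, std_err) in ESTIMATES.items():
    lower, upper = confidence_limits(coefficient, std_err)
    # A 95% interval that excludes zero corresponds to a p-value below 0.05.
    excludes_zero = lower > 0 or upper < 0
    print(f"{name:<15} t = {t_value(coefficient, std_err):8.4f}  "
          f"limits = ({lower:10.4f}, {upper:10.4f})  excludes zero: {excludes_zero}")
```

Because the inputs are rounded to four decimals, the recomputed values can differ slightly from the table in the last digits, most visibly for Income, whose standard error has only three significant figures.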

            The correlations between the number of weekly riders and price per ride, population, income, and parking rate are -0.896, 0.946, -0.934, and -0.825, respectively.  These all suggest a strong linear relationship between the dependent variable and each independent variable.
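
A pairwise correlation describes one variable on its own, while a regression coefficient describes that variable with the others held fixed, so the two need not share a sign. The Python sketch below compares the signs for the regression that includes population. The numbers are copied from the tables; the names sign and sign_disagreements are hypothetical helpers of my own.

```python
# Correlations with Weekly_Riders, copied from the table of correlations.
CORRELATION_WITH_RIDERS = {
    "Price_per_Ride": -0.896,
    "Population": 0.946,
    "Income": -0.934,
    "Parking_Rate": -0.825,
}
# Coefficients from the regression that includes Population.
COEFFICIENT_WITH_POPULATION = {
    "Price_per_Ride": -166.9641,
    "Population": 0.6210,
    "Income": -0.0472,
    "Parking_Rate": 194.6798,
}

def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)

def sign_disagreements(correlations, coefficients):
    """List the variables whose correlation and coefficient differ in sign."""
    return [
        name for name in coefficients
        if sign(correlations[name]) != sign(coefficients[name])
    ]

print(sign_disagreements(CORRELATION_WITH_RIDERS, COEFFICIENT_WITH_POPULATION))
```

Only Parking_Rate is flagged: its correlation with weekly riders is negative, but its coefficient is positive. One possible reason, which the analysis here does not test, is that the explanatory variables are strongly correlated with one another (for example, 0.955 between Price_per_Ride and Parking_Rate).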

            Although these data do not show what I expected from the variables, the estimated multiple regression models leave only about 4% and 5% of the total variation in the number of weekly riders unexplained (1 - R-Square = 0.0443 with population and 0.0545 without it), or about 5% and 6% if adjusted R-Square is used instead.