Wednesday, July 17, 2019
Sta302 – Assignment 2
1) From the scatter plot of Revenue vs. Circulation, it can be seen that the variance of the dependent variable, Revenue, increases as Circulation increases. This is a violation of the Gauss-Markov condition of constant variance in the error terms. Also, since a linear model seems appropriate, transformation of both the dependent and independent variables is necessary.

2) Fitting polynomial functions to the data may be better than fitting a straight-line model to the untransformed data because this allows for curvature and can fit the data more closely. However, this might not be sufficient because it does not account for nonconstant variance.

3) The natural log transformation of both variables provides the best model of the three. From the plot of the Regression Line for lnRevenue vs. lnCirculation, it can be seen that the points are relatively equally scattered about the regression line. Also, the nonconstant variance seems to be fixed. This is evident in the plot of the residuals vs. predicted values, as the points are randomly scattered about the center line.

The square root transformation of both variables improves linearity, as indicated in the plot of the Regression Line for sqrtRevenue vs. sqrtCirculation, but does not fix the problem of nonconstant variance. This can be clearly seen in the plot of the residuals vs. predicted values. The points are not randomly scattered around the center line, but are bunched up on the left side and fan outwards, indicating increasing variance.

The inverse transformation of both variables does not improve linearity, as curvature can be seen in the plot of the Regression Line for invRevenue vs. invCirculation. Although the nonconstant variance is somewhat improved over the square root transformation, as can be seen in the plot of the residuals vs. predicted values, the improvement is still insufficient.
Therefore, the natural log transformation of both variables seems to be the best model of the three choices.

4) The model used is ln(Revenue) = β0 + β1 ln(Circulation) + e. This implies that Revenue = exp(β0) × Circulation^β1 × exp(e). From this result, it can be seen that a k-fold change in circulation (in millions) results in a k^β1-fold change in revenue (in thousands of dollars). From the regression, the estimated slope is 0.5334. This means that if circulation changes by a factor of k, revenue will change by a factor of k^0.5334.

5) From SAS, a 95% prediction interval for the natural log of the revenue at a circulation of 1 million is (4.3005, 5.0202), with a predicted value of 4.6604. This translates to a prediction interval of ($73,736.65, $151,441.59) with a predicted revenue of $105,678.35.

6) Since the threshold for Cook's D is 4/(n-2), where n=70, the threshold is 0.059. There are five values with Cook's D greater than 0.059, which indicates that they are influential points. From the normal Q-Q plot of the residuals, these 5 points can be seen to be outliers at the ends of the graph. Therefore, they can greatly affect the fit of the model.

Also from the normal Q-Q plot, it can be seen that the residuals are not exactly normally distributed. The curvature at the ends of the plot indicates heavy tails in the distribution. By the Central Limit Theorem, confidence intervals and the estimates of β0, β1, and E(Y) are still valid. However, since a prediction interval deals with only a single point, the theorem does not apply to it. Due to the heavy tails in the distribution of the error terms, the prediction interval calculated in 5) may not be accurate.
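The arithmetic behind the figures quoted in 4) to 6) can be checked with a short script. This is only a sketch in Python, not the SAS code used for the assignment. The function and constant names are made up for illustration, and it assumes revenue is recorded in thousands of dollars, as stated in 4).

```python
import math

# Values quoted in the write-up (log scale, revenue in thousands of dollars).
SLOPE = 0.5334                     # estimated slope of lnRevenue on lnCirculation
LOG_PRED = 4.6604                  # predicted lnRevenue at circulation = 1 million
LOG_INTERVAL = (4.3005, 5.0202)    # 95% prediction interval for lnRevenue
N = 70                             # number of observations


def revenue_multiplier(k, slope=SLOPE):
    """Factor by which predicted revenue changes when circulation changes k-fold."""
    return k ** slope


def back_transform(log_value):
    """Convert a value on the ln(thousands of dollars) scale to dollars."""
    return math.exp(log_value) * 1000


def cooks_d_threshold(n):
    """Rule-of-thumb cutoff 4/(n-2) for Cook's D, as used in the write-up."""
    return 4 / (n - 2)


if __name__ == "__main__":
    print(f"doubling circulation multiplies revenue by {revenue_multiplier(2):.3f}")
    low, high = (back_transform(v) for v in LOG_INTERVAL)
    print(f"prediction interval in dollars: ({low:,.0f}, {high:,.0f})")
    print(f"predicted revenue in dollars: {back_transform(LOG_PRED):,.0f}")
    print(f"Cook's D threshold: {cooks_d_threshold(N):.3f}")
```

Exponentiating the endpoints is enough to move the interval back to the dollar scale because the exponential function is increasing, so the ordering of the endpoints is preserved. Because the inputs are rounded to four decimal places, the dollar figures will agree with those in 5) only up to rounding.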