This homework involves using the Python libraries we have learned in class so far (or others if you want) to train a model to predict the median house value for California districts, expressed in hundreds of thousands of dollars ($100,000). The model will take in 8 input features, described here.
Create a copy of this Colab notebook. While you should understand the entire notebook, your task is to fill in the get_regression_model() function only.
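As a rough illustration of what a filled-in function might look like, here is a minimal sketch. It assumes that get_regression_model() takes no arguments and returns an unfitted scikit-learn-style estimator with fit and predict methods; check the notebook for the actual signature, which is not stated here. The choice of gradient boosting and its settings are arbitrary starting points, and the synthetic 8-feature data is only a stand-in so the snippet runs without the notebook.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def get_regression_model():
    """Return an unfitted regressor.

    ASSUMPTION: the notebook expects an object with fit/predict;
    the real signature in the notebook may differ.
    """
    return make_pipeline(
        StandardScaler(),  # put the 8 features on a comparable scale
        GradientBoostingRegressor(n_estimators=200, max_depth=3, random_state=0),
    )


# Smoke test on synthetic data with 8 features (NOT the housing data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

model = get_regression_model()
model.fit(X, y)
preds = model.predict(X)
print(preds.shape)
```

Wrapping the scaler and the regressor in one pipeline keeps preprocessing inside the returned object, so whatever code calls fit and predict does not need to know about it.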
Submit a PDF on Laulima that describes your approach. Your report should include a summary of your best-performing submitted solution, a summary of the other machine learning models and methods that you attempted, your conjectures about why your best-performing solution outperformed the other models, and a summary of your hyperparameter tuning process. Include a publicly accessible link to your copy of the notebook in your submitted homework PDF.
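One possible way to organize the hyperparameter tuning that the report asks you to summarize is a small grid search. The sketch below uses scikit-learn's GridSearchCV; the random forest, the grid values, and the synthetic 8-feature data are illustrative assumptions, not part of the assignment.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the housing data (8 features), so this runs offline.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=300)

# Hypothetical grid; real ranges depend on the model you choose.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_mean_squared_error",  # scikit-learn maximizes, so MSE is negated
)
search.fit(X, y)

# Collect (mean validation MSE, params) pairs, best first, for the report.
results = sorted(
    zip(-search.cv_results_["mean_test_score"], search.cv_results_["params"]),
    key=lambda pair: pair[0],
)
for mse, params in results:
    print(f"MSE={mse:.4f}  {params}")
```

A table built from these pairs gives a compact record of which settings were tried and how each performed.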
Important Clarification on Extra Credit: We will determine extra credit through a dataset not explicitly provided to you in the notebook. If you want extra credit, make sure that your model performance is consistent regardless of the subset of the data used for the test set. Notice in the second code cell how we only use the first 50% of the housing data to generate the train and test sets. The final MSE used for extra credit will come from averaging over various subsets of the remaining 50% of the housing data.
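To get a feel for whether a model's performance is consistent across subsets, one option is k-fold cross-validation, which reports an MSE per held-out fold rather than a single number. This is a sketch under assumptions: the Ridge model and the synthetic 8-feature data are placeholders, and the course's actual extra-credit evaluation procedure may differ from this.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the first 50% of the housing data (8 features).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))
y = X @ rng.normal(size=8) + rng.normal(scale=0.1, size=400)

# Five shuffled folds: each fold serves once as the held-out subset.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
neg_mse = cross_val_score(
    Ridge(alpha=1.0), X, y, cv=cv, scoring="neg_mean_squared_error"
)
mse_per_fold = -neg_mse  # flip the sign back to ordinary MSE

print("MSE per fold:", np.round(mse_per_fold, 4))
print("mean:", mse_per_fold.mean(), "std:", mse_per_fold.std())
```

A small spread across folds relative to the mean suggests the model is not overly sensitive to which subset it is tested on.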
5 points: Coding Assignment
5 points: Written Report
Extra credit
The top-scoring submission (on a held-out test dataset not provided to you) will receive 5 extra credit points.
The 2nd place submission will receive 4 extra credit points.
The 3rd through 5th place submissions will receive 3 extra credit points.
The 6th through 10th place submissions will receive 2 extra credit points.
The 11th through 15th place submissions will receive 1 extra credit point.
Submit a PDF on Laulima with your writeup and link to your Colaboratory notebook. All of your code must be included in your submitted PDF. Make sure that your Colab notebook is publicly accessible.