major upload of (python) course material & solutions

2025-12-03 14:39:45 +01:00
parent 52552e20cb
commit e95a0b2ecc
39 changed files with 13598 additions and 0 deletions

@@ -0,0 +1,184 @@
{
"cells": [
{
"cell_type": "raw",
"id": "6cbef61b-0897-42bf-b456-c0a409b87c41",
"metadata": {},
"source": [
"\\vspace{-4cm}\n",
"\\begin{center}\n",
" \\LARGE{Machine Learning for Economics and Finance}\\\\\n",
" \\Large{Task 1: Logistic Regressions}\\\\[0.5cm]\n",
" \\Large{\\textbf{02\\_Default\\_data}}\\\\[1.0cm]\n",
" \\large{Ole Wilms}\\\\[0.5cm]\n",
" \\large{July 29, 2024}\\\\\n",
"\\end{center}"
]
},
{
"cell_type": "raw",
"id": "13be77f3-44f0-4983-b4cb-bd3e4b5dba8b",
"metadata": {},
"source": [
"\\setcounter{secnumdepth}{0}"
]
},
{
"cell_type": "markdown",
"id": "72f918a4-cdd4-4b46-a88f-f4b43c3c3a88",
"metadata": {
"tags": [],
"user_expressions": []
},
"source": [
"## Task 1: Logistic Regressions"
]
},
{
"cell_type": "markdown",
"id": "0b3f9fc6-db4f-47b0-9dfa-e41d9f85a5ba",
"metadata": {
"tags": [],
"user_expressions": []
},
"source": [
"1.1 Set the seed to $1$, then randomly split the data into $7000$ observations for training and $3000$ observations for testing. Call these two datasets *train_data* and *test_data*, respectively. (Hint: use the code to split the data from 01 Auto_data_2.R or Auto_data_2.Rmd)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "335aa198-5a94-4c5a-8ad8-67c78bcf71f5",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "116c466d-0627-43d6-adbe-a937ac846a28",
"metadata": {
"tags": [],
"user_expressions": []
},
"source": [
"1.2 Fit a logistic regression of *default* on *income* using the *train_data*. Analyze the significance of\n",
"the estimated coefficients."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2e38a201-7f2d-4999-beab-5739217a9318",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "43c6dade-5a22-476a-b3bf-bfd1b880038d",
"metadata": {
"tags": [],
"user_expressions": []
},
"source": [
"1.3 Compute the *out-of-sample accuracy* and *error rate* and compare them to the *in-sample statistics*. Do\n",
"you think this is a good model to predict *default*?"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "44028726-1eff-436f-bc47-04a6786ae3ad",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "c28971ef-8bee-462d-9612-88f1534bfcb5",
"metadata": {
"tags": [],
"user_expressions": []
},
"source": [
"1.4 Add *balance* as a second predictor alongside *income* and compute the *out-of-sample error rate* and *accuracy*. Do you\n",
"think this is a good model to predict *default*?"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3a7216df-adf5-4df0-9593-69c1a7649f64",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "f267ef66-1775-42a8-a1e9-45fda849f4d9",
"metadata": {
"tags": [],
"user_expressions": []
},
"source": [
"1.5 Compare the results for Task $1.4$ to a model with only *balance* as a predictor. Which model\n",
"would you choose?"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "28082bd5-8fe1-4160-aec0-1a92aebfa671",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "7ccad70f-5ef5-42c8-8c2e-22e76943d281",
"metadata": {
"tags": [],
"user_expressions": []
},
"source": [
"1.6 Take the model from Task $1.4$ but now re-estimate it using different *seeds* to draw your\n",
"*training* and *test data*. Does your *test error rate* change with the seed? What's going on here?"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9ab2f559-83b1-4a66-b1dc-8799b8301d85",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"date": " ",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.1"
},
"title": " ",
"toc-autonumbering": false,
"toc-showcode": false,
"toc-showmarkdowntxt": false,
"toc-showtags": false
},
"nbformat": 4,
"nbformat_minor": 5
}
