Commit 2f19575e authored by Heurich

Add first assignment and intros

parent 7e37802b
%% Cell type:markdown id:484313d8-8a62-44ed-801e-6e2d903194e7 tags:
# Assignment Sheet 1
%% Cell type:markdown id:2dbfe254-4f29-4fbe-a465-3cd86ac7dfb4 tags:
## Task 4 - Bivariate Descriptors
%% Cell type:markdown id:52435766-3f59-42f8-aea4-fbddc841d4e6 tags:
This notebook complements Task 4 using the pandas library.
The comment in each cell describes the small task you should complete there. :)
%% Cell type:code id:3c55209e-7c80-4e45-8ee7-1d49f9c0b1c9 tags:
``` python
import pandas as pd
```
%% Cell type:markdown id:80fcaacc-4a91-43f3-bd05-315769c3e8be tags:
### a) Heart Dataset
%% Cell type:code id:4822bf49-37e8-49c8-8bb0-658b7bca611b tags:
``` python
# Dataset: https://www.kaggle.com/ronitf/heart-disease-uci/download
df = pd.read_csv('your/path/to/heart.csv')
```
%% Cell type:code id:d17d2f78-21fe-495f-8501-c6dc7d5bdec0 tags:
``` python
# Take the first 8 rows of the dataframe
small_df = df.head(8)
small_df
```
%% Cell type:code id:f8d417c2-426d-4ce4-ac83-ac94b765107c tags:
``` python
# Compute the standard deviation (std) of each feature of the partial dataframe
```
%% Cell type:code id:c0cc68a9-f0d4-4f53-97f7-cdff7358fbfd tags:
``` python
# Compute the mean of each feature of the partial dataframe
```
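As a hedged illustration of the two cells above, the sketch below computes column-wise means and standard deviations on a small made-up frame. The `toy` dataframe and its values are assumptions for demonstration only and are not taken from heart.csv. Note that pandas uses the sample standard deviation (`ddof=1`) by default, which may differ from a population formula used in a hand calculation.

```python
import pandas as pd

# Toy data, made up for illustration (not taken from heart.csv)
toy = pd.DataFrame({"age": [50, 60, 70], "chol": [200, 220, 240]})

means = toy.mean()          # column-wise arithmetic mean
stds = toy.std()            # sample std (ddof=1), the pandas default
pop_stds = toy.std(ddof=0)  # population std, divides by n instead of n - 1

print(means)
print(stds)
print(pop_stds)
```

If your hand-calculated values on the exercise sheet do not match the pandas output, checking which `ddof` convention was used is a reasonable first step.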
%% Cell type:code id:477dbb53-3570-4287-8367-98d711ad280e tags:
``` python
# Calculate the pairwise correlation between the features of the dataframe
```
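One possible way to approach the correlation cell is `DataFrame.corr`, which returns a matrix of pairwise coefficients (Pearson by default). The sketch below uses a made-up `toy` frame, not the heart dataset, with columns chosen so the result is easy to verify by eye.

```python
import pandas as pd

# Toy data, made up for illustration: y rises with x, z falls with x
toy = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2, 4, 6, 8], "z": [4, 3, 2, 1]})

# Pairwise Pearson correlation between all numeric columns
corr = toy.corr()

print(corr)
```

The diagonal is always 1, and the matrix is symmetric, so only the entries above (or below) the diagonal carry new information.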
%% Cell type:code id:2e90cd02-3152-435b-98ee-05bc18c6c715 tags:
``` python
```
%% Cell type:markdown id:c5d4157a-f01d-49e2-bbc5-4a9c47572b6c tags:
### b) Titanic Dataset (train.csv)
%% Cell type:code id:974508fe-0a81-444d-b296-2ce613d1df3c tags:
``` python
# Dataset: https://www.kaggle.com/c/titanic/data?select=train.csv
df = pd.read_csv('your/path/to/titanic/train.csv')
```
%% Cell type:code id:76a8b05d-9afc-4bef-b31d-f94853810391 tags:
``` python
# Take the first 16 rows of the dataframe
```
%% Cell type:code id:9f069ec6-900b-4004-adcb-a2da8b95ff4e tags:
``` python
# Show number of instances for male/female passengers
```
%% Cell type:code id:0d9082fc-9da0-4e5b-8f0b-b681d46b7813 tags:
``` python
# Show number of instances for each passenger class
```
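For the two counting cells above, `Series.value_counts` is one option. The sketch below assumes a made-up `toy` frame that only borrows the column names `Sex` and `Pclass` from the Titanic data; the rows themselves are invented.

```python
import pandas as pd

# Toy data, made up for illustration (only the column names match train.csv)
toy = pd.DataFrame({
    "Sex": ["male", "female", "male", "male"],
    "Pclass": [3, 1, 3, 2],
})

# Number of rows per category, most frequent first
sex_counts = toy["Sex"].value_counts()

# Number of rows per class, ordered by class label instead of frequency
class_counts = toy["Pclass"].value_counts().sort_index()

print(sex_counts)
print(class_counts)
```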
%% Cell type:code id:c7a1ea7e-c708-417f-9d8f-e19448b0880b tags:
``` python
# Filter and print every contingency table entry needed for the Chi-Square calculation
# The features to check are 'Sex' and 'Pclass', as in Task 4 on the exercise sheet
```
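A minimal sketch of how contingency table entries could be obtained, assuming a made-up `toy` frame with the `Sex` and `Pclass` columns. It shows both a single entry via a boolean filter and the whole table via `pd.crosstab`; which of the two the exercise intends is not stated, so treat this as one possible approach.

```python
import pandas as pd

# Toy data, made up for illustration (only the column names match train.csv)
toy = pd.DataFrame({
    "Sex": ["male", "female", "male", "male"],
    "Pclass": [3, 1, 3, 2],
})

# One entry: count rows that satisfy both conditions
male_c3 = len(toy[(toy["Sex"] == "male") & (toy["Pclass"] == 3)])

# Whole table at once: rows are classes, columns are sexes
table = pd.crosstab(toy["Pclass"], toy["Sex"])

print("male and class 3:", male_c3)
print(table)
```

Combinations that never occur in the data show up as 0 in the crosstab, which is useful to know before dividing by expected values later.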
%% Cell type:code id:643731b9-01e7-40af-90e3-5dc6bd2415ae tags:
``` python
# Calculate and print your Chi-Square solution
## Expected values (replace each None placeholder with your own calculation)
### male and class1
m_c1 = None  # TODO
### male and class2
m_c2 = None  # TODO
### male and class3
m_c3 = None  # TODO
### female and class1
f_c1 = None  # TODO
### female and class2
f_c2 = None  # TODO
### female and class3
f_c3 = None  # TODO
# Rows are passenger classes 1-3, columns are male | female
print('Expected values of the contingency matrix')
print(m_c1,' | ', f_c1)
print('--'*8)
print(m_c2,' | ', f_c2)
print('--'*8)
print(m_c3,' | ', f_c3)
## chi-square calculation
chi_square = None  # TODO
print('--'*8)
print('--'*8)
print('X^2 = ', chi_square)
```
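To make the calculation in the cell above concrete, here is a hedged sketch on invented counts. Each expected value is the row total times the column total divided by the grand total, and the statistic sums (observed - expected)^2 / expected over all cells. The `observed` numbers are hypothetical and are not the counts from the first 16 rows of train.csv.

```python
# Observed counts, made up for illustration: (sex, passenger class) -> count
observed = {
    ("male", 1): 10, ("male", 2): 20, ("male", 3): 30,
    ("female", 1): 20, ("female", 2): 10, ("female", 3): 10,
}

n = sum(observed.values())  # grand total

# Marginal totals per sex (rows) and per class (columns)
row_totals = {}
col_totals = {}
for (sex, pclass), count in observed.items():
    row_totals[sex] = row_totals.get(sex, 0) + count
    col_totals[pclass] = col_totals.get(pclass, 0) + count

# Expected count under independence: row total * column total / n
expected = {
    (sex, pclass): row_totals[sex] * col_totals[pclass] / n
    for (sex, pclass) in observed
}

# Chi-square statistic: sum of squared deviations scaled by the expected count
chi_square = sum(
    (observed[key] - expected[key]) ** 2 / expected[key] for key in observed
)

print("Expected values:", expected)
print("X^2 =", chi_square)
```

The same pattern applies to the real data once the six observed counts from the contingency table are filled in.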
%% Cell type:code id:794dc6f4-7b1e-44bf-b704-69291dc173b6 tags:
``` python
```