Commit 2f19575e authored by Heurich

Add first assignment and intros

parent 7e37802b
%% Cell type:markdown id:484313d8-8a62-44ed-801e-6e2d903194e7 tags:
# Assignment Sheet 1
%% Cell type:markdown id:2dbfe254-4f29-4fbe-a465-3cd86ac7dfb4 tags:
## Task 4 - Bivariate Descriptors
%% Cell type:markdown id:52435766-3f59-42f8-aea4-fbddc841d4e6 tags:
This notebook complements task 4 using the pandas library.
The comment in each code cell describes the small task you should complete there. :)
%% Cell type:code id:3c55209e-7c80-4e45-8ee7-1d49f9c0b1c9 tags:
``` python
import pandas as pd
```
%% Cell type:markdown id:80fcaacc-4a91-43f3-bd05-315769c3e8be tags:
### a) Heart Dataset
%% Cell type:code id:4822bf49-37e8-49c8-8bb0-658b7bca611b tags:
``` python
# Dataset: https://www.kaggle.com/ronitf/heart-disease-uci/download
df = pd.read_csv('your/path/to/heart.csv')
```
%% Cell type:code id:d17d2f78-21fe-495f-8501-c6dc7d5bdec0 tags:
``` python
# Take the first 8 rows of the dataframe
small_df = df.head(8)
small_df
```
%% Cell type:code id:f8d417c2-426d-4ce4-ac83-ac94b765107c tags:
``` python
# Check std of each feature of the partial dataframe
```
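One possible way to approach the standard deviation step is sketched below. It does not use the real `heart.csv`; `toy_df` and its values are made up for illustration, and the column names `age` and `chol` are assumptions. The point to note is that pandas computes the sample standard deviation (`ddof=1`) by default, so the result can differ from a hand calculation that divides by `n`.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of the dataset (values are made up)
toy_df = pd.DataFrame({"age": [63, 37, 41, 56], "chol": [233, 250, 204, 236]})

# Sample standard deviation per column (pandas default, divides by n - 1)
sample_std = toy_df.std()

# Population standard deviation per column (divides by n)
population_std = toy_df.std(ddof=0)

print(sample_std)
print(population_std)
```

Which of the two variants matches the exercise sheet depends on the definition used there, so it is worth checking before comparing numbers.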
%% Cell type:code id:c0cc68a9-f0d4-4f53-97f7-cdff7358fbfd tags:
``` python
# Check mean of each feature of the partial dataframe
```
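A minimal sketch of the mean step, again on a made-up frame rather than the real data (`toy_df`, its values, and the column names are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical stand-in for a few rows of the dataset (values are made up)
toy_df = pd.DataFrame({"age": [63, 37, 41, 56], "chol": [233, 250, 204, 236]})

# Arithmetic mean of every numeric column, returned as a Series
column_means = toy_df.mean()

print(column_means)
```

`DataFrame.describe()` would report the mean and the standard deviation together, if a single summary table is preferred.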
%% Cell type:code id:477dbb53-3570-4287-8367-98d711ad280e tags:
``` python
# Calculate the pairwise correlation between the features of the dataframe
```
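The correlation step could look roughly like the sketch below. It assumes that Pearson correlation is the intended measure, which is the default of `DataFrame.corr()`; the frame `toy_df`, its values, and the column names are made up. The manual calculation is included only to show what the matrix entries mean.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of the dataset (values are made up)
toy_df = pd.DataFrame({"age": [63, 37, 41, 56], "thalach": [150, 187, 172, 178]})

# Pairwise Pearson correlation between all numeric columns
corr_matrix = toy_df.corr()

# The same coefficient by hand: covariance divided by the product of the stds
manual_r = toy_df["age"].cov(toy_df["thalach"]) / (
    toy_df["age"].std() * toy_df["thalach"].std()
)

print(corr_matrix)
print(manual_r)
```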
%% Cell type:code id:2e90cd02-3152-435b-98ee-05bc18c6c715 tags:
``` python
```
%% Cell type:markdown id:c5d4157a-f01d-49e2-bbc5-4a9c47572b6c tags:
### b) Titanic Dataset (train.csv)
%% Cell type:code id:974508fe-0a81-444d-b296-2ce613d1df3c tags:
``` python
# Dataset: https://www.kaggle.com/c/titanic/data?select=train.csv
df = pd.read_csv('your/path/to/titanic/train.csv')
```
%% Cell type:code id:76a8b05d-9afc-4bef-b31d-f94853810391 tags:
``` python
# Take the first 16 rows of the dataframe
```
%% Cell type:code id:9f069ec6-900b-4004-adcb-a2da8b95ff4e tags:
``` python
# Show number of instances for male/female passengers
```
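Counting instances per category can be sketched with `Series.value_counts()`. The frame below is a made-up stand-in, not the Titanic data; it only assumes that the real file has the `Sex` and `Pclass` columns named in this notebook.

```python
import pandas as pd

# Hypothetical stand-in for a few passengers (values are made up)
toy_df = pd.DataFrame(
    {
        "Sex": ["male", "female", "female", "male", "male"],
        "Pclass": [3, 1, 3, 1, 3],
    }
)

# Number of rows per distinct value of 'Sex'
sex_counts = toy_df["Sex"].value_counts()

# The same idea for 'Pclass', sorted by class label instead of by frequency
class_counts = toy_df["Pclass"].value_counts().sort_index()

print(sex_counts)
print(class_counts)
```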
%% Cell type:code id:0d9082fc-9da0-4e5b-8f0b-b681d46b7813 tags:
``` python
# Show number of instances for each passenger class
```
%% Cell type:code id:c7a1ea7e-c708-417f-9d8f-e19448b0880b tags:
``` python
# Filter and print every contingency table entry needed for the Chi-Square calculation
# The features to check are 'Sex' and 'Pclass', as in Task 4 on the exercise sheet
```
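Two ways to obtain the contingency table entries are sketched below on a made-up frame (the values are illustrative assumptions, only the column names `Sex` and `Pclass` come from this notebook). A boolean filter gives a single cell, while `pd.crosstab` builds the whole table at once.

```python
import pandas as pd

# Hypothetical stand-in for a few passengers (values are made up)
toy_df = pd.DataFrame(
    {
        "Sex": ["male", "female", "female", "male", "male"],
        "Pclass": [3, 1, 3, 1, 3],
    }
)

# One cell via filtering: male passengers in class 3
male_class3 = len(toy_df[(toy_df["Sex"] == "male") & (toy_df["Pclass"] == 3)])

# The full table, with row and column totals in the 'All' margins
contingency = pd.crosstab(toy_df["Pclass"], toy_df["Sex"], margins=True)

print(male_class3)
print(contingency)
```

The filter approach matches the wording of the task comment, and the crosstab is a convenient way to double-check every cell.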
%% Cell type:code id:643731b9-01e7-40af-90e3-5dc6bd2415ae tags:
``` python
# Calculate and print your Chi-Square solution
## Expected values (replace each None with your result)
### male and class1
m_c1 = None  # TODO
### male and class2
m_c2 = None  # TODO
### male and class3
m_c3 = None  # TODO
### female and class1
f_c1 = None  # TODO
### female and class2
f_c2 = None  # TODO
### female and class3
f_c3 = None  # TODO
print('Expected values of the contingency matrix')
print(m_c1,' | ', f_c1)
print('--'*8)
print(m_c2,' | ', f_c2)
print('--'*8)
print(m_c3,' | ', f_c3)
## chi-square calculation
chi_square = None  # TODO
print('--'*8)
print('--'*8)
print('X^2 = ', chi_square)
```
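The expected values and the statistic can also be computed in vectorised form. The sketch below uses made-up observed counts, not the counts from the Titanic subset, and assumes the usual definitions: each expected value is row total times column total divided by the grand total, and X^2 is the sum of (observed - expected)^2 / expected over all cells.

```python
import numpy as np
import pandas as pd

# Hypothetical observed contingency table (counts are made up)
observed = pd.DataFrame(
    {"male": [2, 1, 5], "female": [2, 2, 4]},
    index=[1, 2, 3],  # passenger classes
)

row_totals = observed.sum(axis=1)  # total per class
col_totals = observed.sum(axis=0)  # total per sex
n = observed.to_numpy().sum()      # grand total

# Expected count per cell: row total * column total / grand total
expected = pd.DataFrame(
    np.outer(row_totals, col_totals) / n,
    index=observed.index,
    columns=observed.columns,
)

# Chi-square statistic: sum over all cells of (O - E)^2 / E
chi_square = ((observed - expected) ** 2 / expected).to_numpy().sum()

print(expected)
print("X^2 =", chi_square)
```

Comparing such a vectorised result against the cell-by-cell values entered by hand is one way to catch arithmetic slips.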
%% Cell type:code id:794dc6f4-7b1e-44bf-b704-69291dc173b6 tags:
``` python
```