Part 8 (5 points, coding task)
Do the following tasks in this part.
- Define a function called `my_train_test_split` that splits the whole dataset into the training component and the test/validation component.
  - The split is random.
  - Inputs
    - `X`: A DataFrame object of features of all sample data.
    - `y`: A Series object of labels of all sample data.
    - `test_size`: It takes a value between 0 and 1 that denotes the fraction of samples used for testing. That is, the number of samples used for testing is `int(total number of samples * test_size)`.
  - Outputs
    - `X_train`: It keeps the samples in `X` for training.
    - `X_test`: It keeps the samples in `X` for testing.
    - `y_train`: It keeps the samples in `y` for training.
    - `y_test`: It keeps the samples in `y` for testing.
- Call this function with inputs
  - `X = X`
  - `y = y`
  - `test_size = 0.2`
- Print object types and shapes of `X_train`, `X_test`, `y_train`, `y_test`.
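
One possible way to approach the three tasks is sketched below. It is a minimal illustration, not the expected solution: the `seed` parameter is a hypothetical addition for reproducibility that the task does not ask for, and the small `X` and `y` built here are toy stand-ins for the real data defined earlier in the assignment. The split shuffles row positions with NumPy and selects rows positionally with `iloc`, so `X` and `y` stay aligned.

```python
import numpy as np
import pandas as pd


def my_train_test_split(X, y, test_size, seed=None):
    """Randomly split X and y into training and test parts.

    `seed` is a hypothetical extra parameter (not in the task statement)
    that makes the shuffle reproducible.
    """
    n_samples = len(X)
    n_test = int(n_samples * test_size)  # test-set size as defined in the task

    # Shuffle row positions, then cut the shuffled order into two parts.
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    test_idx, train_idx = shuffled[:n_test], shuffled[n_test:]

    # Positional indexing keeps X and y aligned regardless of index labels.
    return X.iloc[train_idx], X.iloc[test_idx], y.iloc[train_idx], y.iloc[test_idx]


# Toy stand-ins (assumption): the real X and y come from earlier parts.
X = pd.DataFrame({"f1": range(10), "f2": range(10, 20)})
y = pd.Series(range(10), name="label")

X_train, X_test, y_train, y_test = my_train_test_split(X, y, test_size=0.2, seed=0)

# Print the object type and shape of each output.
for name, obj in [("X_train", X_train), ("X_test", X_test),
                  ("y_train", y_train), ("y_test", y_test)]:
    print(name, type(obj), obj.shape)
```

With the real data, drop the toy `X` and `y` and call the function on the existing objects. Omitting `seed` gives a different random split on each run.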