Derive the optimal linear predictor in the vector case.
We stated the optimal linear predictor for vector data (without proof). Here, let us derive the optimal linear predictor in the vector case. Suppose that $\{x_1, x_2, \ldots, x_n\}$ denote training data samples, where each $x_i \in \mathbb{R}^d$. Suppose $\{y_1, y_2, \ldots, y_n\}$ denote the corresponding (scalar) labels.

a. Show that the mean squared-error loss function for multivariate linear regression can be written in the following form:
$$\mathrm{MSE}(w) = \|y - Xw\|_2^2,$$
where $X$ is an $n \times (d+1)$ matrix whose first column is all ones. What are the dimensions of $w$ and $y$? What do the coordinates of $w$ represent?

b. Prove that the optimal linear regression weights are given by:
$$w = (X^T X)^{-1} X^T y.$$
What algebraic assumptions on $X$ did you make in order to derive the above closed form?
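As a sanity check on the closed form (not a substitute for the proof), the following sketch builds a design matrix with an all-ones first column from synthetic data and verifies that $(X^T X)^{-1} X^T y$ recovers the weights of a known linear model; all names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 3

# Raw features x_i in R^d, stacked as rows; labels generated from a
# known linear model so we can check recovery.
features = rng.normal(size=(n, d))
true_w = np.array([2.0, -1.0, 0.5, 3.0])  # bias followed by d slopes

# Design matrix X is n x (d+1); the all-ones first column carries the bias.
X = np.hstack([np.ones((n, 1)), features])
y = X @ true_w

# Closed-form solution; valid when X^T X is invertible,
# i.e. X has full column rank.
w_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(np.allclose(w_hat, true_w))  # True: the formula recovers the weights
```

Solving the normal equations with `np.linalg.solve` is numerically preferable to forming the explicit inverse `np.linalg.inv(X.T @ X)`.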