optimal linear predictor for vector data

the optimal linear predictor for vector data (without proof). Here, let us derive the optimal linear predictor in the vector case. Suppose that $\{x_1, x_2, \ldots, x_n\}$ denote training data samples, where each $x_i \in \mathbb{R}^d$. Suppose $\{y_1, y_2, \ldots, y_n\}$ denote the corresponding (scalar) labels.

a. Show that the mean squared-error loss function for multivariate linear regression can be written in the following form:

$$\mathrm{MSE}(w) = \|y - Xw\|_2^2$$

where $X$ is an $n \times (d + 1)$ matrix and where the first column of $X$ is all-ones. What is the dimension of $w$ and $y$? What do the coordinates of $w$ represent?

b. Theoretically prove that the optimal linear regression weights are given by:

$$w = (X^T X)^{-1} X^T y.$$

What algebraic assumptions on $X$ did you make in order to derive the above closed form?
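
A numerical sanity check can accompany the derivation in part b. The sketch below is not a proof and not part of the original problem: it assumes NumPy is available, uses synthetic data, and the helper names `design_matrix`, `mse`, and `normal_equation_weights` are hypothetical. It builds a design matrix with a leading all-ones column, solves the normal equations under the assumption that the matrix has full column rank, and compares the result with NumPy's least-squares solver.

```python
import numpy as np


def design_matrix(samples):
    """Prepend an all-ones column to an (n, d) sample array, giving (n, d + 1)."""
    samples = np.asarray(samples, dtype=float)
    ones = np.ones((samples.shape[0], 1))
    return np.hstack([ones, samples])


def mse(w, X, y):
    """Squared Euclidean norm of the residual y - Xw (no 1/n factor)."""
    residual = y - X @ w
    return float(residual @ residual)


def normal_equation_weights(X, y):
    """Solve (X^T X) w = X^T y. Assumes X has full column rank,
    so that X^T X is invertible."""
    return np.linalg.solve(X.T @ X, X.T @ y)


# Synthetic data (an assumption for illustration): n = 50 samples, d = 3.
rng = np.random.default_rng(0)
samples = rng.normal(size=(50, 3))
true_w = np.array([1.0, 2.0, -3.0, 0.5])  # intercept first, then one weight per feature
X = design_matrix(samples)
y = X @ true_w + 0.1 * rng.normal(size=50)

w_closed = normal_equation_weights(X, y)
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# At the minimizer the residual is orthogonal to every column of X.
gradient_check = X.T @ (y - X @ w_closed)
```

Here `w_closed` has d + 1 entries: the first multiplies the all-ones column and acts as the intercept, and the remaining d are the per-feature weights. Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly; if the columns of the design matrix are linearly dependent, the solve fails or becomes unreliable, which is exactly the assumption part b asks about.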