Machine Learning - Correlation Matrix Plot
Correlation indicates how changes in two variables relate to each other. In previous chapters, we discussed Pearson’s correlation coefficient and the importance of correlation. We can plot a correlation matrix to show which variables have a high or low correlation with one another.
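To make "high" and "low" correlation concrete, here is a minimal sketch on made-up data. The column names (x, strong, inverse, noise) and the random values are invented for illustration and have nothing to do with any real dataset. A coefficient near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates a weak one.

```python
import numpy as np
import pandas as pd

# Synthetic data, invented purely for illustration
rng = np.random.default_rng(0)
x = rng.normal(size=200)
toy = pd.DataFrame({
    "x": x,
    "strong": 2 * x + rng.normal(scale=0.1, size=200),  # closely follows x
    "inverse": -x + rng.normal(scale=0.1, size=200),    # moves opposite to x
    "noise": rng.normal(size=200),                      # generated independently of x
})

# DataFrame.corr() uses the Pearson method by default
toy_corr = toy.corr()
print(toy_corr.round(2))
```

In the printed matrix, the entry for x and strong should be close to 1, the entry for x and inverse close to -1, and the entry for x and noise close to 0.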
The following Python script generates and plots a correlation matrix for the Pima Indians Diabetes dataset. The matrix is computed with the corr() function of a Pandas DataFrame and plotted with pyplot.
from matplotlib import pyplot
from pandas import read_csv
import numpy

# Load the dataset; the CSV file has no header row, so column names are supplied
Path = r"C:\pima-indians-diabetes.csv"
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
data = read_csv(Path, names = names)

# Pairwise Pearson correlation between all columns
correlations = data.corr()

# Draw the matrix as a colour grid on a fixed scale from -1 to 1
fig = pyplot.figure()
ax = fig.add_subplot(111)
cax = ax.matshow(correlations, vmin=-1, vmax=1)
fig.colorbar(cax)

# Label both axes with the column names
ticks = numpy.arange(0,9,1)
ax.set_xticks(ticks)
ax.set_yticks(ticks)
ax.set_xticklabels(names)
ax.set_yticklabels(names)
pyplot.show()
In the plot produced by the script, we can see that the correlation matrix is symmetrical, i.e. the bottom left is the same as the top right. We can also see that each variable is perfectly positively correlated with itself, which shows as the diagonal line running from the top left to the bottom right.
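These two properties can also be checked numerically rather than by eye. The sketch below uses randomly generated stand-in data with hypothetical column names (a, b, c, d) so that it runs without the CSV file. The same checks should apply to the result of corr() on the diabetes data, assuming every column is numeric and not constant.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; the column names are placeholders, not dataset fields
rng = np.random.default_rng(1)
demo = pd.DataFrame(rng.normal(size=(100, 4)), columns=["a", "b", "c", "d"])
corr = demo.corr().to_numpy()

# Symmetry: the matrix equals its own transpose
is_symmetric = np.allclose(corr, corr.T)

# Diagonal: every variable has a correlation of 1 with itself
diagonal_is_one = np.allclose(np.diag(corr), 1.0)

# Off-diagonal entries can have either sign, so it is worth inspecting them
off_diagonal = corr[~np.eye(len(corr), dtype=bool)]
has_negative_pair = bool((off_diagonal < 0).any())

print("symmetric:", is_symmetric)
print("diagonal all ones:", diagonal_is_one)
print("any negative pair:", has_negative_pair)
```

Because the matrix is symmetrical, reading only one triangle of the plot (above or below the diagonal) is enough to see every pairwise correlation.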