Correlation Analysis
(Note: The code for this page is "Correlation_Analysis.py" in our submitted GitHub repository.)
As mentioned before, we used correlation to measure the dependency between two random variables. In our project, we chose four places to collect data:
1. Indoor (reading room) 2. Outdoor 3. Kitchen 4. Bathroom
We then picked three attributes related to air quality:
1. Dust concentration 2. Air quality index 3. Temperature
For each of the four places, we examined the dependency between every pair of these three attributes.
We applied Python's scipy.stats statistics library to these data points. The plots and the corresponding correlation coefficients are summarized below:
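As an illustration of what a pairwise correlation coefficient computes, here is a minimal sketch in plain Python. It is not the code from "Correlation_Analysis.py": the helper names `pearson_r` and `pairwise_correlations` are hypothetical, the sample values are made-up placeholders rather than our measurements, and it assumes the Pearson coefficient is the one reported (scipy.stats provides this as `pearsonr`).

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(readings):
    """Return {(name_a, name_b): r} for every pair of attributes."""
    return {
        (a, b): pearson_r(readings[a], readings[b])
        for a, b in combinations(readings, 2)
    }


# Made-up placeholder values for one location, NOT project measurements.
sample = {
    "dust": [1.0, 2.0, 3.0, 4.0],
    "air_quality": [2.0, 4.0, 6.0, 8.0],
    "temperature": [4.0, 3.0, 2.0, 1.0],
}

for pair, r in pairwise_correlations(sample).items():
    print(pair, round(r, 3))
```

Running the same function once per location gives three coefficients per place, one for each pair of attributes. A value near 0 means the points are scattered, while values near +1 or -1 indicate a strong linear relationship.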
For the data collected in the kitchen, it is difficult to find a strong dependency between any two of the three attributes. Although the correlation coefficient for dust and temperature is 0.126, this is not strong enough to conclude that the two variables are dependent. For the most part, the data points are scattered.
For the data collected outdoors, we noticed a negative correlation between temperature and air quality, but, as in the kitchen, the coefficient was not strong enough to support a dependency between these two attributes.
The bathroom showed something more interesting. The correlation between temperature and dust is nearly 0.25, which suggests some dependency between these two attributes there. This makes sense: during a shower the temperature rises and the air fills with steam, and the dust sensor registers water vapor as "dust". So when the hot water raises the temperature, the measured dust concentration also rises, making the two variables dependent in the bathroom.
For the data collected indoors, we found a clear dependency between air quality and dust concentration. This is somewhat hard to explain. One possibility is that when the window was closed, both the dust concentration and the air quality reading were low, and when the window was opened, dust from outside, carrying harmful gas, drifted into the room, so the two attributes changed together. In that case, the correlation coefficient would become larger, as we saw in the table.