COVID-19 Vaccination Data Analysis

The COVID-19 pandemic is one of the most critical health disasters to hit the world, and predicting the trend of COVID-19 vaccination has become a challenge. Many healthcare professionals, statisticians, and researchers are tracking the spread of the virus in different parts of the world using a variety of approaches. The growing number of vaccines and the scale of the vaccination process made me curious about current immunization programs and keen to find meaningful information in the data. After looking through multiple websites, I found a few datasets to work with.
The complete code is available on GitHub along with the dataset.
Data Collection
First, we import the libraries and the dataset. The dataset used for this project is named cowin-2.csv.
Required packages: pandas, matplotlib, seaborn
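The loading step could look roughly like the sketch below. The column names and the sample rows are assumptions made for illustration, since the actual layout of cowin-2.csv is not shown here; the helper name load_vaccination_data is also hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-in for cowin-2.csv. The real file's columns may differ,
# and these rows are invented purely so the example runs on its own.
SAMPLE_CSV = """State,District,Total Individuals Vaccinated,Male,Female,Transgender
State A,District 1,1200,700,495,5
State A,District 2,800,450,348,2
State B,District 3,500,260,239,1
"""


def load_vaccination_data(source):
    """Read a CSV (a file path or a file-like object) into a DataFrame."""
    return pd.read_csv(source)


# With the real dataset this would be: load_vaccination_data("cowin-2.csv")
df = load_vaccination_data(io.StringIO(SAMPLE_CSV))

print(df.shape)   # number of rows and columns
print(df.head())  # first few records
```

Calling df.info() and df.describe() afterwards gives a quick view of column types, missing values, and value ranges.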
Basic Analysis
Data Cleaning:
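The exact cleaning steps are not shown in this post, so the following is only a minimal sketch of common ones: trimming column names, dropping duplicate rows, discarding rows without a state, and filling missing counts with zero. The column names, the sample rows, and the clean helper are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Invented sample with typical problems: padded column name,
# a duplicated row, a missing state, and a missing count.
raw = pd.DataFrame({
    " State ": ["State A", "State A", "State B", None],
    "District": ["District 1", "District 1", "District 2", "Unknown"],
    "Total Individuals Vaccinated": [1200, 1200, np.nan, 50],
})


def clean(df):
    """Return a cleaned copy of the frame; the input is left untouched."""
    cleaned = df.copy()
    # Remove stray whitespace around column names.
    cleaned.columns = cleaned.columns.str.strip()
    # Drop rows that are exact duplicates.
    cleaned = cleaned.drop_duplicates()
    # Rows without a state cannot be grouped later, so drop them.
    cleaned = cleaned.dropna(subset=["State"])
    # Treat missing numeric counts as zero (an assumption, not a rule).
    numeric_cols = cleaned.select_dtypes(include="number").columns
    cleaned[numeric_cols] = cleaned[numeric_cols].fillna(0)
    return cleaned.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

Whether missing counts should become zero or be dropped depends on how the source reports them, so that line is worth revisiting against the real data.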
Data Visualization (Note that the graphs are plotted for values of a single day)
1. Which state has the highest number of vaccinated people?
Let us find out which state has the most vaccinated people. We use seaborn and matplotlib to plot the data. For plotting, the values need to be arranged in ascending or descending order, which can be done with pandas methods such as groupby(), max(), and sort_values(). For easier visualization, we consider only the top 10 states in the dataset.
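The groupby(), max(), and sort_values() chain mentioned above can be sketched as follows. This assumes each row holds a cumulative total for a state on a given date, so max() picks the latest figure per state; the column names, the sample values, and the top_states helper are hypothetical.

```python
import pandas as pd

# Invented cumulative totals per state and date, for illustration only.
df = pd.DataFrame({
    "State": ["State A", "State A", "State B", "State B", "State C", "State C"],
    "Date": ["2021-05-01", "2021-05-02"] * 3,
    "Total Individuals Vaccinated": [100, 150, 300, 320, 50, 90],
})


def top_states(df, n=10):
    """Return the n states with the highest cumulative total, largest first."""
    return (
        df.groupby("State")["Total Individuals Vaccinated"]
        .max()                          # latest cumulative value per state
        .sort_values(ascending=False)   # largest first
        .head(n)
    )


top = top_states(df, n=2)
print(top)
```

The resulting Series can be passed to a bar chart, for example seaborn's barplot with x=top.values and y=top.index. If the rows were per-district counts rather than cumulative totals, sum() would be the appropriate aggregation instead of max().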
From the visualization above, it is clear that Maharashtra leads the country in the number of vaccine doses administered.
The states with the highest number of individuals vaccinated are Maharashtra, UP, Rajasthan, Gujarat, and Karnataka.
2. Gender-wise distribution of vaccinations
More males have been vaccinated than females or transgender individuals. (This may be because more males than females have been infected with the coronavirus.)
Gender-wise individuals vaccinated (by district)
Mumbai and Pune are among the cities with the highest number of individuals vaccinated, followed by Bengaluru, Chennai, Ahmedabad, Thane, and Kolkata.
3. What are the different vaccines used by different districts?
Total sessions conducted and individuals vaccinated
Linear Regression Algorithm:
After cleaning and visualizing the data, we move on to selecting, training, and testing the algorithm: linear regression. We predict the total number of individuals vaccinated on any day, given a set of values for several days as the training data.
Library: sklearn, module: linear_model, class: LinearRegression
Training (the learning part): fit the training data to the algorithm
Model_Name.fit(Features-of-training-set, Targets-of-training-set)
Testing (checking the efficiency of the algorithm): this step includes
1. Predicting the outcomes of new data
Model_Name.predict(Features-of-testing-set)
2. Checking the accuracy of the algorithm on the testing set (for regression, score() returns the R² value)
Model_Name.score(Features-of-testing-set, Targets-of-testing-set)
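The fit, predict, and score steps above can be sketched with scikit-learn as follows. The data here is synthetic and perfectly linear, chosen only so the example runs without the dataset; the day index as the single feature and the variable names are assumptions, not the features used in the original analysis.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: day number as the only feature, cumulative
# individuals vaccinated as the target (500 per day, starting at 1000).
days = np.arange(1, 21).reshape(-1, 1)
vaccinated = 500 * days.ravel() + 1000

# Hold out a quarter of the days for testing.
X_train, X_test, y_train, y_test = train_test_split(
    days, vaccinated, test_size=0.25, random_state=0
)

model = LinearRegression()
model.fit(X_train, y_train)           # training (learning) step

predictions = model.predict(X_test)   # outcomes for unseen days
r2 = model.score(X_test, y_test)      # R² on the testing set

print(predictions)
print(r2)
```

Because the synthetic target is an exact straight line, R² comes out at essentially 1 here; real vaccination counts are noisier and will score lower.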
Inferences and Conclusion
From the above analysis and visualizations, we can conclude that:
- The vaccination rate is highest in Maharashtra, followed by Karnataka and Tamil Nadu.
- India uses Covaxin to vaccinate its citizens, but Covishield is more widely used than any other vaccine.
- More males have been vaccinated than females or transgender individuals.
References:
https://github.com/covid19india/api
https://towardsdatascience.com/covid-19-vaccination-progress-analysis-around-the-world-736d7e57f198
https://jakevdp.github.io/PythonDataScienceHandbook/04.14-visualization-with-seaborn.html