Brazil Elections 2018

Python scraping

Posted by August 17, 2021 · 2 mins read

Project:

In this project I set out to better understand the data from the 2018 Brazilian elections.

Dataset:

The dataset includes information such as each candidate's age, education level, gender, marital status, and campaign expenses.

elections-columns
Click here to see my Python Project.

The first part of the job was to decide which data I could work with and then remove any outliers that would distort the results.

Age was the first column I analyzed, and there I found the first problem, a data-entry error: one of the politicians is listed as 825 years old.
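A minimal sketch of how such a data-entry error can be spotted with pandas. The toy rows, the column names `name` and `age`, and the plausibility bounds are all assumptions made for illustration, not the real dataset.

```python
import pandas as pd

# Hypothetical toy data; the real dataset's columns and values differ.
candidates = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "age": [34, 51, 825, 47],
})

# Assumed plausibility bounds for a candidate's age (an illustrative choice).
MIN_AGE, MAX_AGE = 18, 110

# Keep only the rows whose age falls outside the plausible range.
implausible = candidates[~candidates["age"].between(MIN_AGE, MAX_AGE)]
print(implausible)
```

Any row that shows up in `implausible` is worth checking by hand before deciding what to do with it.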

To confirm that no other outliers remain, create a bar chart so you can spot other possible outliers visually.
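One possible way to draw such a chart with pandas and matplotlib is sketched below. The ages are made up for illustration, and the non-interactive `Agg` backend is chosen only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical ages, including one obvious data-entry error.
ages = pd.Series([34, 51, 825, 47, 51, 62, 34, 34], name="age")

# Count how many candidates share each age, ordered by age.
age_counts = ages.value_counts().sort_index()

fig, ax = plt.subplots()
ax.bar(age_counts.index, age_counts.to_numpy())
ax.set_xlabel("Age")
ax.set_ylabel("Number of candidates")
ax.set_title("Candidates per age")
```

With a numeric x-axis, the single bar at 825 stretches the axis and squeezes the realistic ages into the left edge of the chart, which makes the outlier easy to see. Call `fig.savefig(...)` or switch to an interactive backend to view the figure.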

elections-charts

This single outlier shifted the entire chart to the left. In this case, the decision was to delete that candidate's entire row to limit its impact on statistics such as the average.
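The effect on the average can be illustrated with a small sketch. The data is hypothetical, and filtering out the offending record with an assumed upper bound of 110 years is one possible cleanup, not necessarily the exact one used in the project.

```python
import pandas as pd

# Hypothetical toy data with one impossible age.
candidates = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "age": [34, 51, 825, 47],
})

# Average age with the data-entry error still present.
mean_before = candidates["age"].mean()

# Drop the record with the implausible age (assumed cutoff: 110 years).
cleaned = candidates[candidates["age"] <= 110]

# Average age after the cleanup.
mean_after = cleaned["age"].mean()

print(mean_before, mean_after)
```

In this toy example a single bad value is enough to pull the average far away from every realistic age in the table.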

Next step:

Use aggregate functions (sum/min/max) to group the data by age, expenses, and region.

elections-bar

Aggregating information like this is a powerful way to answer many business questions about your case.

In this specific case, I sorted out the top 5 regions that spent the most money on their campaigns.

The second table shows how old most of the politicians in the 2018 elections are.
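The aggregation step could look roughly like the sketch below. The column names `region`, `age`, and `expenses` and all values are assumptions for illustration, and taking the most frequent age is just one reading of "how old most politicians are".

```python
import pandas as pd

# Hypothetical toy data; names and values are illustrative only.
candidates = pd.DataFrame({
    "region": ["SP", "SP", "RJ", "MG", "RJ", "BA"],
    "age": [45, 52, 45, 38, 60, 45],
    "expenses": [1000.0, 2500.0, 800.0, 1200.0, 700.0, 300.0],
})

# Sum, minimum, and maximum of campaign expenses per region.
expenses_by_region = candidates.groupby("region")["expenses"].agg(["sum", "min", "max"])

# Regions ranked by total spending, keeping at most the top 5.
top_regions = expenses_by_region.sort_values("sum", ascending=False).head(5)

# Most frequent age among the candidates.
most_common_age = candidates["age"].mode().iloc[0]

print(top_regions)
print(most_common_age)
```

The same `groupby(...).agg(...)` pattern works for other questions, for example the minimum and maximum age per region.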

Gender Analysis:

gender

According to the data, 2/3 of the Brazilian politicians in the dataset are men.
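A share like this can be computed with `value_counts(normalize=True)`. The sketch below uses made-up values and an assumed `gender` column with codes `M` and `F`.

```python
import pandas as pd

# Hypothetical gender column; values are illustrative only.
gender = pd.Series(["M", "M", "F", "M", "F", "M"], name="gender")

# Fraction of candidates per gender (the fractions sum to 1).
gender_share = gender.value_counts(normalize=True)

print(gender_share)
```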

gender-education

Even so, education levels are very similar: women have 4% less higher education than men, a share that shifts to high school education.
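One way to compare education levels between genders is a row-normalized cross-tabulation. The sketch below assumes hypothetical `gender` and `education` columns and uses made-up values, so its numbers do not reflect the real dataset.

```python
import pandas as pd

# Hypothetical toy data; categories and counts are illustrative only.
candidates = pd.DataFrame({
    "gender": ["F", "F", "F", "F", "M", "M", "M", "M"],
    "education": [
        "Higher", "Higher", "High school", "High school",
        "Higher", "Higher", "Higher", "High school",
    ],
})

# Share of each education level within each gender (each row sums to 1).
education_by_gender = pd.crosstab(
    candidates["gender"], candidates["education"], normalize="index"
)

print(education_by_gender)
```

Normalizing by row makes the two groups comparable even when one gender has many more candidates than the other.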

Programming Language and Libraries:

  • Python
  • Pandas
  • Numpy
  • matplotlib.pyplot
  • Seaborn