Lynne West

thisLynneWest@gmail.com

The Question

How do carbon dioxide emissions vary among vehicle classes?

To answer this question, I used data on more than 32,000 gasoline-fueled vehicles with model years ranging from 1984 to 2019. No electric or alternative-fuel vehicles were included in the analysis. The data was obtained from fueleconomy.gov.

The Results

The Process

The first step, after the data was imported into a SQL Server database, was to combine redundant vehicle classes. For example, I combined the vehicle classes ‘Vans, Passenger Type’ and ‘Vans, Cargo Type’ into one class called ‘Vans’. The complete list of vehicle classes I combined using SQL can be seen here.
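To illustrate the kind of update involved, here is a minimal sketch in Python. It is not the SQL I actually ran: it uses an in-memory SQLite database as a stand-in for SQL Server, and the table name `vehicles`, the column name `vehicle_class`, and the helper `combine_classes` are assumptions made for the example. Only the two van classes come from the text above; the sample rows are illustrative.

```python
import sqlite3

# Mapping from original class names to the combined class name.
# Only the two van classes come from the write-up; the schema is assumed.
CLASS_MERGES = {
    "Vans, Passenger Type": "Vans",
    "Vans, Cargo Type": "Vans",
}


def combine_classes(conn, merges):
    """Rewrite each original class name to its combined name, in place."""
    conn.executemany(
        "UPDATE vehicles SET vehicle_class = ? WHERE vehicle_class = ?",
        [(new, old) for old, new in merges.items()],
    )
    conn.commit()


# In-memory SQLite database standing in for the SQL Server database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vehicles (id INTEGER PRIMARY KEY, vehicle_class TEXT)")
conn.executemany(
    "INSERT INTO vehicles (vehicle_class) VALUES (?)",
    [("Vans, Passenger Type",), ("Vans, Cargo Type",), ("Compact Cars",)],
)

combine_classes(conn, CLASS_MERGES)

# List the classes that remain after the merge.
classes = sorted(
    row[0] for row in conn.execute("SELECT DISTINCT vehicle_class FROM vehicles")
)
print(classes)
```

Keeping the merges in one dictionary makes it easy to review the full list of combined classes in a single place before running the update.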

Next, I reviewed the data and decided that a boxplot would represent it well. The box part of the boxplot shows the interquartile range (IQR), which is the spread of the middle 50% of the data. The lines that extend out from the box show the spread of the rest of the data; they end at the minimum and maximum values found in the data.
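The five numbers a boxplot like this draws can be computed directly. The sketch below uses Python's standard `statistics` module; the helper name `five_number_summary` and the sample values are made up for illustration and are not taken from the dataset. Quartile conventions differ slightly between tools, so the exact box edges may not match a given plotting library.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # method="inclusive" treats the data as the full population of interest.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


# Made-up CO2 values, for illustration only (not from the dataset).
sample = [300, 350, 400, 450, 500, 550, 600, 650, 700]

low, q1, median, q3, high = five_number_summary(sample)
iqr = q3 - q1  # height of the box: spread of the middle 50% of the data

print(low, q1, median, q3, high, iqr)
```

Here the box would span from `q1` to `q3`, and the lines would run from the box out to `low` and `high`.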

Finally, I used Python to connect to the SQL Server database I created, retrieve the data I needed, and build the boxplot visualization. The code for that can be seen here.
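For readers who want a feel for this step without opening the linked code, here is a rough sketch under stated assumptions. It is not the code I ran: an in-memory SQLite database stands in for SQL Server, the table `vehicles` and the columns `vehicle_class` and `co2` are hypothetical names, and the sample rows are invented. The plotting helper assumes matplotlib is installed and sets the whiskers to the minimum and maximum with `whis=(0, 100)`.

```python
import sqlite3
from collections import defaultdict


def co2_by_class(conn):
    """Group CO2 values by vehicle class using any DB-API connection."""
    groups = defaultdict(list)
    cursor = conn.cursor()
    cursor.execute("SELECT vehicle_class, co2 FROM vehicles WHERE co2 IS NOT NULL")
    for vehicle_class, co2 in cursor.fetchall():
        groups[vehicle_class].append(co2)
    return dict(groups)


def plot_boxplot(groups, path="co2_by_class.png"):
    """Draw one box per class, with whiskers running to the min and max."""
    # Imported here so the retrieval part runs even without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    labels = sorted(groups)
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.boxplot([groups[name] for name in labels], whis=(0, 100))
    ax.set_xticks(range(1, len(labels) + 1))
    ax.set_xticklabels(labels, rotation=45, ha="right")
    ax.set_ylabel("CO2 emissions")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)


# In-memory SQLite stand-in with invented rows (schema is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vehicles (vehicle_class TEXT, co2 REAL)")
conn.executemany(
    "INSERT INTO vehicles VALUES (?, ?)",
    [("Vans", 600.0), ("Vans", 650.0), ("Compact Cars", 350.0), ("Compact Cars", None)],
)

groups = co2_by_class(conn)
print(groups)
```

Calling `plot_boxplot(groups)` would then write the chart to `co2_by_class.png`. Against a real SQL Server database, the same `co2_by_class` function should work with a connection from a DB-API driver in place of the SQLite one.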