Choropleth Map in ggplot2

Creating a map in ggplot2 can be surprisingly easy! This tutorial maps the US by state using R's built-in state.x77 dataset, which holds 1970s statistics for each state including population, income, life expectancy, and illiteracy.

I love making maps. While predictive statistics provide great insight, map making was one of the things that really got me interested in data science, and I’m glad R provides a great way to do it.

I’d also recommend the plotly package, which lets you make the map interactive so values show up as you hover over it. All within R!

Here is the first map we will make:

This is US population by state in the 1970s.

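library(ggplot2)
library(dplyr)
# map_data() also needs the maps package to be installed

# state.x77 is a matrix, so convert it and add a lowercase join key
states <- as.data.frame(state.x77)
states$region <- tolower(rownames(states))

# State outlines, joined to the statistics by state name
states_map <- map_data("state")
fact_join <- left_join(states_map, states, by = "region")

ggplot(fact_join, aes(long, lat, group = group)) +
  geom_polygon(aes(fill = Population), color = "white") +
  scale_fill_viridis_c(option = "C") +
  theme_classic()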

For the next map the code is almost identical; only the variable passed to fill = changes.
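Since only the fill column changes from map to map, the repeated code could be wrapped in a small helper. This is just a minimal sketch: plot_state_map() is a made-up name for illustration, not a ggplot2 function, and it assumes the maps package is installed so that map_data() works.

```r
library(ggplot2)
library(dplyr)

# Rebuild the joined data so this block runs on its own
states <- as.data.frame(state.x77)
states$region <- tolower(rownames(states))
fact_join <- left_join(map_data("state"), states, by = "region")

# Hypothetical helper: fill_var is a column name given as a string,
# e.g. "Income" or "Life Exp"
plot_state_map <- function(data, fill_var) {
  ggplot(data, aes(long, lat, group = group)) +
    geom_polygon(aes(fill = .data[[fill_var]]), color = "white") +
    scale_fill_viridis_c(option = "C", name = fill_var) +
    theme_classic()
}

income_map <- plot_state_map(fact_join, "Income")
illiteracy_map <- plot_state_map(fact_join, "Illiteracy")
```

Printing income_map or illiteracy_map at the console draws the map. Passing the column as a string through .data[[ ]] also sidesteps the backticks that a name like Life Exp would otherwise need.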

Let’s try per capita income:

This is great. We’re able to see the income range through the color fill of each state.

ggplot(fact_join, aes(long, lat, group = group)) +
  geom_polygon(aes(fill = Income), color = "white") +
  scale_fill_viridis_c(option = "C") +
  theme_classic()

The last one we’ll make is life expectancy:

Great info here: life expectancy by state in the 1970s! That particular variable needed a little extra code because its column name, Life Exp, contains a space and has to be wrapped in backticks. See below:

# Life Exp is already numeric in state.x77, so this conversion is only a safeguard
fact_join$`Life Exp` <- as.numeric(fact_join$`Life Exp`)

ggplot(fact_join, aes(long, lat, group = group)) +
  geom_polygon(aes(fill = `Life Exp`), color = "white") +
  scale_fill_viridis_c(option = "C") +
  theme_classic()
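For the interactive version mentioned at the top, one option is to hand a finished ggplot to plotly's ggplotly() function. This is a rough sketch rather than a polished example: it assumes the plotly and maps packages are installed, and the object names pop_map and interactive_map are only placeholders.

```r
library(ggplot2)
library(dplyr)
library(plotly)  # assumed to be installed; not needed for the static maps

# Rebuild the joined data so this block runs on its own
states <- as.data.frame(state.x77)
states$region <- tolower(rownames(states))
fact_join <- left_join(map_data("state"), states, by = "region")

# Same static population map as before, stored in a variable
pop_map <- ggplot(fact_join, aes(long, lat, group = group)) +
  geom_polygon(aes(fill = Population), color = "white") +
  scale_fill_viridis_c(option = "C") +
  theme_classic()

# Convert the ggplot object into an interactive plotly widget
interactive_map <- ggplotly(pop_map)
```

Printing interactive_map at the console opens the widget in the viewer or browser.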

Enjoy your maps! The dataset ships with R, so feel free to recreate them.

Pros and Cons of Top Data Science Online Courses

There are a variety of data science courses online, but which one is the best? Find out the pros and cons of each!

Coursera, edX, etc.

These MOOCs have been around for several years now and continue to grow. But are they really the best option for learning online?

Pros:

  • Lots of topics, including R and Python
  • Affordable, with a free option
  • Well-thought-out curricula from professors at great schools

Cons:

  • Not easily translatable to industry
  • Taught by academics rather than current industry professionals

These MOOCs are still worth checking out to see if they work for you, but beware that you may get tired of analyzing the iris data set.

PluralSight

Pros:

  • Lots of topics in R, Python, and databases
  • Easy to skip around through the user interface instead of going in order
  • Taught by industry veterans at top companies who know current trends and expectations
  • You can use your own apps (Anaconda and RStudio) on your computer rather than coding in the website itself

Cons:

  • Data course selection is still a bit limited, but growing quickly

DataCamp

Pros:

  • Great options for beginner to intermediate learners
  • Courses build on each other, fairly good examples
  • Most instructors have spent time in the industry

Cons:

  • You have to use their in-website coding tool
  • Exercises are not always clear
  • You never know whether your code will work the same way on your own computer

So that’s a quick overview of options for learning online. Of course, blogs are fantastic too, and Stack Overflow can be really helpful!

Feel free to add your recommendations, too!
