Exercise Set 05: Good and Bad Visualizations

BEE 4850/5850, Fall 2024

Due Date

Friday, 2/23/24, 9:00pm

You can find a Jupyter notebook in the exercise’s Github repository. You should feel free to clone the repository and switch the notebook to another language, or to download the relevant data file(s) and solve the problems without using a notebook. In either of these cases, if you using a different environment, you will be responsible for setting up an appropriate package environment.

Regardless of your solution method, make sure to include your name and NetID on your solution PDF for submission to Gradescope.

Overview

Instructions

The goal of this exercise is for you to find and evaluate data visualizations which you think do a particularly good and bad job.

Load Environment

The following code loads the environment and makes sure all needed packages are installed. This should be at the start of most Julia scripts.

import Pkg
Pkg.activate(@__DIR__)
Pkg.instantiate()

The following packages are included in the environment (to help you find other similar packages in other languages). The code below loads these packages for use in the subsequent notebook (the desired functionality for each package is commented next to the package).

using DataFrames # tabular data structure
using CSVFiles # reads/writes .csv files
using Plots # plotting library
using StatsBase # statistical quantities like mean, median, etc
using StatsPlots # some additional statistical plotting tools

Problem

Find an example of a data visualization (could be from any reasonable source: journalism, a scientific paper, generated from data) that you think does a particularly good job of communicating something about the underlying data, and one which does a particularly bad job. Write a brief summary (one paragraph) for each about what the visualization is trying to communicate and what makes it (in)effective. Make sure to include where I can find a raw version of the figure (or if you generated it yourself) which is higher-resolution for use in the class discussion next week.

Overview

Instructions

Load Environment

Problem

References