Assignment: Find 3 Datasets
Short summary: Find 3 public datasets and post them to bl.ocks.org as forks of this Block: Data Table Summary.
You will be required to do a data visualization project for this class. The project will involve finding a sufficiently complex dataset, creating simple visualizations of it throughout the course, and building an interactive version towards the end of the course (either a dashboard with multiple linked views, or a "scrollytelling" or story-based narrative).
For this assignment: Find and describe 3 datasets that you’d like to potentially visualize for your project.
- The term "dataset" here means: a data table relating to a certain topic, issue, or situation. Often a "dataset" could also mean a collection of multiple related tables, but usually there is one "main" or "primary" table (e.g. the "fact table" of a star schema) that is the largest in the collection. That "main" table is the one that needs to meet the criteria below.
Each of these 3 datasets must meet the following criteria:
- Publicly available (not proprietary), with no strict licensing/usage restrictions
- If you want to work on a project related to your company or work, this is fine, as long as you use a dataset that has been vetted/approved for public release (e.g. anonymized).
- Must be available in CSV or JSON format
- If you find a nice dataset that's not in these formats originally, but you convert it into one of these formats, then it's fine to use.
- Any Excel spreadsheet can be exported to CSV.
- For HTML tables (like in Wikipedia), you can convert to CSV by copy-pasting into a Google Sheet, then exporting to CSV.
- Must have 4 or more columns of data
- Must have between 100 and 10,000 rows of data
- Total file size (in CSV or JSON format) must be less than 5MB
- If it's larger, but you have processed it down to this size, then it's fine to use.
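The size criteria above can be checked mechanically before you upload anything. The following is a minimal sketch, not part of the assignment: `checkCriteria` is a hypothetical helper, the row limits are treated as inclusive, and "5MB" is assumed to mean 5 * 1024 * 1024 bytes. You would supply the counts yourself, for example from a spreadsheet or from your operating system's file properties.

```javascript
// Limits taken from the assignment text. Treating "5MB" as 5 * 1024 * 1024
// bytes, and the row range as inclusive, are assumptions.
const MIN_COLUMNS = 4;
const MIN_ROWS = 100;
const MAX_ROWS = 10000;
const MAX_BYTES = 5 * 1024 * 1024;

// Hypothetical helper: returns a list of human-readable problems.
// An empty list means the table meets the size criteria.
function checkCriteria({ columnCount, rowCount, sizeBytes }) {
  const problems = [];
  if (columnCount < MIN_COLUMNS) {
    problems.push(`needs at least ${MIN_COLUMNS} columns, found ${columnCount}`);
  }
  if (rowCount < MIN_ROWS || rowCount > MAX_ROWS) {
    problems.push(`needs ${MIN_ROWS} to ${MAX_ROWS} rows, found ${rowCount}`);
  }
  if (sizeBytes >= MAX_BYTES) {
    problems.push(`file must be under ${MAX_BYTES} bytes, found ${sizeBytes}`);
  }
  return problems;
}

// Examples with made-up numbers (not real datasets).
console.log(checkCriteria({ columnCount: 6, rowCount: 2500, sizeBytes: 180000 }));
console.log(checkCriteria({ columnCount: 3, rowCount: 50, sizeBytes: 180000 }));
```

This only covers the numeric limits; whether a dataset is publicly available and free of licensing restrictions still has to be judged by reading its source page.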
For each of these 3 datasets you find, please do the following:
- Fork this Block, "Data Table Summary": http://blockbuilder.org/curran/f849f374f21c490c5490d501636bdb77
- Delete the existing "data.csv" file in Blockbuilder.
- Upload your dataset in a D3-readable format (CSV, TSV, or JSON) as a new file in Blockbuilder.
- Change the code so that your new data file is loaded and summarized.
- In the README, update the link after "This data is from ..." to link to the source of your dataset.
- In the README, update the text after "This dataset is about ..." to add a brief (a sentence or several) description of the dataset.
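To make the "loaded and summarized" step more concrete, here is a rough sketch of what summarizing a parsed table can look like. It is not the Block's actual code: `summarizeTable` and `sampleRows` are hypothetical names, and the sample rows are made up to stand in for whatever your data file contains.

```javascript
// Hypothetical helper, not the Block's actual code: summarizes an array of
// row objects such as the one a CSV parser produces.
function summarizeTable(rows, columns) {
  return {
    rowCount: rows.length,
    columnCount: columns.length,
    columns: columns.map((name) => {
      const values = rows.map((row) => row[name]);
      return {
        name,
        distinctValues: new Set(values).size,
        // CSV parsers return strings; a column counts as numeric here only
        // when every non-empty value converts to a finite number.
        numeric: values.every((v) => v === "" || Number.isFinite(Number(v))),
      };
    }),
  };
}

// Tiny made-up table standing in for parsed data.
const sampleRows = [
  { city: "A", population: "1200" },
  { city: "B", population: "800" },
  { city: "B", population: "950" },
];
console.log(summarizeTable(sampleRows, ["city", "population"]));
```

In a browser with D3 version 5 or later, `d3.csv("yourFile.csv")` returns a promise that resolves to the parsed rows, with the column names available as `data.columns`, so the call could look like `d3.csv("yourFile.csv").then(data => summarizeTable(data, data.columns))`. Older D3 versions take a callback instead, so check which version the Block loads; `yourFile.csv` is a placeholder for your own file name.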
Please submit the bl.ocks.org URLs of your three forked Blocks, one per dataset.
After submission, I will review these 3 datasets and give feedback on whether each would be suitable to work with for your project.
Ideas for data sources:
- The datasets behind the visualizations shared in the class Slack
- The datasets used for your bar chart assignments
- Data is Plural (Structured Archive)
- https://github.com/curran/data