GitHub - rfordatascience/tidytuesday: Official repo for the #tidytuesday project
source link: https://github.com/rfordatascience/tidytuesday
README.md
A weekly social data project in R
A weekly data project aimed at the R ecosystem. As this project was born out of the R4DS Online Learning Community and the R for Data Science textbook, an emphasis was placed on understanding how to summarize and arrange data to make meaningful charts with ggplot2, tidyr, dplyr, and other tools in the tidyverse ecosystem. However, any code-based methodology is welcome; just please remember to share the code used to generate the results.
Join the R4DS Online Learning Community in the weekly #TidyTuesday event! Every week we post a raw dataset, a chart or article related to that dataset, and ask you to explore the data. While the dataset will be "tamed", it will not always be tidy! As such, you might need to apply various R for Data Science techniques to wrangle the data into a true tidy format. The goal of TidyTuesday is to apply your R skills, get feedback, explore others' work, and connect with the greater #RStats community! As such, we encourage everyone of all skill levels to participate!
We will have many sources of data and want to emphasize that no causation is implied. There are various moderating variables that affect all data, many of which might not have been captured in these datasets. As such, our guidelines are to use the data provided to practice your data tidying and plotting techniques. Participants are invited to consider for themselves what nuancing factors might underlie these relationships.
The intent of Tidy Tuesday is to provide a safe and supportive forum for individuals to practice their wrangling and data visualization skills independent of drawing conclusions. While we understand that the two are related, the focus of this practice is purely on building skills with real-world data.
All data will be posted on the data sets page on Monday. It will include the link to the original article (for context) and to the data set.
We welcome all newcomers, enthusiasts, and experts to participate, but be mindful of a few things:
- The data set comes from the source article or the source that the article credits. Be mindful that the data is what it is and Tidy Tuesday is designed to help you practice data visualization and basic data wrangling in R.
- Again, the data is what it is! You are welcome to explore beyond the provided dataset, but the data is provided as a "toy" dataset to practice techniques on.
- This is NOT about criticizing the original article or graph. Real people made the graphs, collected or acquired the data! Focus on the provided dataset, learning, and improving your techniques in R.
- This is NOT about criticizing or tearing down your fellow #RStats practitioners or their code! Be supportive and kind to each other! Like others' posts and help promote the #RStats community!
- Use the hashtag #TidyTuesday on Twitter if you create your own version and would like to share it.
- Include a picture of the visualisation when you post to Twitter.
- Include a copy of the code used to create your visualization when you post to Twitter. Comment your code wherever possible to help yourself and others understand your process!
- Focus on improving your craft, even if you end up with something simple!
- Give credit to the original data source whenever possible.
Submitting Datasets
Want to submit an interesting dataset? Please open an Issue and post a link to the article (or blog post, etc.) using the data, then we can discuss adding it to a future TidyTuesday event!
Submitting Code Chunks
Want to submit a useful code chunk? Please submit it as a Pull Request and follow the guide.
DataSets
2018
2019
| Week | Date | Data | Source | Article |
| --- | --- | --- | --- | --- |
| 1 | 2019-01-01 | #Rstats & #TidyTuesday Tweets | rtweet | stackoverflow.blog |
| 2 | 2019-01-08 | TV's Golden Age | IMDb | The Economist |
| 3 | 2019-01-15 | Space Launches | JSR Launch Vehicle Database | The Economist |
| 4 | 2019-01-22 | Incarceration Trends | Vera Institute | Vera Institute |
| 5 | 2019-01-29 | Dairy production & Consumption | USDA | NPR |
| 6 | 2019-02-05 | House Price Index & Mortgage Rates | FreddieMac & FreddieMac | Fortune |
| 7 | 2019-02-12 | Federal R&D Spending | AAAS | New York Times |
| 8 | 2019-02-19 | US PhD's Awarded | NSF | #epibookclub |
| 9 | 2019-02-26 | French Train Delays | SNCF | RTL - Today |
| 10 | 2019-03-05 | Women in the Workplace | Census Bureau & Bureau of Labor | Census Bureau |
| 11 | 2019-03-12 | Board Games | Board Game Geeks | fivethirtyeight |
| 12 | 2019-03-19 | Stanford Open Policing Project | Stanford Open Policing Project SOPP - arXiv:1706.05678 | SOPP - arXiv:1706.05678 |
| 13 | 2019-03-26 | Seattle Pet Names | seattle.gov | Curbed Seattle |
| 14 | 2019-04-02 | Seattle Bike Traffic | seattle.gov | Seattle Times |
| 15 | 2019-04-09 | Tennis Grand Slam Champions | Wikipedia | Financial Times |
| 16 | 2019-04-16 | The Economist Data Viz Mistakes | The Economist | The Economist |
| 17 | 2019-04-23 | Anime Data | MyAnimeList | MyAnimeList |
| 18 | 2019-04-30 | Chicago Bird Collisions | Winger et al, 2019 | Winger et al, 2019 |
| 19 | 2019-05-07 | Global Student to Teacher Ratios | UNESCO | Center for Public Education |
| 20 | 2019-05-14 | Nobel Prize Winners | Kaggle | The Economist |
| 21 | 2019-05-21 | Global Plastic Waste | Our World In Data | Our World in Data |
| 22 | 2019-05-28 | Wine Ratings | Kaggle | Vivino |
| 23 | 2019-06-04 | Ramen Ratings | TheRamenRater.com | Food Republic |
| 24 | 2019-06-11 | Meteorites | NASA | The Guardian - Meteorite map |
| 25 | 2019-06-18 | Christmas Bird Counts | Bird Studies Canada | Hamilton Christmas Bird Count |
| 26 | 2019-06-25 | Global UFO Sightings | NUFORC | Example Plots |
| 27 | 2019-07-02 | Media Franchise Revenues | Wikipedia | reddit/dataisbeautiful post |
| 28 | 2019-07-09 | Women's World Cup | data.world | Wikipedia |
| 29 | 2019-07-16 | R4DS Membership | R4DS Slack | R4DS useR Presentation |
| 30 | 2019-07-23 | Wildlife Strikes | FAA | FAA |
| 31 | 2019-07-30 | Video Games | Steam Spy | Liza Wood |
| 32 | 2019-08-06 | Bob Ross paintings | FiveThirtyEight | FiveThirtyEight |
| 33 | 2019-08-13 | Roman Emperors | Wikipedia / Zonination | reddit.com/r/dataisbeautiful |
| 34 | 2019-08-20 | Nuclear Explosions | SIPRI | Our World in Data |
| 35 | 2019-08-27 | Simpsons Guest Stars | Wikipedia | Wikipedia |
| 36 | 2019-09-03 | Moore's Law | Wikipedia | Wikipedia |
| 37 | 2019-09-10 | Amusement Park Injuries | Data.world & Saferparks | Saferparks |
| 38 | 2019-09-17 | National Park Visits | Data.world | fivethirtyeight article |
| 39 | 2019-09-24 | School Diversity | NCES | Washington Post article |
| 40 | 2019-10-01 | All the Pizza | Jared Lander & Ludmila Janda, Tyler Richards, DataFiniti | Tyler Richards on TWD |
| 41 | 2019-10-08 | Powerlifting | OpenPowerlifting.org | Elias Oziolor |
| 42 | 2019-10-15 | Car Fuel Economy | EPA | Ellis Hughes |
| 43 | 2019-10-22 | Horror movie ratings | IMDB | Stephen Follows |
Useful links
- The R4DS Online Learning Community Website
- The R for Data Science textbook
- Carbon for sharing beautiful code pics
- Post gist to Carbon from RStudio
- Post to Carbon from RStudio
- Join GitHub!
- Basics of GitHub
- Learn how to use GitHub with R
- Save high-rez ggplot2 images
Useful data sources
- Data is Plural collection
- BuzzFeedNews GitHub
- The Economist GitHub
- The fivethirtyeight data package
- The Upshot by NY Times
- The Baltimore Sun Data Desk
- The LA Times Data Desk
- Open News Labs
- BBC Data Journalism team
Data Viz/Science Books
Only books freely available online are listed here. Feel free to add to the list.
- Fundamentals of Data Viz by Claus Wilke
- The Art of Data Science by Roger D. Peng & Elizabeth Matsui
- Tidy Text Mining by Julia Silge & David Robinson
- Geocomputation with R by Robin Lovelace, Jakub Nowosad, Jannes Muenchow
- Data Visualization by Kieran Healy
- ggplot2 cookbook by Winston Chang