| dplyr | data wrangling, data analysis | The essential data-munging R package when working with data frames. Especially useful for operating on data by categories. CRAN. | See the intro vignette | Hadley Wickham |
| purrr | data wrangling | purrr makes it easy to apply a function to each item in a list and return results in the format of your choice. It’s more complex to learn than the older plyr package, but also more robust. Its functions are more standardized than base R’s apply family, and it has functions for tasks like error-checking. CRAN. | map_df(mylist, myfunction). More: Charlotte Wickham’s purrr tutorial video, the purrr cheat sheet PDF download, easy error checking with purrr’s possibly. | Hadley Wickham |
| readxl | data import | Fast way to read Excel files in R, without dependencies such as Java. CRAN. | read_excel("my-spreadsheet.xls", sheet = 1) | Hadley Wickham |
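As a rough illustration of the purrr entry above, the sketch below maps a function over a list and uses possibly() so that one failing item does not stop the whole run. The inputs list and the safe_log name are made up for this example; map_dbl() and possibly() are purrr functions.

```r
library(purrr)

# Hypothetical input: two numbers and one string that makes log() fail
inputs <- list(1, 10, "oops")

# possibly() wraps log() so that an error returns NA_real_ instead of stopping
safe_log <- possibly(log, otherwise = NA_real_)

# map_dbl() applies the wrapped function to each item and returns a numeric vector
results <- map_dbl(inputs, safe_log)
print(results)
```

Choosing map_dbl() rather than map() here is only one option: it returns a plain numeric vector, while map() would return a list.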

| readr and vroom | data import | Base R handles most of these functions; but if you have huge files, these packages offer a faster, more standardized way to read CSVs and similar files into R. readr has been around for a while; vroom is a speedier alternative, useful for larger data sets. Eventually the packages may merge. data.table’s fread() is another useful alternative. CRAN. | read_csv("myfile.csv") or vroom("myfile.csv") | Hadley Wickham (readr), Jim Hester (vroom) |
| rio | data import, data export | rio has a good idea: Pull a lot of separate data-reading packages into one, so you just need to remember 2 functions: import and export. CRAN. | import("myfile") | Thomas J. Leeper & others |
| tidyxl | data import, data wrangling | If you’ve ever wanted to tear your hair out over an Excel file with merged cells, data in column headers, headers mixed in data, and key information in color coding, this is the package for you. Each cell is imported in its own row, with information about data type, position, and color, not just value, allowing you to reshape the data from there. Super time saver for messy data. CRAN. | xlsx_cells("my_nightmare_file.xlsx") | Duncan Garmonsway |
| Hmisc | data analysis | There are a number of useful functions in here. Two of my favorites: describe, a more robust summary function, and Cs, which creates a vector of quoted character strings from unquoted comma-separated text. Cs(so, it, goes) creates c("so", "it", "goes"). CRAN. | describe(mydf); Cs(so, it, goes) | Frank E Harrell Jr & others |
| datapasta | data import | Data copy and paste: Meet reproducible research. If you’ve copied data from the Web, a spreadsheet, or other source into your clipboard, datapasta lets you paste it into R as an R object, with the code to reproduce it. It includes RStudio add-ins as well as command-line functions for transposing data, turning it into markdown format, and more. CRAN. | df_paste() to create a data frame, vector_paste() to create a vector. | Miles McBain |
| sqldf | data wrangling, data analysis | Do you know a great SQL query you’d use if your R data frame were in a SQL database? Run SQL queries on your data frame with sqldf. CRAN. | sqldf("select * from mydf where mycol > 4") | G. Grothendieck |
| jsonlite | data import, data wrangling | Parse JSON within R or turn R data frames into JSON. CRAN. | myjson <- toJSON(mydf, pretty = TRUE); mydf2 <- fromJSON(myjson) | Jeroen Ooms & others |
| XML | data import, data wrangling | Many functions for elegantly dealing with XML and HTML, such as readHTMLTable. CRAN. | mytables <- readHTMLTable(myurl) | Duncan Temple Lang |
| httr | data import, data wrangling | An R interface to HTTP protocols; useful for pulling data from APIs. See the httr quickstart guide. CRAN. | r <- GET("http://httpbin.org/get"); content(r, "text") | Hadley Wickham |
| quantmod | data import, data visualization, data analysis | Even if you’re not interested in analyzing and graphing financial investment data, quantmod has easy-to-use functions for importing economic as well as financial data from sources like the Federal Reserve. CRAN. | getSymbols("AITINO", src = "FRED") | Jeffrey A. Ryan |
| tidyquant | data import, data visualization, data analysis | Another financial package that’s useful for importing, analyzing and visualizing data, integrating aspects of other popular finance packages as well as tidyverse tools. With thorough documentation. CRAN. | aapl_key_ratios <- tq_get("AAPL", get = "key.ratios") | Matt Dancho |
| rvest | data import, web scraping | Web scraping: Extract data from HTML pages. Inspired by Python’s Beautiful Soup. Works well with SelectorGadget. CRAN. | How to import data into R or see the SelectorGadget vignette | Hadley Wickham |
| tidyr | data wrangling | tidyr initially won me over with specialized functions like fill (fill in missing values in a column using the value above) and replace_na. But now I also use it for its main purpose too: helping you change data row and column formats from “wide” to “long”. CRAN. | See my YouTube video How to reshape data with tidyr’s new pivot functions. | Hadley Wickham |
| splitstackshape | data wrangling | The package’s cSplit() function solves a rather complex shaping problem in an astonishingly easy way. If you have a data frame column with one or more comma-separated values (think a survey question with “select all that apply”), this is worth an install if you want to separate each item into its own new data frame row. CRAN. | cSplit(mydata, "multi_val_column", sep = ",", direction = "long") | Ananda Mahto |
| magrittr | data wrangling | This package gave us the %>% symbol for chaining R operations, but it’s got other useful operators such as %<>% for mutating a data frame in place and . as a placeholder for the original object being operated upon. CRAN. | mydf %<>% mutate(newcol = myfun(colname)) | Stefan Milton Bache & Hadley Wickham |
| validate | data wrangling | Intuitive data validation based on rules you can define, save and re-use. CRAN. | See the introductory vignette. | Mark van der Loo & Edwin de Jonge |
| testthat | programming | Package that makes it easy to write unit tests for your R code. CRAN. | See the testing chapter of Hadley Wickham’s book on R packages. | Hadley Wickham |
| data.table | data wrangling, data analysis | Popular package for heavy-duty data wrangling and computation. While I often prefer dplyr for basic analysis, data.table has become my go-to for large data sets or when speed is critical (such as in Shiny apps). CRAN. | data.table in 5 minutes video, The ultimate data.table cheat sheet, Intro vignette | Matt Dowle & others |
| stringr | data wrangling | Numerous functions for text manipulation. Some are similar to existing base R functions but in a more standard format, including working with regular expressions. Some of my favorites: str_pad and str_trim. CRAN. | str_pad(myzipcodevector, 5, "left", "0") | Hadley Wickham |
| lubridate | data wrangling | Everything you ever wanted to do with date arithmetic, although understanding and using the available functionality can be somewhat complex. CRAN. | mdy("05/06/2015") + months(1). More examples in the package vignette | Garrett Grolemund, Hadley Wickham & others |
| DataExplorer | data analysis | Not sure where to get started looking at a data set? Want to get a basic handle on that data without running multiple commands like str() and plot()? DataExplorer attempts to offer one-click report generation to show and visualize basics about a data set, such as distributions and missing data. CRAN. | create_report(mydataframe) | Boxuan Cui |
| zoo | data wrangling, data analysis | Robust package with a slew of functions for dealing with time series data; I like the handy rollmean function with its align = "right" and fill = NA options for calculating moving averages. CRAN. | rollmean(mydf, 7) | Achim Zeileis & others |
| tsbox | data wrangling, data analysis | Super easy way to convert data between different R time-series data formats: xts, data frame, zoo, tsibble, and more. Plus some basic analysis functions. CRAN. | ts_zoo(mydf) | Christoph Sax |
| knitr and rmarkdown | data display | Add R to a markdown document and easily generate reports in HTML, Word and other formats. A must-have if you’re interested in reproducible research and automating the journey from data analysis to report creation. CRAN. | See the Minimal Examples knitr page and RStudio’s R Markdown page. | Yihui Xie & others (knitr), RStudio (rmarkdown) |
| remedy | data display | RStudio add-in offers a menu for R Markdown formatting commands, so you no longer need to remember and/or type code for things like making an HTML list or embedding a YouTube video. While WYSIWYG editing is now available for R Markdown, this add-in still has benefits: Its commands can be assigned custom keyboard shortcuts, so you can create your own shortcuts for tasks like bolding text. GitHub. | See the package website. | Colin Fay & others |
| ymlthis | data display | Another useful RStudio add-in for R Markdown, this helps you generate YAML headers with proper format. GitHub. | See the package website. | Malcolm Barrett & Richard Iannone |
| officer | data display | Import and edit Microsoft Word and PowerPoint documents, making it easy to add R-generated analysis and visualizations to existing as well as new reports and presentations. CRAN. | my_doc <- read_docx() %>% body_add_img(src = myplot). The package website has many more examples. | David Gohel |
| listviewer | data display, data wrangling | While RStudio has since added a list-viewing option, this HTML widget still offers an elegant way to view complex nested lists within R. GitHub timelyportfolio/listviewer. | jsonedit(mylist) | Kent Russell |
| DT and reactable | data display | Create a sortable, searchable table in one line of code with either of these R packages. CRAN. | DT::datatable(mydf); reactable::reactable(mydf). See Quick interactive HTML tables and, for reactable, Create tables with expandable rows. | DT: RStudio; reactable: Greg Lin |
| ggplot2 | data visualization | Powerful, flexible and well-thought-out dataviz package following “grammar of graphics” syntax to create static graphics, but be prepared for a steep learning curve. CRAN. | qplot(factor(myfactor), data = mydf, geom = "bar", fill = factor(myfactor)). See my searchable ggplot2 cheat sheet and time-saving code snippets. | Hadley Wickham |
| patchwork | data visualization | Easily combine ggplot2 plots and keep the new, merged plot a ggplot2 object. plot_layout() adds the ability to set columns, rows, and relative sizes of each component graphic. GitHub. | plot1 + plot2 + plot_layout(ncol = 1) | Thomas Lin Pedersen |
| ggforce | data visualization | Adds some design functionality to base ggplot2, including easy labeling of plot groups. CRAN. | See this blog post by RStudio’s Edgar Ruiz for several useful examples. | Thomas Lin Pedersen |
| plotly | data visualization | R interface to the Plotly JavaScript library that was open-sourced in late 2015. Basic graphs have a distinctive look which may not be for everyone, but it’s full-featured, relatively easy to learn (especially if you know ggplot2) and includes a ggplotly() function to make graphs created with ggplot2 interactive. CRAN. | d <- diamonds[sample(nrow(diamonds), 1000), ]; plot_ly(d, x = ~carat, y = ~price, text = ~paste("Clarity: ", clarity), mode = "markers", color = ~carat, size = ~carat) | Carson Sievert & others |
| ggiraph | data visualization | Another way to make ggplot2 plots interactive, using geom functions such as geom_bar_interactive() that include arguments for tooltips and JavaScript onclick events. CRAN. | g <- ggplot(mpg, aes(x = displ, y = cty, color = drv)); my_gg <- g + geom_point_interactive(aes(tooltip = model), size = 2); ggiraph(code = print(my_gg), width = .7). See Easy interactive ggplot graphs in R with ggiraph | David Gohel |
| esquisse | data visualization | This RStudio add-in offers a drag-and-drop interface for ggplot2, and it generates code for the graph you create with the GUI. It’s a useful tool for exploring different color palettes and themes, even if you’re comfortable creating your visualizations directly in R. CRAN. | See examples on the project’s website. | Victor Perrier and Fanny Meyer, dreamRs |
| dygraphs | data visualization | Create HTML/JavaScript graphs of time series with a one-line command if your data is an xts object. CRAN. | dygraph(myxtsobject) | JJ Allaire & RStudio |
| echarts4r | data visualization | Robust R wrapper for the echarts JavaScript library. CRAN. | mydata %>% e_charts(xcol) %>% e_line(ycol). See Plot in R with echarts4r or the package site | John Coene |
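The tidyr entry above names fill, replace_na and the wide-to-long pivot without showing them. Here is a minimal sketch of all three; the survey and sales_wide data frames and their column names are invented for illustration, while fill(), replace_na() and pivot_longer() are tidyr functions.

```r
library(tidyr)

# Hypothetical data where a group label appears only on the first row of each block
survey <- data.frame(group = c("A", NA, "B", NA), score = 1:4)

# fill() carries the last non-missing value downward within the column
filled <- fill(survey, group)

# Hypothetical wide table: one row per region, one column per quarter
sales_wide <- data.frame(region = c("East", "West"), q1 = c(10, NA), q2 = c(15, 25))

# replace_na() swaps missing values for a chosen default, column by column
sales_clean <- replace_na(sales_wide, list(q1 = 0))

# pivot_longer() reshapes from wide to long: one row per region and quarter
sales_long <- pivot_longer(sales_clean, cols = c(q1, q2),
                           names_to = "quarter", values_to = "sales")
print(filled)
print(sales_long)
```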

| taucharts | data visualization | This HTML widget library is especially useful for scatterplots where you want to view multiple regression options. However, it does much more than that, including line and bar charts with legends and tooltips. GitHub hrbrmstr/taucharts. | See the author’s post on RPubs | Bob Rudis |
| RColorBrewer | data visualization | Not a designer? RColorBrewer helps you select color palettes for your visualizations. CRAN. | See Jennifer Bryan’s tutorial | Erich Neuwirth |
| paletteer | data visualization | This package is a collection of dozens of R color palettes, all with a common interface. Helpful if you want to move beyond built-in and RColorBrewer options. | Make the most of R colors and palettes or see the paletteer package site for examples on accessing palettes and using them with ggplot2. | Emil Hvitfeldt |
| sf | mapping, data wrangling | This package makes it much easier to do GIS work in R. Simple features protocols make geospatial data look a lot like regular data frames, while various functions allow for analysis such as determining whether points are in a polygon. A GIS game-changer for R. CRAN. | See the package vignettes, starting with the introduction, Simple Features for R. | Edzer Pebesma & others |
| leaflet | mapping | Map data using the Leaflet JavaScript library within R. GitHub rstudio/leaflet. | See my tutorial or the package website | RStudio |

| tidygeocoder | mapping | This is my new geocoding go-to. It supports more than a dozen different geocoding services and returns results in an immediately usable tibble format. Plus it offers reverse and batch geocoding as well as getting lat/long for an address. | mydata %>% geocode(address_col, method = "osm", lat = latitude, long = longitude). See the getting started vignette | Jesse Cambon & others |
| tmap & tmaptools | mapping | These packages offer an easy way to read in shape files and join data files with geographic info, as well as do some exploratory mapping. Recent functionality adds support for simple features, interactive maps and creating leaflet objects. Plus, tmaptools::palette_explorer() is a great tool for picking ColorBrewer palettes. CRAN. | See the package vignette or my mapping in R tutorial | Martijn Tennekes |
| colourpicker | data visualization | The package’s RStudio add-in makes it easy to browse through and select R’s built-in colors, or get hex codes for custom colors not available by name. The plotHelper() function lets you select colors and see how they’d look on a scatter plot. CRAN. | See the GitHub repo. | Dean Attali |
| mapsapi | mapping, data wrangling | This interface to the Google Maps Direction and Distance Matrix APIs lets you analyze and map distances and driving routes. CRAN. | google_directions(origin = c(my_longitude, my_latitude), destination = c(my_address), alternatives = TRUE). Also see the vignette | Michael Dorman |
| tidycensus | mapping, data wrangling | Want to analyze and map U.S. Census Bureau data from 5-year American Community Surveys or 10-year censuses? This makes it easy to download numerical and geospatial info in R-ready format. You can also use it to download US, state, or local shapefiles for other use. CRAN. | See Basic usage of tidycensus and the author’s online book Analyzing US Census Data. | Kyle E. Walker |
| albersusa | mapping | Do you need to make a map of the US with Alaska and Hawaii insets? This package offers one of the simplest ways to get a well-designed shapefile. GitHub hrbrmstr/albersusa. | us_sf <- usa_sf("lcc") %>% mutate(State = as.character(name)). See the package GitHub repo. | Bob Rudis |
| glue | data wrangling | The main function, also called glue, evaluates variables and R expressions within a quoted string, as long as they’re enclosed by {} braces. This makes for an elegant paste() replacement. CRAN. | glue("Today is {Sys.Date()}") | Jim Hester |
| googleanalyticsR | Web analytics | Pull data from Google Analytics, including GA’s version 4 API. Also has anti-sampling options. CRAN. | See package website. | Mark Edmondson |
| roxygen2 | package development | Useful tools for documenting functions within R packages. CRAN. | How to write an R package or the roxygen2 introductory vignette. | Hadley Wickham & others |
| shiny | data visualization | Turn R data into interactive Web applications with this framework for R. CRAN. | See the tutorial or my Create a Shiny app to search Twitter | RStudio |
| shinyjs | data visualization | Includes several functions to add sophistication to your UI, such as hiding and showing inputs and resetting input values. Was initially published with commercial use restrictions but is now under an MIT license. | See the get started guide | Dean Attali |
| flexdashboard | data visualization | If Shiny is too complex and involved for your needs, this package offers a simpler (if somewhat less robust) solution based on R Markdown. CRAN. | More info in Using flexdashboard | JJ Allaire, RStudio & others |
| openxlsx | misc | If you need to write to an Excel file as well as read, this package is easy to use and offers a lot of options for formatting your spreadsheet. CRAN. | write.xlsx(mydf, "myfile.xlsx") | Alexander Walker |
| gmodels | data wrangling, data analysis | There are several functions for modeling data here, but the one I use, CrossTable, simply creates cross-tabs with loads of options: totals, proportions and several statistical tests. CRAN. | CrossTable(myxvector, myyvector, prop.t = FALSE, prop.chisq = FALSE) | Gregory R. Warnes |
| janitor | data wrangling, data analysis | Basic data cleaning made easy, such as finding duplicates by multiple columns, making R-friendly column names and removing empty columns. It also has some nice tabulating tools, like adding a total row, as well as generating tables with percentages and easy crosstabs. And its get_dupes() function is an elegant way of finding duplicate rows in data frames, based on one column, several columns, or entire rows. CRAN. | tabyl(mydf, sort = TRUE) %>% adorn_totals("row") | Samuel Firke |
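To make the glue entry’s comparison with paste() concrete, here is a small sketch that builds the same string both ways. The pkg and n_args variables are placeholders invented for the example.

```r
library(glue)

# Hypothetical values to interpolate into a message
pkg <- "glue"
n_args <- 2

# Base R: stitch the pieces together by hand
pasted <- paste0("The ", pkg, " call used ", n_args * 2, " values")

# glue: variables and expressions inside {} are evaluated in place
glued <- glue("The {pkg} call used {n_args * 2} values")

print(pasted)
print(glued)
```

Both calls should produce the same text; the glue version keeps the sentence readable as a single string.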

| scales | data wrangling | While this package has many more sophisticated ways to help you format data for graphing, it’s worth a download just for the comma(), percent() and dollar() functions. CRAN. | comma(mynumvec) | Hadley Wickham |
| profvis | programming | Is your R code sluggish? This package gives you a visual representation of your code line by line so you can find the speed bottlenecks. CRAN. | profvis({ your code here }) | Winston Chang & others |
| tidytext | text mining | Elegant implementation of text mining functions using Hadley Wickham’s “tidy data” principles. CRAN. | See tidytextmining.com for numerous examples. | Julia Silge & David Robinson |
| diffobj | data analysis | Base R’s identical() function tells you whether or not two objects are the same; but if they’re not, it won’t tell you why. diffobj gives you a visual representation of how two R objects differ. CRAN. | diffObj(x, y) | Brodie Gaslam & Michael B. Allen |
| prophet | forecasting | I don’t do much forecasting analysis; but if I did, I’d start with this package. CRAN. | See the Quick start guide. | Sean Taylor & Ben Letham at Facebook |
| tidymodels | forecasting | For considerably more robust forecasting, check out this suite of modeling packages. Somewhat steep learning curve. CRAN. | See the Getting started guide or this workshop repo from Thomas Mock. | Numerous |
| arrow | data import, data export | R implementation of the cross-language platform for in-memory data with columns. Includes functions to read and write Parquet and Feather files as well as CSV and JSON. CRAN. | write_feather(mydf, tempfile()) | Neal Richardson & others |
| fst | data import, data export | Another alternative for binary file storage (R-only), fst was built for fast storage and retrieval, with access speeds above 1 GB/sec. It also offers compression that doesn’t slow data access too much, as well as the ability to import a specific range of rows (by row number). CRAN. | write.fst(mydf, "myfile.fst", 100) | Mark Klik |
| googleAuthR | data import | If you want to use data from a Google API in an R project and there’s not yet a specific package for that API, this is the place to turn for authenticating. CRAN. | See examples on the package website and this gist for use with Google Calendars. | Mark Edmondson |
| devtools | package development, package installation | devtools has a slew of functions aimed at helping you create your own R packages, such as automatically running all example code in your help files to make sure everything works. Requires Rtools on Windows and XCode on a Mac. CRAN. | run_examples() | Hadley Wickham & others |
| remotes | package installation | remotes is a lighter-weight alternative to devtools if all you want is to install packages from GitHub, Bitbucket and some other sources. CRAN. | install_github("mangothecat/franc") | Gabor Csardi & others |
| githubinstall | package installation | Do you want to install a package from GitHub but can’t remember the creator’s name, or just don’t feel like typing it out? With githubinstall, simply run githubinstall("packagename") and the function will suggest an account; you just respond Y to install or n if it’s the wrong one. It even includes fuzzy matching if you misspell a package name! | githubinstall("AnomalyDetection") | Koji Makiyama |
| installr | misc | Windows only: Update your installed version of R from within R. On CRAN. | updateR() | Tal Galili & others |
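The scales entry above singles out comma(), percent() and dollar(). A quick sketch of the three on made-up numbers follows; the accuracy argument is set explicitly for percent() so the rounding is not left to the function’s defaults.

```r
library(scales)

# Hypothetical values to format for a chart label or table
visits <- c(1234567, 980)
share <- 0.256
price <- 1234.5

visits_fmt <- comma(visits)              # thousands separators
share_fmt <- percent(share, accuracy = 1) # round to a whole percent
price_fmt <- dollar(price)               # currency prefix and cents

print(visits_fmt)
print(share_fmt)
print(price_fmt)
```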

| usethis | package development, programming | Initially aimed at package development, usethis now includes useful functions for any coding project. Among its handy features are an edit family that lets you easily update your .Renviron and .Rprofile files. On CRAN, but install the GitHub version from "r-lib/usethis" for latest updates. | edit_r_environ() | Hadley Wickham, Jennifer Bryan & RStudio |
| here | misc | This package has one function with a single, useful purpose: find your project’s working directory. Surprisingly helpful if you want your code to run on more than one system. CRAN. | my_project_directory <- here() | Kirill Müller |
| pacman | misc, package installation | This package is another that aims to solve one problem, and solve it well: package installation. The main functions will load a package that’s already installed, or install it first if it’s not available. While this is certainly possible to do with base R’s require() and an if statement, p_load() is so much more elegant for CRAN packages, or p_load_gh() for GitHub. Other useful options include p_temp(), which allows for a temporary, this-session-only package installation. CRAN. | p_load(dplyr, here, tidycensus) | Tyler Rinker |
| plumber | data export, programming | Turn any R function into a host-able API with a line or two of code. This well-thought-out package makes it easy to use R for data handling in other, non-R coding projects. CRAN. | See the documentation or my article Create your own Slack bots, and Web APIs, with R | Jeff Allen, Trestle Technology & others |
| dataCompareR | data wrangling | A quick and elegant way to compare two data frames, either row by row or by a specified key. CRAN. | rCompare(mydf1, mydf2) | Rob Noble-Eddy at CapitalOne & others |
| cloudyR project | data import, data export | This is a collection of packages aimed at making it easier for R to work with cloud platforms such as Amazon Web Services, Google and Travis-CI. Some are already on CRAN, some can be found on GitHub. | See the list of packages. | Various |
| flyio | data import, data export | This is a bit like rio, but for the cloud: It offers a common set of functions whether you’re using Amazon’s S3 or Google Cloud. Set your data source, authenticate with your credentials (which can be stored in an R environment variable), set a bucket name, and off you go. GitHub. | See the GitHub repo or YouTube video of a demo at the Delhi useR meetup. | SocialCops |
| geofacet | data visualization, mapping | While I rarely need to create “geofacets” (maps with same-sized blocks in geospatially appropriate locations), this package is so cool that I had to include it. The package lets you create your own geofacet visualizations using ggplot2 and built-in grids such as US states and EU countries. And it comes with design-your-own geofacet grid capabilities. CRAN. | grid_design() | Ryan Hafen |
| reticulate | programming | If you know Python as well as R, this package offers a suite of tools for calling Python from within R, as well as “translating” between R and Python objects such as Pandas data frames and R data frames. CRAN. | See How to run Python in R or the reticulate package website. | JJ Allaire |
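For comparison with pacman’s p_load(), here is roughly what the base R approach mentioned in the pacman entry (require() plus an if statement) might look like. The load_or_install() helper is a hypothetical name written for this sketch, not part of any package, and the demo call uses packages that ship with R so nothing is downloaded.

```r
# Hypothetical helper: load each package, installing it first if it is missing
load_or_install <- function(pkgs) {
  for (pkg in pkgs) {
    # require() returns FALSE (with a warning) when the package is not installed
    if (!require(pkg, character.only = TRUE, quietly = TRUE)) {
      install.packages(pkg)
      library(pkg, character.only = TRUE)
    }
  }
  invisible(pkgs)
}

# Demo with packages bundled with R, so no installation is triggered
loaded <- load_or_install(c("stats", "utils"))
print(loaded)
```

With pacman, the equivalent call would be a single p_load() line, which is the convenience the entry describes.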

| beepr | misc | This is pretty much pure fun. Yes, getting an audible notification when code finishes running or encounters an error could be useful; but here, the available sounds include options like a fanfare flourish, a Mario Brothers tune, and even a scream. CRAN. | beep("wilhelm") | Rasmus Bååth |
