Data Mashups in R
How do you use R to import, manage, visualize, and analyze real-world data? With this short, hands-on tutorial, you learn how to collect online data, massage it into a reasonable form, and work with it using R facilities to interact with web servers, parse HTML and XML, and more. Rather than use canned sample data, you’ll plot and analyze current home foreclosure auctions in Philadelphia.
This practical mashup exercise shows you how to access spatial data in several formats locally and over the Web to produce a map of home foreclosures. It’s an excellent way to explore how the R environment works with R packages and performs statistical analysis.
- Parse messy data from public foreclosure auction postings
- Plot the data using R’s PBSmapping package
- Import US Census data to add context to foreclosure data
- Use R’s lattice and latticeExtra packages for data visualization
- Create multidimensional correlation graphs with the pairs() scatterplot matrix function
About the Authors
Jeremy Leipzig is a bioinformatics software developer at DuPont Crop Genetics. He has conducted academic research in viral integration, metagenomics, schizophrenia, and alternative splicing. While a graduate student, he developed one of the first faculty-review websites and wrote “Work Issues in Software Engineering”, a survey-based study of “death march” projects.
Xiao-Yi Li is a biostatistician with an M.Sc. from the University of Michigan. In fact, her entire educational experience has revolved around statistics, percentile or otherwise. Currently, she works in the bioinformatics group at DuPont as a statistical consultant. Her work consists mostly of experimental design and analysis for phenotypic screens, quality control in microarrays, and association mapping.
- Paperback: 40 pages
- Publisher: O’Reilly Media (March 2011)
- Language: English
- ISBN-10: 1449303536
- ISBN-13: 978-1449303532