Creating stunning visualisations using R

By | February 1st, 2014

Data visualisation is coming up fast and strong. Try your hand at it if you don’t want to be left behind.


What is data visualisation?

Put simply, data visualisation is the pictorial representation of data. A data set can usually be understood at a glance when it is represented on charts and graphs. This presentation method is employed everywhere around you, from statistical graphs in textbooks and academic papers to colourful infographics in newspapers. As technology has progressed, we’ve been able to capture more detail and more data, and with more data comes the need for accurate and appropriate visualisation techniques. That’s where R comes in handy.

What is R?

R is a language developed primarily for statistical computing and data analysis. Data analysis and data graphics go hand in hand, so a language of R’s nature needs graphics support that lets it employ some amazing techniques for data visualisation.

Programming to create visualisation? Is R user friendly for students?

Since its introduction, R has become quite popular not only among PhD holders in data science but also among mainstream users in need of a common platform for computing and visualisation. In this workshop, we’ll get you started on some simple data visualisation techniques using R, which will serve as a foundation for developing more advanced visualisations.

For first-time users of programming languages: do not be afraid. The idea of coding from scratch can be intimidating, especially if you’ve never used a programming language before. Remember that every line of code is, in a sense, simply a written equivalent of a keyboard shortcut. It performs a task, and most commands can be looked up in a book or online.

What are the advantages of using R over MATLAB?

R has some distinct advantages. It is free software released under the GNU General Public License, so it costs nothing to use. This gives it a huge advantage over commercial software such as MATLAB, which is really expensive, especially for students and small enterprises.
Because of its open-source nature, R attracts a host of talented developers, which means awesome support, documentation and user-contributed packages for diverse needs. R has some amazing packages that unlock a variety of complex and useful data visualisation techniques.

[Figure: A screenshot from the data sheet where Sachin Tendulkar’s records are stored.]

What else can R do?

Besides creating stunning visualisations, you can also easily perform run-of-the-mill data analysis. There’s good support for computations involving statistics and probability, and support for machine learning and data mining is getting better.

Installing the IDE

One of the best and most robust IDEs for R is RStudio. (An IDE, or integrated development environment, is a tool that helps you communicate effectively and intuitively with the programming
language). RStudio runs on top of R, so R itself has to be installed as well; both are available for Linux, Windows and Mac. On Windows or Mac, installation is a routine procedure, as with any other software: install R, then download RStudio Desktop, run the installer and follow the steps. On Linux, install R via the terminal (type ‘sudo apt-get install r-base’) and then install RStudio separately by selecting it from the software centre.

Basic layout – as user-friendly as it gets!

The home screen is an arrangement of four panes, each of which can be expanded to fill the screen whenever required. The top left pane is where code is written and run as a script; this part is not important to our tutorial. The top right pane maintains a history of all the commands entered. The bottom left pane is the ‘console’, where you enter commands and see their text output; plots (if any) appear in the bottom right pane.

Select and read data

Let’s now create a visualisation. Here’s some data from Sachin Tendulkar’s batting record for each year (Source: stats.espncricinfo.com), pasted into an Excel sheet as shown in the screenshot.

Click on File > Save as and save the sheet in the CSV (comma-separated values) format as sachin.csv in My Documents (the default location from which RStudio reads files).

In the RStudio console (bottom left), type the following:

sachin <- read.csv('sachin.csv', header = TRUE)

This makes sachin a data set with rows and columns. Type ‘sachin’ and press [Enter] to view the table with its headers.

[Figure: Tendulkar’s average annual runs tally shaded according to number of 50s scored]

The default packages allow some plotting in R. For example, type the following:

plot(sachin$Grouping, sachin$Average)

You’ll find a plot of Sachin’s average score in each year, with the ‘Grouping’ column along the X-axis and the ‘Average’ column along the Y-axis. It might seem a little haphazard, so change the scales until you find a graph that works for you.
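As a minimal sketch of the read-and-plot steps above, the snippet below writes a tiny stand-in sachin.csv to a temporary folder, reads it back with read.csv and draws the basic scatter plot. The four rows are made-up placeholder numbers, not Tendulkar’s real figures, and treating the ‘Grouping’ column as a plain numeric year is an assumption about how the sheet was saved.

```r
# Placeholder data: made-up numbers, NOT Tendulkar's real record.
# The column names 'Grouping' and 'Average' follow the article; storing
# 'Grouping' as a plain numeric year is an assumption.
csv_path <- file.path(tempdir(), "sachin.csv")
writeLines(c(
  "Grouping,Average",
  "2001,40.5",
  "2002,55.0",
  "2003,32.25",
  "2004,61.0"
), csv_path)

# Read the file into a data frame; header = TRUE treats the first
# row as column names rather than data.
sachin <- read.csv(csv_path, header = TRUE)

# Inspect the result before plotting.
str(sachin)          # column names and their types
print(head(sachin))  # first few rows

# Base-graphics scatter plot: year on the X-axis, average on the Y-axis.
plot(sachin$Grouping, sachin$Average,
     xlab = "Year", ylab = "Average")
```

With the real file saved in your working directory, pass 'sachin.csv' in place of csv_path. If ‘Grouping’ turns out to hold text rather than numbers, plot() may need it converted to a number or a factor first.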
[Figure: Tendulkar’s average annual runs tally shaded according to number of 100s scored]

To change the scale, pass a ylim argument. For example:

plot(sachin$Grouping, sachin$Average, ylim = c(0, 200))
plot(sachin$Grouping, sachin$Average, ylim = c(0, 2000))

These produce radically different visualisations compared to the first plot!

[Figure: Average annual score plotted versus every year, on a heavily scaled-up graph. While there seems to be remarkable consistency, it does appear to push the average score closer to zero]

To unleash the full potential of R’s graphing capabilities, employ the ggplot2 package. Ensure you have a working internet connection and type:


Nash David
Nash David is passionate about technology and mobile devices. He closely follows the smartphone and tablet platform market. He also leads editorial efforts for devworx. You may send him tweets @nashpd or email