An overview of R, a popular data science language. Discover its history, features, advantages, and uses in data analysis and business decision-making. R is a free alternative to MATLAB, Excel, SAS, and SPSS, and is widely used by organizations for data manipulation, calculation, visualization, and statistical analysis. It is also compatible with other data processing technologies like Hadoop and Spark.
Typology: Slides
Presented By
Shanti lal Bhayal
Assistant Professor
Programming
Environment
What is data science?
Hacking (Programming) + Maths/Statistics + Domain Knowledge = Data Science
So, next: what is a data scientist?
A data scientist is simply a person who can
write code
= in R, Python, Java, SQL, Hadoop (Pig, HQL, MR), etc.
= for data storage, querying, summarization, visualization
= how: efficiently and in time (fast results?)
= where: on databases, on cloud, on servers
and understand enough statistics to derive insights from data, so that the business can make decisions.
► Early 1990s: The development of R began.
► August 1993: The software was announced on the S-news mailing list.
► www.r-project.org/mail.html
► June 1995: After some persuasive arguments by Martin Mächler, the code was made available as “free software” under the FSF’s GNU GPL, Version 2.
► Mid-1997: The initial R Development Core Team (the core group) was formed.
► February 2000: R version 1.0.0 was released.
► R: Past and Future History (r-project.org): https://cran.r-project.org/doc/html/interface98-paper/paper.html
What's great about R?
CRAN Packages By Date (r-project.org) https://cran.r-project.org/web/packages/
R can perform various data analysis and data science tasks for free
Interactive Visualization with the Shiny package (Equivalent SAS Product: Visual Analytics)
Ensemble Learning / Machine Learning (SAS Product: SAS Enterprise Miner)
Text / Social Media Mining (SAS Product: SAS Text Miner)
Optimization and Forecasting (SAS Product: SAS ETS, PROC OPTMODEL)
RStudio IDE (SAS Product: SAS Enterprise Guide)
Integration: Tableau, SQL Server, VS, Power BI
R can save data sets between sessions (the workspace), so you don't need to reload them each time. It can save your command history too.
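As a minimal sketch of how data can persist between sessions, the base R functions `save()` and `load()` write an object to a file and read it back. The data frame `scores` and its contents are hypothetical, used only for illustration; in an interactive session, `save.image()` would write the whole workspace instead of a single object.

```r
# Hypothetical example data; the name `scores` is illustrative only.
scores <- data.frame(student = c("A", "B"), mark = c(71, 64))

# Save the object to a file (a temporary file here, so nothing is left behind).
path <- tempfile(fileext = ".RData")
save(scores, file = path)

# Simulate a fresh session by removing the object, then restore it from disk.
rm(scores)
load(path)

# TRUE if the data frame came back with both of its rows.
restored_ok <- exists("scores") && nrow(scores) == 2
```

After `load(path)`, `scores` is available again under its original name, which is why the data does not have to be re-imported from its source.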
Where is R used?
Companies with big data demands
Analysing user behaviour
Online advertising and e-commerce
Weather services use it for weather forecasts.
It is a fundamental tool for analytics-driven organizations.
What is R?
R is a dialect of S.
In 1988, the S system was rewritten in the C language to make it more portable across systems, and it began to resemble the system that we have today.
Historical Notes
In 1993, Bell Labs gave StatSci (which later became Insightful Corporation) an exclusive license to develop and sell the S language.
In 2004, Insightful purchased the S language completely from Lucent for $2 million and became its owner.
In 2006, Alcatel purchased Lucent Technologies, and the merged company was called Alcatel-Lucent.
Insightful sold its implementation of the S language under the product name S-PLUS and built a number of fancy features (mostly GUI) on top of it, hence the “PLUS”.
1991: R was created in New Zealand by Ross Ihaka and Robert Gentleman.
1993: First announcement to the public.
1995: Martin Mächler convinced Ross and Robert to license R under the GNU General Public License, making R free software.
1996: Public mailing lists are created (R-help and R-devel).
1997: The core group is formed. The core group controls the source code of R.
2000: R version 1.0.0 is released.
R version 4.2.1 (Funny-Looking Kid) was released on 2022-06-23.
What is R
R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes:
► An effective data handling and storage facility,
► A suite of operators for calculations on arrays, in particular matrices,
► A large, coherent, integrated collection of intermediate tools for data analysis,
► Graphical facilities for data analysis and display either directly at the computer or on hardcopy, and
► A well developed, simple and effective programming language (called ‘S’) which includes conditionals, loops, user defined recursive functions and input and output facilities.
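A few of the facilities listed above can be illustrated with a short base R sketch: matrix operators, a conditional, a loop, a user-defined recursive function, and simple output. The names `m` and `factorial_rec` are made up for this example and do not come from the slides.

```r
# Operators for calculations on arrays and matrices
m <- matrix(1:4, nrow = 2)   # 2 x 2 matrix, filled column by column
m_squared <- m %*% m         # matrix product
m_doubled <- m * 2           # element-wise arithmetic

# A user-defined recursive function that uses a conditional
factorial_rec <- function(n) {
  if (n <= 1) {
    return(1)
  }
  n * factorial_rec(n - 1)
}

# A loop with output to the console
for (i in 1:3) {
  cat(i, "! = ", factorial_rec(i), "\n", sep = "")
}
```

Note that `%*%` is the matrix product, while `*` works element by element; mixing the two up is a common early mistake.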