Programming with R for Official Statistics: Introduction to Programming (2024)

Last updated on 2024-07-09

Overview

Questions

  • What is programming?
  • What is object oriented programming?
  • How to document code?
  • What is a directory?

What is programming?

Programmers use programming languages to give instructions to their computers. In this course, we will learn how to use the open source language R to complete common tasks required in the field of official statistics. This includes the basics of R, data manipulation, and best practices.

There are a few reasons why programming with R is useful for official statistics. Data manipulation and analysis with R is:

  • Time-saving: R can complete many computations on a large amount of data that would take a person a long time to do manually

  • Reproducible: The code can be re-run with other data after small modifications, and shared with others to be applied to new purposes

  • Transparent: When you’ve completed a script using best practices, you should be left with a clear list of instructions to complete the data analysis in the form of code. This avoids “black boxes” where an analyst is unsure what they’ve done to the data to get it to its final form

R is an object oriented programming language

Object oriented programming languages use objects as their main tools. These objects have classes, which describe their general properties. For example, in R you might work with numeric objects, which contain numbers. You could also work with character objects, which are composed of text. We’ll explore classes and data types thoroughly in Episode 3 (Data Types and Structures).

We can assign a “label” to an object, creating a variable, and then use the label and the object interchangeably. We assign objects with an assignment operator. In R, the most commonly used assignment operator is <-. Try reproducing the example below on your machine by entering the code into the console and hitting the “run” button.

R

# Assign a number to a variable
number_flowers <- 8
# Print the variable's contents
print(number_flowers)

We can get the value stored within the variable by printing it.

OUTPUT

[1] 8
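The classes mentioned above can be checked directly with the base R function class(). The sketch below is illustrative only; flower_name is a hypothetical variable that is not part of the lesson.

```r
# number_flowers is the variable from the example above
number_flowers <- 8
# flower_name is a hypothetical variable added for illustration
flower_name <- "tulip"

# class() reports the class of the object a variable refers to
print(class(number_flowers))
print(class(flower_name))
```

Running this should report "numeric" for the first variable and "character" for the second.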

Assigning a new value to a variable breaks the connection with the old value; R forgets that number and applies the variable name to the new value.

When you assign a value to a variable, R only stores the value, not the calculation you used to create it. This is an important point if you’re used to the way a spreadsheet program automatically updates linked cells. Let’s look at an example.

R

# Reassign the variable
number_flowers <- 7
# Print the new contents
print(number_flowers)

OUTPUT

[1] 7
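The contrast with a spreadsheet is easiest to see with a second variable that is calculated from the first. The following is a minimal sketch; number_petals and the figure of five petals per flower are assumptions made purely for illustration.

```r
number_flowers <- 8
# Hypothetical derived value: assume each flower has 5 petals
number_petals <- number_flowers * 5

# Reassigning number_flowers does not recalculate number_petals
number_flowers <- 7
print(number_petals)  # still 40, computed when number_flowers was 8

# To update the derived value, run the calculation again
number_petals <- number_flowers * 5
print(number_petals)  # now 35
```

A spreadsheet cell linked to number_flowers would have changed on its own; in R the calculation has to be re-run.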

Variable Naming Conventions

Historically, R programmers have used a variety of conventions for naming variables. The . character in R can be a valid part of a variable name; thus an assignment such as weight.kg <- 57.5 is perfectly legal. This is often confusing to R newcomers who have programmed in languages where . has a more significant meaning. Today, most R programmers 1) start variable names with lower case letters, 2) separate words in variable names with underscores, and 3) use only lowercase letters, underscores, and numbers in variable names. The Tidyverse Style Guide includes a section on this and other style considerations.
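As a brief illustration of these conventions, the sketch below contrasts the older dot-separated style with the underscore style. Both names are examples only.

```r
# Older style: the dot is simply part of the name
weight.kg <- 57.5

# Style most R programmers use today: lowercase words joined by underscores
weight_kg <- 57.5

# The two names are unrelated variables that happen to hold the same value
print(identical(weight.kg, weight_kg))
```

Picking one style and using it throughout a script makes variable names easier to read and to search for.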

Documenting Code

Notice that in the above examples, hashtags (#) are used before giving instructions that are intended for you rather than R. Hashtags produce comments, which are handy for leaving information about the code that will follow. Commenting as much code as possible is part of best practices. Always comment your code! You owe it to your colleagues who may see your code (not to mention your future coding self).

R

# Hashtags go before commented code, which is not run
# print("This code will not be run")
print("Always comment your code!")

OUTPUT

[1] "Always comment your code!"

Directories

A directory is a location on your machine. Say you’d like to open a file that’s located in a folder on your computer. We need to tell R where to look for the file if we expect to find it. Directories are usually written as nested folders separated by slashes. There are small differences between operating systems (OS), so refer to documentation specific to your OS when learning to work with folder structures.

For example: /Users/Documents/Learning-R points to a folder called “Learning-R” in a user’s documents folder. Depending on your IDE (Integrated Development Environment) and setup, you can print your current directory, known as the working directory. Unless you give a full path, R reads files from and writes files to your current working directory.

R

# Print the current working directory
getwd()

OUTPUT

[1] "/Users/Documents/"
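Because path conventions differ between operating systems, one option is to build paths with the base R function file.path() rather than typing the separators by hand. A minimal sketch, reusing the example folder names from above:

```r
# Join folder names into a single path; the empty first element
# produces the leading slash of an absolute Unix-style path
learning_dir <- file.path("", "Users", "Documents", "Learning-R")
print(learning_dir)

# basename() and dirname() split a path back into its parts
print(basename(learning_dir))
print(dirname(learning_dir))
```

The folder names here are only the illustrative ones used in this episode; substitute the folders that exist on your own machine.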

Before beginning our lessons, please use setwd() to set your working directory to the folder that we created in the setup section. For example, if your folder is named Learning-R:

R

# Change the current working directory
setwd("~/Documents/Learning-R")
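setwd() stops with an error if the folder does not exist, so it can help to check first. The sketch below assumes the folder from the example above; replace the path with your own.

```r
# Path from the example above; change it to match your own setup
project_dir <- "~/Documents/Learning-R"

if (dir.exists(project_dir)) {
  # Folder found: make it the working directory and confirm
  setwd(project_dir)
  print(getwd())
} else {
  # Folder missing: report it instead of failing with an error
  message("Folder not found: ", project_dir)
}
```

If the message appears, check the spelling of the path or create the folder before continuing.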

Key Points

  • Programming makes our work faster, more reproducible, and more transparent.
  • R is an object oriented programming language
  • Document your code with comments
  • A working directory is the active location on your computer where R can read and write files

