Here, R and Python code for the same purpose are paired for your consideration. They are strikingly similar.
Expressions and Statements
Comment
R and Python use the same syntax for comments: everything after # on a line is ignored.
# This is comment.
R and Python also share the same syntax for console output.
print('the same code for r and python again')
Arithmetic
# R
2 / 7 # 0.285714...
2 %/% 7 # 0
2 ^ 7
2 %% 7
strrep('a', 3) # 'aaa' (rep('a', 3) would give c('a', 'a', 'a'))
# Python
2 / 7 # 0.285714...
2 // 7 # 0
2 ** 7
2 % 7
'a' * 3 # 'aaa'
Console Help on a Function
The same command syntax again. In Python, ?max works only in IPython or Jupyter; in the plain interpreter, use help(max).
?max
Import library
# R
library(dplyr)
# Python
import pandas as pd
from math import radians as rad
Iterations
# R
for(x in 2:7){ ... }
for(x in dic){ ... }
for(i in 1:nrow(df)){ ... }
# Python
for i, x in enumerate(range(2, 8)): # index and value simultaneously!
    ...
for k, v in dic.items(): # loop over dictionary
    ...
for label, row in df.iterrows(): # loop over DataFrame
    ...
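To make the Python loops above concrete, here is a minimal sketch with made-up data (the names dic, pairs and items are illustrative only). It assumes nothing beyond the standard library: enumerate yields a zero-based position alongside each value, and dict.items() yields key-value pairs.

```python
# Illustrative data; the variable names are assumptions for this sketch.
dic = {'a': 1, 'b': 2}

# enumerate() pairs each value with its zero-based position.
# range(2, 8) covers 2..7, like 2:7 in R.
pairs = [(i, x) for i, x in enumerate(range(2, 8))]

# dict.items() yields (key, value) tuples in insertion order.
items = [(k, v) for k, v in dic.items()]

print(pairs)
print(items)
```

Note that the position from enumerate starts at 0, whereas an R loop over 1:length(x) starts at 1.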
Method call
- R: called like an ordinary function
- Python: called with . after the object
Like many other OOP languages, Python uses . to call a member function of a class. In R, however, . has no special meaning, so you can use . in variable or function names. A member function is called like any other top-level function.
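A small sketch of the Python side may help here. The class Tally below is hypothetical and exists only to show the dot syntax; the second call shows that a method is still an ordinary function underneath, which is roughly how R treats it.

```python
# Hypothetical class used only to illustrate the dot syntax.
class Tally:
    def __init__(self):
        self.count = 0

    def increment(self):
        # A method receives the instance as its first argument, `self`.
        self.count += 1
        return self.count


t = Tally()
t.increment()          # method call through the dot
Tally.increment(t)     # the same method, called like a plain function
upper = 'abc'.upper()  # built-in types use the same dot syntax

print(t.count, upper)
```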
Data Structures
Basic Types
- R: logical(boolean), numeric, integer, character, vector, matrix, data.frame
- Python: bool, float, int, str
c.f. Python utility functions: type()
c.f. R supports names on almost every data structure (vectors, lists, data frames)
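As a quick illustration of type(), the sketch below inspects one value of each basic Python type listed above. The variable names are made up for the example; the closest R counterpart is class().

```python
# One sample value for each basic Python type listed above.
values = [True, 2.5, 7, 'text']

# type(v) returns the class object; __name__ gives its short name.
names = [type(v).__name__ for v in values]
print(names)

# isinstance() is the usual way to test a type.
# Note that bool is a subclass of int in Python.
print(isinstance(True, int))
```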
List
List creation, slicing, access, append and delete code:
# R
a <- list('a', 1, 'b', 2)
a[1] # list('a')
a[-1] # list(1, 'b', 2)
a[1:2] # list('a', 1)
a[[1]] # 'a'
b <- list(a, a) # list(list('a', 1, 'b', 2), list('a', 1, 'b', 2))
b[[1]][[2]] # 1
b[[1]][2] # list(1)
a <- c(a, list('c', 3)) # append
a[5:6] <- NULL # delete
# Python
a = ['a', 1, 'b', 2]
a[0] # 'a'
a[-1] # 2
a[0:2] # ['a', 1]
b = [a, a] # [['a', 1, 'b', 2], ['a', 1, 'b', 2]]
b[0][1] # 1
a = a + ['c', 3] # append
del a[-2:] # delete the last two elements
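Note that on an R list, [ ] always performs slicing (it returns a list) and [[ ]] extracts a single element, while Python uses the same brackets for both indexing and slicing (like an R vector).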
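The difference between indexing and slicing in Python can be sketched as follows, reusing the list from the example above. The names element and sub are illustrative.

```python
a = ['a', 1, 'b', 2]

# Indexing with a single integer returns the element itself,
# like a[[1]] on an R list.
element = a[0]

# Slicing with start:stop returns a new list, even for one element,
# like a[1] on an R list.
sub = a[0:1]

print(type(element).__name__, element)
print(type(sub).__name__, sub)
```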
Vector, Matrix
Language | Vector types | Matrix types |
---|---|---|
R | vector | matrix |
Python | numpy.array, pandas.Series | numpy.array (2D) |
# R
a <- c(1,2,3) # vector creation by combine function
a[2] # 2
b <- matrix(1:6, ncol = 2, byrow=T) # matrix creation
b[2, 2] # 4
# Python
import numpy as np
a = np.array([1, 2, 3])
a[1] # 2
b = np.array([[1, 2], [3, 4], [5, 6]]) # nested list: one inner list per row
b[1, 1] # 4
Dictionary (Key-Value Store / Map / Symbol Table)
- R: no dedicated type; named data structures (list, vector, etc.) play this role
- Python: a built-in dictionary type (dict) exists
# R
dic <- list(a = 1, b = 2)
dic['a'] # list(a = 1)
dic[['a']] # 1
dic$a # 1
dic$c <- 3 # list(a = 1, b = 2, c = 3)
dic$b <- NULL # list(a = 1, c = 3)
# Python
dic = { 'a': 1, 'b': 2 }
dic['a'] # 1
dic['c'] = 3 # { 'a': 1, 'b': 2, 'c': 3 }
del(dic['b']) # { 'a': 1, 'c': 3 }
Dataset (Table)
Language | Dataset type |
---|---|
R | data.frame |
Python | pandas.DataFrame |
# R
df <- data.frame(id = c('A', 'B'), age = c(12, 13))
rownames(df) <- 1:2
df <- read.csv('data.csv')
df$age # c(12, 13) i.e. vector
data.frame(age = df$age) # single column data.frame
# c.f. dplyr select
df[, 1:2] # data.frame with 2 columns
df[1, ] # 1st row
df[1:2, 1:2] # 1~2 rows, 1~2 columns
# Python
import pandas as pd
df = pd.DataFrame({ 'id': ['A', 'B'], 'age': [12, 13] })
df.index = [1, 2]
df = pd.read_csv('data.csv')
df['age'] # pandas Series [12, 13] i.e. vector
df[['age']] # pandas DataFrame with single column
df[['id', 'age']] # 2 columns
df.iloc[0] # 1st row (as a Series); df[0] would look for a column named 0
df.iloc[0, 0] # 'A'
df.iloc[[0, 1], [0, 1]] # 1~2 rows, 1~2 columns
# c.f. df.iloc[:, [0, 1]]
# c.f. df.loc for character names
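The difference between df.loc (label-based) and df.iloc (position-based) is easiest to see with non-numeric row labels. The sketch below assumes pandas is installed; the row labels 'first' and 'second' are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'id': ['A', 'B'], 'age': [12, 13]})
df.index = ['first', 'second']  # hypothetical row labels

# .loc selects by row and column names.
by_label = df.loc['first', 'age']

# .iloc selects by zero-based position.
by_position = df.iloc[0, 1]

# A single row comes back as a Series indexed by column name.
first_row = df.iloc[0]

# Passing lists keeps the result a DataFrame.
subset = df.loc[['first', 'second'], ['id']]

print(by_label, by_position)
print(subset)
```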
Function
Function Definition
# R
#' Double input
#'
#' @param x numeric. number to be doubled
#'
#' @return numeric
#' @export
#'
#' @examples ftn(2)
#'
ftn <- function(x){
  x * 2
}
# Python
def ftn(x):
    """Double input"""
    return x * 2
Anonymous Function
# R
function(x, y){x + y}
# Python
lambda x, y: x + y
Function Mapping
# R
sapply(df$id, tolower) # 'a', 'b'
library(purrr)
df$id %>% map_chr(tolower) # 'a', 'b' (map() alone would return a list)
# Python (by pandas)
df['id'].apply(str.lower) # 'a', 'b'
c.f. anonymous function using formula
# R
library(purrr)
xs %>% map(~ . * 2) # xs * 2
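Python has no formula shorthand like purrr's ~ notation. The closest equivalents are a lambda passed to map, or a list comprehension, which is usually considered more idiomatic. A minimal sketch, with xs as made-up input:

```python
xs = [1, 2, 3]  # illustrative input

# map() with an anonymous function; list() materialises the result.
doubled_map = list(map(lambda x: x * 2, xs))

# The same operation as a list comprehension.
doubled_comp = [x * 2 for x in xs]

print(doubled_map)
print(doubled_comp)
```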
Statistics
Vector Statistics
- R: mean, sd, cor, …
- Python: numpy.mean, numpy.std, numpy.corrcoef, …
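One difference is worth a sketch: R's sd divides by n - 1 (sample standard deviation), while numpy.std divides by n by default, so ddof=1 is needed to match R. The example assumes NumPy is installed; the arrays x and y are made up.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

mean_x = np.mean(x)

# Default ddof=0 divides by n (population standard deviation).
sd_pop = np.std(x)

# ddof=1 divides by n - 1, matching R's sd().
sd_sample = np.std(x, ddof=1)

# corrcoef returns the full correlation matrix;
# the off-diagonal entry corresponds to R's cor(x, y).
r = np.corrcoef(x, y)[0, 1]

print(mean_x, sd_pop, sd_sample, r)
```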
Basic Plots
# R
plot(1:10, type='l') # line plot
plot(1:10) # scatter plot (default)
# c.f. points(1:10)
hist(runif(10)) # histogram
# Python
import matplotlib.pyplot as plt
plt.plot(range(1, 11)) # line plot
plt.show() # show panel
plt.clf() # clear panel
plt.scatter(range(1, 11), range(1, 11))
plt.show()
plt.clf()
import numpy as np
plt.hist(np.random.rand(10))
plt.show()