How to run Python in R

Learn how to run Python code inside an R script using the reticulate R package

As much as I love R, it’s clear that Python is also a great language—both for data science and general-purpose computing. And there can be good reasons an R user would want to do some things in Python. Maybe it’s a great library that doesn’t have an R equivalent (yet). Or an API you want to access that has sample code in Python but not R.

Thanks to the R reticulate package, you can run Python code right within an R script—and pass data back and forth between Python and R.

In addition to reticulate, you need Python installed on your system. You also need any Python modules, packages, and files your Python code depends on.

If you’d like to follow along, install and load reticulate with install.packages("reticulate") and library(reticulate).

To keep things simple, let’s start with just two lines of Python code to import the NumPy package for basic scientific computing and create an array of four numbers. The Python code looks like this:

import numpy as np
my_python_array = np.array([2,4,6,8])

And here’s one way to do that right in an R script:

py_run_string("import numpy as np")
py_run_string("my_python_array = np.array([2,4,6,8])")

The py_run_string() function executes whatever Python code is within the parentheses and quotation marks.

If you run that code in R, it may look like nothing happened. Nothing shows up in your RStudio environment pane, and no value is returned. If you run print(my_python_array) in R, you get an error that my_python_array doesn’t exist.

But if you run a Python print command inside the py_run_string() function such as

py_run_string("for item in my_python_array: print(item)")

you should see a result.
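The behavior above makes sense once you picture where the code runs: the string is executed inside the Python session, so the variable lives on the Python side and persists between calls, but it never shows up in R's environment. As a rough analogy in plain Python (this is not how reticulate is implemented, and the name python_session and the plain list standing in for the NumPy array are assumptions for illustration), two exec() calls that share one namespace behave the same way:

```python
# Analogy only: a dict stands in for the long-lived Python session.
python_session = {}

# The first call defines a variable inside the session.
# A plain list stands in for the NumPy array to keep the sketch dependency-free.
exec("my_python_array = [2, 4, 6, 8]", python_session)

# A later call can still see it, because both calls share the same namespace.
exec("doubled = [item * 2 for item in my_python_array]", python_session)

# The variable exists inside the session, not in the surrounding scope.
defined_outside = "my_python_array" in globals()

print(python_session["doubled"])
print(defined_outside)
```

The surrounding scope plays the role of R here: it can only reach the variable by going through the session object.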

Running Python code line by line like this gets annoying, though, once you have more than a couple of lines of code. Fortunately, there are a few other ways to run Python in R with reticulate.

One is to put all the Python code in a regular .py file, and use the py_run_file() function. Another way I like is to use an R Markdown document.
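For the first option, here is a minimal sketch of what such a .py file might contain. The file name my_script.py, the describe() helper, and the variable names are hypothetical, and the script sticks to Python's standard library so it runs without any extra packages installed:

```python
# my_script.py (hypothetical file name; any .py file works the same way)
import statistics


def describe(values):
    """Return the mean and median of a sequence of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


# Module-level variables are created when the whole file is executed.
my_python_values = [2, 4, 6, 8]
summary = describe(my_python_values)

# Print each value, then the summary, so something visible happens on run.
for item in my_python_values:
    print(item)
print(summary)
```

With a file like this saved next to your R script, a single py_run_file("my_script.py") call in R would run every line at once instead of one py_run_string() call per line.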

R Markdown lets you combine text, code, code results, and visualizations in a single document. You can create a new R Markdown document in RStudio by choosing File > New File > R Markdown.

Code chunks start with three backticks (```) and end with three backticks, and they have a gray background by default in RStudio.

This first chunk is for R code, as the r after the opening brace shows. It loads the reticulate package and then specifies the version of Python you want to use. (If you don’t specify one, it’ll use your system default.)

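```{r setup, include=FALSE, echo=TRUE}
library(reticulate)
# use_python("/path/to/python")  # placeholder path; uncomment and point it at the Python you want
```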

This second chunk below is for Python code. You can type the Python like you would in a Python file. The code below imports NumPy, creates an array, and prints the array.

import numpy as np
my_python_array = np.array([2,4,6,8])
for item in my_python_array:
    print(item)

Here’s the cool part: You can use that array in R by referring to it as py$my_python_array (in general, py$objectname).

In this next code chunk, I store that Python array in an R variable called my_r_array. And then I check the class of that array.

my_r_array <- py$my_python_array
class(my_r_array)

It’s class “array,” which isn’t exactly what you’d expect for an R object like this. But I can turn it into a regular vector with as.vector(my_r_array) and run whatever R operations I’d like on it, such as multiplying each item by 2.

my_r_vector <- as.vector(py$my_python_array)
my_r_vector <- my_r_vector * 2

Next cool part: I can use that R variable back in Python as r.my_r_vector (more generally, r.variablename), such as

my_python_array2 = r.my_r_vector

Source: InfoWorld