RIDE – A New Data Science IDE for Python and R

The data science world is split into two parts: the (i)Python and the R community. Both groups offer a plethora of tools and libraries enriching our work-life as a data scientist.

Interestingly, many of the offerings are complementary, such that professional data scientists should know both environments to pick the right tool for the job. In many cases, it even makes sense to use Python and R together in the same project.

Sadly, today these two worlds don’t integrate very well, so we need to switch back and forth between different tools and environments.

Introducing RIDE

RIDE is a new development environment for data science. It aims to be the cockpit for professional data scientists working in multiple languages. By leveraging and extending the awesome JupyterLab, RIDE combines advanced tool support with the interactivity of Jupyter notebooks.

Interactivity

We are firm believers in instant feedback and quick development turnarounds. RIDE provides feedback to the user all the time. This starts with features like intellisense and diagnostics that update as you type and goes much further.

(Inline) Sourcing

In code editors, you can send the code on the current line to the active session for evaluation. RIDE’s flexible layouts allow you to see editors and consoles side by side.

If you do not want to spend screen real estate on a console, you can enable “inline sourcing”. Doing so renders the results right below the statement in the editor.
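To make the idea of inline sourcing concrete, here is a minimal sketch in Python. It is not RIDE’s implementation: the function name `inline_source` is hypothetical, and the sketch assumes that "the result of a statement" means the value of a top-level expression, keyed by the line on which that expression ends.

```python
import ast


def inline_source(source: str) -> dict:
    """Run top-level statements in order; map end line -> repr of the result.

    Hypothetical helper for illustration only, not part of RIDE.
    """
    namespace = {}
    results = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.Expr):
            # Expression statement: evaluate it and keep the value for display.
            code = compile(ast.Expression(node.value), "<inline>", "eval")
            value = eval(code, namespace)
            if value is not None:
                results[node.end_lineno] = repr(value)
        else:
            # Any other statement is executed only for its side effects.
            code = compile(ast.Module([node], type_ignores=[]), "<inline>", "exec")
            exec(code, namespace)
    return results


if __name__ == "__main__":
    demo = "x = 2\nx * 21\n"
    for line, text in inline_source(demo).items():
        print(f"line {line}: {text}")
```

An editor could then draw each recorded value directly under the corresponding line instead of printing it to a console.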

Furthermore, you can source the whole file explicitly or automatically on save.

Notebooks, RMarkdown, Shiny and More

Jupyter Notebooks are natively supported because RIDE directly leverages the new awesome JupyterLab. Additionally, RIDE can handle other presentation formats very well. For instance, RMarkdown can be auto-processed on save.

Of course, you can use the “inline sourcing” feature in *.rmd files as well.

Furthermore, users can start, stop, and refresh Shiny apps from within an editor containing an app.

Language Support

A data model ultimately is a piece of software written in one or more programming languages. Such projects can quickly grow and become complex, such that traditional software engineering practices like testing and debugging become necessary. Why should data scientists not get the same kind of advanced language and tool support that regular software developers have today?

Language support in RIDE goes beyond what the usual data science tools such as RStudio or StatET offer today. We get our inspiration from the best tools in software development, like JetBrains’ IntelliJ IDEA or the Java support in Eclipse.

RIDE supports the new “Language Server Protocol” (LSP), through which you get many useful features in code editors, consoles, and notebooks. These features include:

  • intellisense
  • diagnostics
  • navigation
  • hovers
  • and many more
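For readers unfamiliar with LSP, the following sketch shows what a hover request looks like on the wire: a JSON-RPC message preceded by a `Content-Length` header. The method name `textDocument/hover` and the zero-based `position` come from the LSP specification; the helper names and the file URI are assumptions made for this example and say nothing about how RIDE talks to its servers internally.

```python
import json


def hover_request(request_id: int, uri: str, line: int, character: int) -> dict:
    """Build a JSON-RPC request asking for hover information at a position."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/hover",
        "params": {
            "textDocument": {"uri": uri},
            # LSP positions are zero-based for both line and character.
            "position": {"line": line, "character": character},
        },
    }


def frame_lsp_message(payload: dict) -> bytes:
    """Prefix the JSON body with the Content-Length header LSP requires."""
    body = json.dumps(payload).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


if __name__ == "__main__":
    # Hypothetical file URI, used only for illustration.
    message = frame_lsp_message(hover_request(1, "file:///workspace/analysis.R", 3, 5))
    print(message.decode("utf-8"))
```

Because the protocol is editor-agnostic, the same request shape can be sent from an editor, a console, or a notebook cell.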

Supporting such features in a dynamically typed language such as Python or R is, of course, a bit more challenging than in a statically typed language such as TypeScript, Scala, or Java. Traditionally, such tool support is provided by a compiler that parses the source code and collects information through type systems and static analysis. For R, we have implemented such a language server, and we additionally combine it with information from a running kernel.
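The combination of static analysis and live kernel information can be illustrated with a small Python sketch. It is an analogy only: RIDE’s language server targets R, and the names `static_names` and `complete`, as well as the plain dictionary standing in for the kernel’s namespace, are assumptions made for this example.

```python
import ast


def static_names(source: str) -> set:
    """Collect names bound in the source by parsing it, without running it."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Half-typed code may not parse; fall back to no static information.
        return set()
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            names.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                names.add(alias.asname or alias.name.split(".")[0])
    return names


def complete(prefix: str, source: str, kernel_namespace: dict) -> list:
    """Merge statically known names with names alive in the running session."""
    candidates = static_names(source) | set(kernel_namespace)
    return sorted(name for name in candidates if name.startswith(prefix))


if __name__ == "__main__":
    editor_text = "import pandas as pd\ndata_frame = None\n"
    session = {"data_loaded": True}  # stands in for a kernel's namespace
    print(complete("da", editor_text, session))
```

The static pass still works when nothing has been executed yet, while the session lookup covers names that only exist at runtime, which is the gap a purely static tool leaves open in a dynamic language.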

Debugging

When working in a kernel session, the user wants to see what values are available. RIDE offers an environment view where you can navigate through the current scope and inspect any values.
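As a rough illustration of what an environment view needs from a session, the sketch below summarizes a namespace as rows of name, type, and a shortened preview. The function `describe_environment` and its `max_repr` parameter are hypothetical. Note that a real implementation would want to avoid calling `repr` on very large objects, which this sketch does not attempt.

```python
def describe_environment(namespace: dict, max_repr: int = 40) -> list:
    """Return (name, type name, preview) rows for the visible variables."""
    rows = []
    for name in sorted(namespace):
        if name.startswith("_"):
            # Skip private and internal names to keep the view readable.
            continue
        value = namespace[name]
        preview = repr(value)
        if len(preview) > max_repr:
            preview = preview[: max_repr - 3] + "..."
        rows.append((name, type(value).__name__, preview))
    return rows


if __name__ == "__main__":
    scope = {"n": 3, "label": "train", "xs": list(range(100))}
    for row in describe_environment(scope):
        print(row)
```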

Furthermore, you can debug your code by setting breakpoints and stepping through it. Again, RIDE’s support for that is not tied to R but is based on a generic kernel extension, and it will soon be available for Python and other languages, too.

Please see this article for a more detailed description and comparison of RIDE’s debug support.

Data Viewer

Data science is all about data. So of course, we need to be able to have a glimpse at it now and then. The challenge here is that we often process large amounts of data, and we need to be careful not to inflate the memory footprint unnecessarily. Since RIDE is a cloud service, you could scale up your workspace if needed, but we still cannot send all data over the network. Moreover, all the available memory should be used in meaningful ways and not wasted carelessly.

Therefore, RIDE’s data viewer will only fetch the data it needs to present. In fact, it even allows looking at an infinite stream of data. Since the data viewer directly connects to the kernel through a protocol extension, no unnecessary copying of data happens.
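The principle of fetching only what is shown can be sketched in a few lines of Python. The class `WindowedViewer` is hypothetical and does not reflect RIDE’s protocol extension; it simply pulls rows from an iterator when the requested window moves past what has already been seen, which is why it also works on an endless generator.

```python
from itertools import count, islice


class WindowedViewer:
    """Pull rows lazily from an iterable, only as far as a window requires."""

    def __init__(self, rows):
        self._rows = iter(rows)
        self._cache = []

    @property
    def rows_pulled(self) -> int:
        """Number of rows taken from the source so far."""
        return len(self._cache)

    def window(self, start: int, size: int) -> list:
        """Return rows [start, start + size), pulling more only if needed."""
        missing = start + size - len(self._cache)
        if missing > 0:
            self._cache.extend(islice(self._rows, missing))
        return self._cache[start:start + size]


if __name__ == "__main__":
    # An infinite stream of squares; only the rows that are viewed get computed.
    viewer = WindowedViewer(n * n for n in count())
    print(viewer.window(0, 3))
    print(viewer.rows_pulled)
```

This sketch keeps every row it has pulled so that scrolling back is cheap. A viewer that cares more about memory than about scroll-back would evict old rows or ask the source to re-serve a range instead.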

Next Steps

Today, RIDE provides all these features for R already. For Python, some features, such as debugging, are still in development and will follow soon. Furthermore, we are going to implement a SQL kernel using the same protocols.

We will also keep working closely with the awesome JupyterLab team, who were not only very open to our (sometimes quite extensive) pull requests but also super friendly and supportive in all kinds of ways. We are currently looking into making even more of our work open source. For instance, it would make sense to open up the kernel extensions we have defined, to allow third parties to use them as well.

If you are interested in trying out RIDE, you can create a free account and start using it now.

 

Originally posted by Sven Efftinge, VP Technology at r-brain  r-brain.io

RBrain

R-Brain provides an integrated cloud/on-premises data science platform for developing models with popular open source languages. Powered by Jupyter, our IDE, console, notebook, and markdown are all integrated into one environment with full language support for R and Python. The R-Brain editor is built with Monaco, the heart of VS Code. With Docker technology and prebuilt images, R-Brain empowers data scientists with quick setup, instant collaboration, and version control at the workspace level. Our mission is to make model development an enjoyable and victorious experience by providing a computational environment that delivers utmost flexibility, integration, and reliability.
