Red Hat OpenShift Data Science

Red Hat OpenShift Data Science is a part of the Red Hat OpenShift AI portfolio and provides tools across the AI/ML lifecycle.

Build smarter with OpenShift Data Science

OpenShift Data Science is a cloud service that gives data scientists and developers a powerful AI/ML platform for building intelligent applications. Data scientists and developers can collaborate to quickly move from experiment to production in a consistent environment.

Available as an add-on cloud service to Red Hat OpenShift Dedicated and Red Hat OpenShift Service on AWS or as a self-managed software product, OpenShift Data Science allows data scientists to quickly develop, train, and test machine learning models using the JupyterLab interface. After models are developed, data scientists can use GitHub integration to trigger updated builds of OpenShift applications created by developers.  

4 reasons you'll love using Red Hat OpenShift Data Science

Red Hat OpenShift Data Science is a managed cloud service built from a curated set of components where data scientists can develop, train, and test their machine learning (ML) workloads and then deploy results in a container-ready format.

Take your development environments to the cloud and build better projects.

Accelerate Data Science

Enabling rapid experimentation and model development on OpenShift allows developers to integrate models into their workflows with fewer obstacles.

  Tested, supported AI/ML tooling

Red Hat OpenShift Data Science provides Jupyter notebooks in the JupyterLab interface, with several out-of-the-box notebook images that include common libraries and packages such as TensorFlow, PyTorch, scikit-learn, pandas, and NumPy. Red Hat regularly updates the package versions in these images, making it easy for you to get started and stay up to date.
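
To see which of these libraries a given notebook image actually ships, you can query the Python environment from a notebook cell. The following is a minimal sketch, not an official Red Hat utility: the helper name `installed_versions` and the package list are illustrative assumptions, and the list uses the PyPI distribution names (for example, `torch` for PyTorch). It relies only on the standard library's `importlib.metadata`.

```python
from importlib import metadata

# PyPI distribution names for the libraries mentioned above (assumed list;
# the contents of a real notebook image may differ).
PACKAGES = ["tensorflow", "torch", "scikit-learn", "pandas", "numpy"]


def installed_versions(packages):
    """Return a dict mapping each package name to its installed version,
    or to None when the package is not installed in this environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in each notebook image gives a quick inventory before you start a project, which helps when comparing images or checking whether an update changed a package version.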

  Operationalize AI/ML models

Drawing on Red Hat's expertise in Kubernetes-based application platforms, OpenShift Data Science helps developers bring experimental models to production faster, with fewer obstacles. Using core OpenShift technologies such as Source-to-Image (S2I), you can host models on the OpenShift Dedicated platform for easy testing or export them in a containerized format for use in on-premises, edge, or cloud applications.

  Fully managed, efficient cloud service

Red Hat manages the operations infrastructure, which includes not only the OpenShift Data Science service but also the full Kubernetes stack. Data scientists and developers can collaborate to develop, train, and deploy models in a common, trusted environment. You can focus on application development and innovation rather than managing tickets and chasing down unsupported tools and patches.

  Choose your technology partners

Choose from a broad range of validated partner data science and ML tools, available in Red Hat Marketplace. Software and SaaS-based offerings from Starburst, Anaconda, IBM Watson, Intel, and Seldon are integrated directly into the service offering, and dozens of other partner offerings are also available.

OpenShift Data Science is open source

Red Hat’s product development cycle has always been rooted in open source and in the communities that help steer the direction of Red Hat’s products. Just as Fedora is the upstream project for Red Hat Enterprise Linux, the projects listed here are the upstream versions of the components that make up Red Hat OpenShift Data Science.

Open Data Hub

Red Hat OpenShift Data Science is based on the upstream project Open Data Hub, which is a blueprint for building an AI-as-a-service platform on Red Hat's Kubernetes-based OpenShift Container Platform. Open Data Hub is a meta-project that integrates over 20 open source AI/ML projects into a practical solution. Red Hat OpenShift Data Science provides a subset of the tools offered in Open Data Hub, delivered as a supported, managed cloud service.

Jupyter

Project Jupyter grew out of the IPython Project in 2014, as it evolved to support interactive data science and scientific computing across all programming languages. Jupyter is a community of data enthusiasts who believe in the power of open tools and standards for education, research, and data analytics.

TensorFlow

TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state of the art in ML and lets developers easily build and deploy ML-powered applications.

PyTorch

PyTorch is an open source machine learning framework that accelerates the path from research prototyping to production deployment. It is used for applications such as computer vision and natural language processing.

Scikit-learn

Scikit-learn is a machine learning library for Python. It is built on NumPy, SciPy, and Matplotlib and offers simple and efficient tools for predictive data analysis.