Course Outline

Introduction

  • Overview of Horovod features and concepts
  • Understanding the supported frameworks

Installing and Configuring Horovod

  • Preparing the hosting environment
  • Building Horovod for TensorFlow, Keras, PyTorch, and Apache MXNet
  • Running Horovod

Running Distributed Training

  • Modifying and running training examples with TensorFlow
  • Modifying and running training examples with Keras
  • Modifying and running training examples with PyTorch
  • Modifying and running training examples with Apache MXNet

Optimizing Distributed Training Processes

  • Running concurrent operations on multiple GPUs
  • Tuning hyperparameters
  • Enabling performance autotuning

Troubleshooting

Summary and Conclusion

Requirements

  • An understanding of machine learning, specifically deep learning
  • Familiarity with machine learning libraries (TensorFlow, Keras, PyTorch, Apache MXNet)
  • Python programming experience

Audience

  • Developers
  • Data scientists

Duration

  • 7 hours
