Andreas' Blog

Adventures of a software engineer/architect

Automated dev workflow for using Data Science VM on Azure

2018-03-22 11 min read anoff
tl;dr: I put together a set of scripts on GitHub that let you deploy a VM from your command line and sync code from your local directory to the VM, so you can use your local IDE and git while executing on the powerful remote machine. Perfect for data science applications based around Jupyter notebooks. In my previous blog post I explained how to deploy an Azure Data Science Virtual Machine with Terraform.

Deploy Datascience infrastructure on Azure using Terraform

2018-01-23 6 min read anoff
In this article I will talk about my experience building my first infrastructure deployment with Terraform that does (a little) more than combine off-the-shelf resources.

The stack we will deploy 📦

Lately I’ve been looking at a lot of Microsoft Azure services in the big data area. I am looking for something to replace a Hadoop-based 🐘 data analytics environment consisting mainly of HDFS, Spark & Jupyter. The most obvious solution is an HDInsight cluster, which is basically a managed Hadoop that you can pick in different flavours.