Not familiar with the Microsoft Professional Program. It doesn't look like it is about deep learning, but a background in data science certainly wouldn't hurt. Speaking of Microsoft though, they do seem to have a good research group. Microsoft Research was behind the residual network architecture that is currently the state-of-the-art in deep learning.
Everything in my post I have first-hand experience with. I started out googling around to see what the best tool was for the task my company is interested in (can't give details due to NDA). From an outsider's perspective, TensorFlow looked like the way to go. For that reason, I took the Udacity deep learning course, which uses TensorFlow. I found cs231n thanks to a blog post from Karpathy.
I can't stress enough what a great resource cs231n is. It has you use Python with numpy rather than a full-blown deep learning framework, and it covers a lot of important material: relevant research, historical background, the relationship between deep learning techniques and the brain, lots of practical advice, etc. I don't have any book recommendations, and any books on the subject are surely already out of date, since new papers are constantly being published and major advancements made. Did I mention that this has only been a thing since 2012? :P
Regarding the relationship between programming and deep learning: you implement a deep neural network via code, just like any other programming task. There exist numerous frameworks that make this task easier, and they all have pros and cons. Almost everything uses Python, but one framework in particular (Torch) uses Lua. You could of course implement this stuff in any language you like; you'll just have more work ahead of you if you want to build from scratch in C or something.
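To make "a neural network is just code" concrete, here's a toy sketch of my own (not taken from any framework, shapes made up for illustration): a forward pass through a tiny two-layer network in plain numpy.

```python
import numpy as np

# Toy two-layer network forward pass. All sizes are arbitrary.
rng = np.random.default_rng(0)

x = rng.standard_normal(4)          # 4 input features
W1 = rng.standard_normal((8, 4))    # first layer weights
W2 = rng.standard_normal((3, 8))    # second layer weights

hidden = np.maximum(0, W1 @ x)      # linear transform + ReLU nonlinearity
scores = W2 @ hidden                # one score per output class

print(scores.shape)                 # (3,)
```

That's the whole forward pass; frameworks mostly add automatic gradients, GPU support, and prebuilt layers on top of this kind of thing.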
That said, the concepts are largely pretty simple, and at least in Python the amount of code you actually need to do really interesting things is very small. You can be training an existing model via transfer learning on a new dataset and getting impressive results with very little code.
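As a rough illustration of the transfer-learning recipe (everything here is made up for the sketch: the "pretrained" network is just a fixed random projection, and the data is synthetic), the idea is to freeze the feature extractor and train only a small new classifier on top of its features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen pretrained network: a fixed projection + ReLU.
W_frozen = rng.standard_normal((16, 2))

def extract_features(x):
    return np.maximum(0, x @ W_frozen.T)  # frozen weights, never updated

# Tiny synthetic dataset: two well-separated blobs.
X = np.vstack([rng.standard_normal((50, 2)) + 2,
               rng.standard_normal((50, 2)) - 2])
y = np.array([1] * 50 + [0] * 50)

F = extract_features(X)  # (100, 16) features from the frozen extractor

# Train only the new head (logistic regression) with gradient descent.
w, b = np.zeros(16), 0.0
for _ in range(200):
    p = 1 / (1 + np.exp(-(F @ w + b)))  # predicted probabilities
    grad = p - y                        # gradient of the cross-entropy loss
    w -= 0.1 * F.T @ grad / len(y)
    b -= 0.1 * grad.mean()

acc = ((F @ w + b > 0) == (y == 1)).mean()
print(round(acc, 2))
```

With a real pretrained convnet instead of the random projection, the recipe is the same: only the new head's parameters get updated.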
One thing you'll find in this field is that people like to use a lot of big fancy terms to sound smart, when what they mean is quite simple. For example, you'll see the term ReLU (Rectified Linear Unit) thrown around a lot. ReLU literally just means max(0, x): a trivial clamp.
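In numpy, that clamp really is one line (my own throwaway example):

```python
import numpy as np

# ReLU: clamp negative values to zero, pass positive values through.
def relu(x):
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 3.0])))  # [0. 0. 0. 3.]
```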