Not familiar with the Microsoft Professional Program. It doesn't look like it is about deep learning, but a background in data science certainly wouldn't hurt. Speaking of Microsoft though, they do seem to have a good research group. Microsoft Research was behind the residual network architecture that is currently the state-of-the-art in deep learning.
Everything in my post I have first-hand experience with. I started out googling around to see what the best tool was for the task my company is interested in (can't give details due to NDA). From an outsider's perspective, TensorFlow looked like the way to go. For that reason, I took the Udacity deep learning course, which uses TensorFlow. I found cs231n thanks to a blog post from Karpathy.
I can't stress enough what a great resource cs231n is. It has you use Python with numpy rather than a full-blown deep learning framework, and it covers a lot of important material: relevant research, historical background, the relationship between deep learning techniques and the brain, lots of practical advice, etc. I don't have any book recommendations, and any books on the subject are surely already out of date, since new papers are constantly being published and major advancements made. Did I mention that this has only been a thing since 2012? :P
Regarding the relationship between programming and deep learning: you implement a deep neural network via code, just like any other programming task. There exist numerous frameworks that make this task easier, and they all have pros and cons. Almost everything uses Python, but one framework in particular (Torch) uses Lua. You could of course implement this stuff in any language you like; you'll just have more work ahead of you if you want to build from scratch in C or something.
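To make "a neural network is just code" concrete, here's a toy sketch of my own (not taken from any framework, shapes made up for illustration): a forward pass through a tiny two-layer network in plain numpy.

```python
import numpy as np

# Toy two-layer network forward pass. All sizes are arbitrary.
rng = np.random.default_rng(0)

x = rng.standard_normal(4)          # 4 input features
W1 = rng.standard_normal((8, 4))    # first layer weights
W2 = rng.standard_normal((3, 8))    # second layer weights

hidden = np.maximum(0, W1 @ x)      # linear transform + ReLU nonlinearity
scores = W2 @ hidden                # one score per output class

print(scores.shape)                 # (3,)
```

That's the whole forward pass; frameworks mostly add automatic gradients, GPU support, and prebuilt layers on top of this kind of thing.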
That said, the concepts are largely pretty simple, and at least in Python the amount of code you actually need to do really interesting things is very small. You can be training an existing model via transfer learning on a new dataset and getting impressive results with very little code.
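As a rough illustration of the transfer-learning recipe (everything here is made up for the sketch: the "pretrained" network is just a fixed random projection, and the data is synthetic), the idea is to freeze the feature extractor and train only a small new classifier on top of its features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen pretrained network: a fixed projection + ReLU.
W_frozen = rng.standard_normal((16, 2))

def extract_features(x):
    return np.maximum(0, x @ W_frozen.T)  # frozen weights, never updated

# Tiny synthetic dataset: two well-separated blobs.
X = np.vstack([rng.standard_normal((50, 2)) + 2,
               rng.standard_normal((50, 2)) - 2])
y = np.array([1] * 50 + [0] * 50)

F = extract_features(X)  # (100, 16) features from the frozen extractor

# Train only the new head (logistic regression) with gradient descent.
w, b = np.zeros(16), 0.0
for _ in range(200):
    p = 1 / (1 + np.exp(-(F @ w + b)))  # predicted probabilities
    grad = p - y                        # gradient of the cross-entropy loss
    w -= 0.1 * F.T @ grad / len(y)
    b -= 0.1 * grad.mean()

acc = ((F @ w + b > 0) == (y == 1)).mean()
print(round(acc, 2))
```

With a real pretrained convnet instead of the random projection, the recipe is the same: only the new head's parameters get updated.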
One thing you'll find in this field is that people like to use a lot of big fancy terms to sound smart, when what they mean is quite simple. For example, you'll see the term ReLU (Rectified Linear Unit) thrown around a lot. ReLU literally just means max(0, x): a trivial clamp.
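In numpy, that clamp really is one line (my own throwaway example):

```python
import numpy as np

# ReLU: clamp negative values to zero, pass positive values through.
def relu(x):
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 3.0])))  # [0. 0. 0. 3.]
```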