As Deep Learning becomes more and more popular, there is an ongoing debate about whether it’s possible to create Deep Learning applications under a Free Software license. See for example this discussion on the debian-devel mailing list.
The argument we often see is that:
- It’s impossible to study the inner workings of Deep Learning software (for example, an image classifier or a text generator) or to improve it, because one cannot understand how it’s going to make predictions only by looking at the weights of the Deep Learning model
- Training a Deep Learning model requires specialized, expensive hardware that runs non-Free software
But the first statement misses the point of Deep Learning programs. We should not treat Deep Learning programs like “regular” ones. A regular program contains a set of tasks the computer has to perform, and the human who wrote it knows how those tasks should be completed. This is not true for Deep Learning. The software is not the set of actions that solve the problem; it is the set of instructions used to learn how to solve it. So the Deep Learning program is not the knowledge (the weights) used to perform the task; it is the way to guide the computer to that knowledge. In a way, this is similar to compiling a large program to assembly. The compilation output is hard to read and edit, but the program itself can easily be studied and analyzed. The same goes for Deep Learning if we consider the model weights as the compilation output: they are not meant to be edited by hand.
Some problems can be solved better by computers if we explain to them how to learn, because they can take into account far more parameters than we can.
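To make the compilation analogy a bit more concrete, here is a minimal sketch in plain Python. It is a toy linear model, not a real Deep Learning program, and the names `train`, `predict`, and `samples` are made up for this illustration. The point is only that the training procedure is short, readable code that anyone can study and modify, while its output is a pair of bare numbers that say little about how they were obtained.

```python
# Hypothetical toy example: learn y = 2x + 1 from samples with gradient descent.
# The "source code" is the training procedure; the "compiled output" is the weights.

def train(samples, epochs=2000, lr=0.05):
    """Readable instructions describing how to learn from data."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(weights, x):
    """Apply the learned weights to a new input."""
    w, b = weights
    return w * x + b


# Training data generated from the rule we want the program to discover.
samples = [(x, 2 * x + 1) for x in range(-3, 4)]

weights = train(samples)
print(weights)               # the opaque "compilation output"
print(predict(weights, 10))  # using the learned knowledge
```

Someone who wants to improve this program would change `train` or the data and run it again, rather than editing the printed numbers by hand.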
The argument that Deep Learning software can’t be Free because users can’t improve it is becoming less and less true as technical improvements continue. For example, the folks at fast.ai managed to train a model on ImageNet (a very large dataset) in 18 minutes, for about $40. That was done on the Amazon cloud. It’s still hard to reproduce this at home, but it’s easier to train Deep Learning models than ever, and at some point it will be possible to achieve good results on common hardware. Another working solution that enables people to build complex models on a lot of data is crowdsourcing the computation. For example, the Leela Zero project reimplemented the AlphaGo Zero paper, released it as Free Software, and created an infrastructure that distributes the training across many clients. On common hardware, training the model from AlphaGo Zero would take 1700 years, but by sharing the computation, they managed to vastly speed up the process and get significant results.
The major problem with Free Software and Deep Learning today lies more in the frameworks and libraries available to talk to the graphics card. The most widely used one is the CUDA toolkit, which is non-free software developed by Nvidia. We need to address that quickly.