The effort to define what makes software Free (as in freedom) has been under way since at least the beginning of the Free Software movement in the 1980s. It led to the Free Software licenses, which help users control the technology they use. However, given the peculiarities of Artificial Intelligence (AI) software, one may wonder whether those licenses account for them.
Free Software licenses were designed to let users control technology and to facilitate collaboration among them. Software released under a Free Software license guarantees that users can use, study, share and improve it however they want, with anybody they want. Once someone has the source code and the accompanying license(s), they can run the software, because most software runs on commodity hardware. However, this is not true for AI, and in particular for deep learning, the branch of AI powering most of the recent successful AI technologies.
In Artificial Intelligence, deep learning is a branch of machine learning and usually involves five elements: data; a model and its parameters; the definition of a problem (in the form of a loss function), which ties the data and the model together; a training phase; and an inference phase. The goal of the training (learning) phase is to modify the model’s parameters so that the model gets incrementally better at solving the problem, i.e. at minimizing the loss function. Once the loss stops decreasing, the model is no longer learning and the parameters stop changing. Using those parameters, one can then make predictions on data not used during training: this is the inference phase. In deep learning, those parameters are the weights of the interconnected neurons that form an artificial neural network.
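To make these five elements concrete, here is a minimal sketch in plain Python. It is a toy, not a real deep learning system: the "network" is a single neuron with one weight and one bias, and the data, learning rate and number of steps are all illustrative assumptions. It only shows how the data, the model, the loss, training and inference fit together.

```python
# Toy illustration of the five elements, using a one-neuron "network".
# Every name and number below is an illustrative assumption.

# 1. Data: inputs x paired with targets y, generated here from y = 2x + 1.
data = [(x, 2.0 * x + 1.0) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]

# 2. A model and its parameters: one weight and one bias, both starting at 0.
params = {"w": 0.0, "b": 0.0}

def model(x, params):
    return params["w"] * x + params["b"]

# 3. The problem definition: a loss function (mean squared error)
#    that ties the data and the model together.
def loss(params, data):
    return sum((model(x, params) - y) ** 2 for x, y in data) / len(data)

# 4. Training: repeatedly nudge the parameters against the gradient of the
#    loss, so that the loss decreases step by step (gradient descent).
def train(params, data, learning_rate=0.05, steps=2000):
    params = dict(params)  # work on a copy, leave the initial values intact
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (model(x, params) - y) * x for x, y in data) / n
        grad_b = sum(2 * (model(x, params) - y) for x, y in data) / n
        params["w"] -= learning_rate * grad_w
        params["b"] -= learning_rate * grad_b
    return params

trained = train(params, data)

# 5. Inference: use the trained parameters on an input not seen in training.
prediction = model(10.0, trained)
print(trained, prediction)
```

Because the toy data follows y = 2x + 1 exactly, the trained weight and bias should end up very close to 2 and 1, and the prediction for x = 10 close to 21. A real neural network has the same structure, only with vastly more parameters and data.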
But here is the problem: the number of parameters used for deep learning is enormous and keeps increasing. Likewise, the amount of data has grown to the point where using deep learning on commodity hardware is no longer possible. This raises the question of what would make an AI truly Free: what is the point of an AI published as Free Software if most users cannot exercise the four freedoms granted by the existing Free Software definition and licenses? Even if users can access the data and the code used for training, they would not be able to train the AI, improve it and share the results. Those who can afford to train the AI (that is, to modify the weights of deep learning models) are in a very powerful position compared to those who cannot. An AI being Free Software therefore does not necessarily guarantee that users stay in control of technology. What would be required to make an AI Free Software in the sense that it allows users to control it?
A truly Free AI would need to be easy for its users to train. This requires the trained model’s parameters to be easily accessible, so that they can be used as a starting point for further training rather than adjusting the parameters from scratch (usually from randomly initialized values). Deep learning weights should thus be Free Software. The number of parameters and the amount of data required to improve the AI software would also need to be manageable. If the data and its precise description cannot be shared, the use of Open Standards would facilitate the creation of alternative datasets.
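The value of accessible weights can be sketched with a toy example in plain Python. Everything here is a hypothetical stand-in: the "published" weights, the user's slightly different task and the budget of five training steps are assumptions chosen for illustration. The sketch compares the loss reached under the same small budget when starting from random parameters and when starting from the published ones.

```python
import random

def predict(x, w, b):
    """The model: a single artificial neuron with weight w and bias b."""
    return w * x + b

def mse(w, b, data):
    """The loss: mean squared error over the dataset."""
    return sum((predict(x, w, b) - y) ** 2 for x, y in data) / len(data)

def train(w, b, data, steps, learning_rate=0.1):
    """Run a fixed number of gradient-descent steps starting from (w, b)."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (predict(x, w, b) - y) * x for x, y in data) / n
        grad_b = sum(2 * (predict(x, w, b) - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical published weights, trained by someone else on y = 2x + 1.
published = {"w": 2.0, "b": 1.0}

# The user's own, slightly different task: y = 2.1x + 0.9.
user_data = [(x, 2.1 * x + 0.9) for x in [-2.0, -1.0, 0.0, 1.0, 2.0]]

budget = 5  # assumed: the few training steps the user can afford

# Option A: train from scratch, starting from randomly initialized parameters.
rng = random.Random(0)
scratch_w, scratch_b = train(rng.uniform(-1, 1), rng.uniform(-1, 1), user_data, budget)
scratch_loss = mse(scratch_w, scratch_b, user_data)

# Option B: start from the published weights and continue training.
tuned_w, tuned_b = train(published["w"], published["b"], user_data, budget)
finetuned_loss = mse(tuned_w, tuned_b, user_data)

print(f"loss after {budget} steps from scratch:           {scratch_loss:.6f}")
print(f"loss after {budget} steps from published weights: {finetuned_loss:.6f}")
```

In this toy setting the run that starts from the published weights reaches a much lower loss within the same budget, because those weights already sit close to a good solution for the user's task. For real models the gap in required compute is far larger, which is why access to trained weights matters so much for users with limited hardware.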
AI is not going away. Since the rise of deep learning in the last decade, triggered by the availability of more data, by improved methods for stabilizing and speeding up the training of deep neural networks, and by improved hardware, the use of AI has become more and more mainstream. And now that we are starting to understand how powerful the AI genie is, it cannot be put back in the bottle. This raises the question of how to stay in control of technology in a world where AI is bound to become more powerful and ubiquitous. Free Software is a key part of the answer.