As Artificial Intelligence solves increasingly hard problems, it is becoming more and more complex. This complexity leads to an often overlooked issue: the lack of transparency. This is problematic, because by taking answers at face value from an uninterpretable model (a black box), we’re trading transparency for accuracy. This is bad for a couple of reasons: Debugging. While it may be possible to figure out what’s wrong with a car just by hearing it squeal and whir, opening the hood and inspecting the engine is far more efficient.
A Recurrent Neural Network (RNN) often takes ordered sequences as input. Real-world sequences have different lengths, especially in Natural Language Processing (NLP), because not all words have the same number of characters and not all sentences have the same number of words. In PyTorch, the inputs of a neural network are often managed by a DataLoader. A DataLoader groups the inputs into batches. This is better for training a neural network because it’s faster and more efficient than sending the inputs to the network one by one.
As software is increasingly used to produce metrics and insights in critical areas of our societies, such as healthcare, crime recidivism risk assessment, job application review or loan approval, the question of algorithmic fairness is becoming more important than ever. Because algorithms learn from human-generated data, they often magnify human bias in decision making, which makes them prone to unfair judgments. For example, the Amazon CV review program was found to be unfair to women.
Wanting to brush up on my PyTorch skills, I’ve started to follow this tutorial. It explains how to create a deep learning model able to predict the origin of a name. At the end of the tutorial, there’s an invitation to try to improve the model, which I did. Note that the point of the tutorial is not to create the best-performing model but rather to demonstrate and explain PyTorch’s capabilities. Here’s a comparison between the model described in the tutorial and the one I’ve built.
There are a lot of guides explaining how to protect your online privacy, but none of them explain why they need to exist in the first place. They exist because privacy is underrated. We don’t value it enough. Here are the reasons. Threats to privacy are not obvious. Despite recent attempts to regulate online data processing (e.g. the GDPR in the EU) and recent privacy breaches, it’s still not clear why all of that threatens privacy.
Despite what the bad media are saying, computers don’t understand human language (yet). We need to turn sentences and words into a format that can be effectively manipulated by a Machine Learning or Deep Learning algorithm. This is called language modeling. Here I will explain several methods that can turn words into a meaningful representation. Integer encoding: This approach is the simplest. Once we have a list of the tokens composing the vocabulary, we associate each one with an integer.
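As a rough illustration of integer encoding, here is a minimal Python sketch. The function names `build_vocabulary` and `encode`, the whitespace tokenization and the example sentence are all assumptions made for this example; they don't come from any particular library.

```python
def build_vocabulary(tokens):
    """Map each distinct token to an integer, in order of first appearance."""
    vocabulary = {}
    for token in tokens:
        if token not in vocabulary:
            # The next free integer is simply the current vocabulary size.
            vocabulary[token] = len(vocabulary)
    return vocabulary


def encode(tokens, vocabulary):
    """Replace every token with the integer associated with it."""
    return [vocabulary[token] for token in tokens]


# Hypothetical example sentence, tokenized naively on whitespace.
tokens = "the cat sat on the mat".split()
vocabulary = build_vocabulary(tokens)
encoded = encode(tokens, vocabulary)
print(vocabulary)
print(encoded)
```

Note that both occurrences of "the" get the same integer, and that the integers themselves carry no meaning: they only identify a token.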
With its recent gain in popularity, “Artificial Intelligence” has become a label for a lot of things. But what is it anyway? According to Wikipedia, it’s “intelligence demonstrated by machines”, but does such a thing exist? At the time of writing, there are four main types of AI development algorithms. Expert systems are a category of computer programs specifically designed to perform a task using prior human knowledge. Software engineers work closely with a domain expert to build the program, which will act in a predictable way, as the domain expert would if he or she had the same processing power.
Stochastic Gradient Descent (SGD) is used in many Deep Learning models as an algorithm to optimize the parameters (the weights of each layer). Here is how it works: At each step in the training process, the goal is to update the weights towards the optimal value. For this, SGD uses the equation: $$new\;estimate = current\;estimate - (\nabla \times learning\;rate)$$ In this equation, the gradient ∇ indicates the direction towards the solution (above or below it) and how far we are from it.
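To make the update rule concrete, here is a minimal sketch in plain Python. It is not a full SGD implementation: it applies the equation above to a single weight on a toy function, f(w) = (w - 3)², with no random sampling of training examples. The toy function, the function names, the learning rate and the number of steps are all assumptions chosen for illustration.

```python
def gradient(w):
    """Gradient of the toy loss f(w) = (w - 3) ** 2, whose minimum is at w = 3."""
    return 2 * (w - 3)


def sgd_step(current_estimate, grad, learning_rate):
    """new estimate = current estimate - (gradient * learning rate)"""
    return current_estimate - grad * learning_rate


w = 0.0              # arbitrary starting value for the weight
learning_rate = 0.1  # hypothetical value, picked for this toy problem

for _ in range(100):
    w = sgd_step(w, gradient(w), learning_rate)

print(w)  # ends up very close to 3, the minimum of the toy loss
```

When w is below 3 the gradient is negative, so subtracting it moves w up; when w is above 3 the gradient is positive and w moves down. The further w is from 3, the larger the step.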
As Deep Learning is becoming more and more popular, there is an ongoing debate on whether it’s possible to create Deep Learning applications with a Free Software license. See for example this discussion on the debian-devel mailing list. The argument we often see is twofold. First, it’s impossible to study the inner workings of Deep Learning software (for example, an image classifier or a text generator) or improve it, because one cannot understand how it’s going to make predictions only by looking at the weights of the Deep Learning model. Second, training a Deep Learning model requires specialized and expensive hardware that runs non-Free software. But the first statement misses the point of Deep Learning programs.
If we have to choose between a convenient system and a secure one, we often pick the former rather than the latter. The reason is mainly psychological. Several scientific studies have shown that we prefer instant gratification over delayed gratification, because that’s how our brains are wired. We are surrounded by instant gratification: our day-to-day activities, such as our hobbies and our use of social media, have got us hooked on quick feedback.