Making computers more like the human brain is probably one of the major challenges facing us in the 21st century. We expect computers to talk, to comprehend, and to solve problems of all kinds. There is now a rising demand for computers that can see and identify images. After being blind for so long, our smartest computers can finally begin to see the world around them, and deep learning is what makes this advance possible.
First: Machine learning
Despite the science-fiction and sometimes creepy ring of the name, the concept of machine learning is actually quite easy to understand. It is all about training algorithms on huge databases so that they can predict results from the endless stream of new incoming data.
For instance, suppose we know the diameter of a tree and want to predict its age. The database for this task holds only three kinds of information: the input (x, the diameter of the tree), the output (y, the age of the tree) and the parameters (a and b, which absorb the effect of the tree’s type, its location in the forest and other such data). These can be linked by the simple linear function we all learned back in middle school (or maybe even earlier for some readers): y = ax + b. As training on the database proceeds, machine learning algorithms work out the relationship between x and y and settle on precise values for a and b. Once this initial training phase is complete, the computer can estimate the age of a tree (y) from any given diameter (x).
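To make the tree example concrete, here is a minimal sketch of that training step. The diameters and ages are made-up numbers chosen for illustration (they follow age = 2 × diameter + 5), and the function name predict_age is hypothetical. NumPy’s least-squares fit stands in for whatever training procedure a real system would use.

```python
import numpy as np

# Hypothetical training data: trunk diameters (cm) and known ages (years).
# These values are invented for illustration, not real measurements.
diameters = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
ages = np.array([25.0, 45.0, 65.0, 85.0, 105.0])

# "Training": find the a and b that best fit y = a*x + b (least squares).
a, b = np.polyfit(diameters, ages, deg=1)

def predict_age(diameter_cm):
    """Estimate a tree's age from its diameter using the learned a and b."""
    return a * diameter_cm + b

# After training, any new diameter can be turned into an age estimate.
print(round(predict_age(35.0), 1))
```

With real measurements the points would not sit exactly on a line, and the fit would give the a and b that come closest on average.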
To be honest, this description is quite simplistic, but it is the best place to start. Once we begin discussing machines and image recognition, we find ourselves in a whole new ballpark.
While our brain has learned to understand and process an image as a whole, a computer looking at a picture sees only millions of pixels. That is a huge amount of data to process, and far too many inputs to place inside a simple formula. Researchers needed a shortcut, and the first solution was to have the machine understand and analyze intermediary characteristics instead of raw pixels.
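A quick back-of-the-envelope calculation shows the scale of the problem. The image size below (1000 by 1000 pixels, three color channels) is an assumption picked for illustration, and count_inputs is a hypothetical helper.

```python
def count_inputs(width, height, channels=3):
    """Number of raw values a model receives if every pixel channel is one input."""
    return width * height * channels

# An assumed 1000 x 1000 color photo: one million pixels, three values each.
print(count_inputs(1000, 1000))  # 3000000
```

Compare that with the tree example, where the formula had a single input x.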
For example, imagine teaching your computer to recognize an image of a dog. A human first has to outline all the main characteristics of a dog: a pointy head and jaw, two ears that stand up or flop in different directions, a long nose and … Once the key features are defined, a neural network, if trained correctly and thoroughly, can analyze an image and determine with acceptable accuracy whether it shows a dog or not.
Here experts are needed to apply feature-extraction algorithms and feed the resulting data into a neural network, so that the computer can tell whether the image before it is actually a dog.
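The sketch below illustrates this hand-crafted approach in its simplest form. Everything in it is assumed for illustration: the three features (ear pointiness, snout length, has a tail), their values, and the function names. A single artificial neuron (a perceptron) stands in for a full neural network, and the feature values are typed in by hand where a real system would run feature-extraction algorithms on the image.

```python
import numpy as np

# Hypothetical hand-defined features per image, each scaled to 0..1:
# [ear_pointiness, snout_length, has_tail]
features = np.array([
    [0.9, 0.8, 1.0],  # dog
    [0.7, 0.9, 1.0],  # dog
    [0.8, 0.7, 1.0],  # dog
    [0.1, 0.2, 0.0],  # not a dog
    [0.2, 0.1, 1.0],  # not a dog
    [0.3, 0.3, 0.0],  # not a dog
])
labels = np.array([1, 1, 1, 0, 0, 0])  # 1 = dog, 0 = not a dog

weights = np.zeros(3)
bias = 0.0

def is_dog(x):
    """Classify one feature vector with the current weights and bias."""
    return bool(np.dot(weights, x) + bias > 0)

# Perceptron training: nudge the weights whenever an example is misclassified.
for _ in range(20):
    for x, y in zip(features, labels):
        error = y - int(is_dog(x))
        weights += error * x
        bias += error

print(is_dog(np.array([0.85, 0.75, 1.0])))  # a new dog-like feature vector
print(is_dog(np.array([0.2, 0.2, 0.0])))    # a new non-dog feature vector
```

The neuron only ever sees the three numbers a human chose to measure, which is exactly where this approach runs into trouble with harder images.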
Now, let’s take this one level higher with a more complicated image.
For example, how would one define a particular type of clothing, say a woman’s dress, to a computer?
Here we meet the first basic limitation of machine learning in image recognition: it is often very difficult for us as humans to define discriminating characteristics that would allow anything close to 100% recognition.
Deep learning: the key to seeing and learning without humans
To tackle this difficult task, experts went back to the drawing board with a very simple question: how do small children learn the names of objects? What enables them to tell an image of a dog from that of a woman’s dress? Parents do not teach this by listing characteristics. Instead, mothers and fathers name an object or an animal each and every time their child comes across it. They patiently train their children through visual examples. So, can we use the same concept to make computers learn? Why not?
Two obstacles remained, however: computing power and the availability of data, both limited in comparison with what scientists needed to realize their ambitions.
First, where can we find a database large enough to begin teaching computers the art of seeing? To solve this problem, experts turned to initiatives such as the ImageNet project, launched back in 2007 under Stanford AI/Vision Lab Director Fei-Fei Li. Two years later, in 2009, the project had created the largest image database in the world by collaborating with more than 50,000 people in 180 different countries. The database contained a whopping 15 million images, each named and classified, spanning 22,000 categories of all kinds.
Computers have now gained the ability to train themselves on enormous databases of images in order to define and extract key features, all without any human intervention. Much like the brain of a three-year-old child, the computer sees nothing but millions of named images and goes on to work out the main characteristics of each one, and most probably of other kinds of items in the near future. The algorithms behind this rely on neural networks to do their own feature extraction, and those networks need literally billions of nodes.
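The idea of learning straight from named examples can be shown on a toy scale. In the sketch below the model receives only raw pixel values and a label, with no human-defined features at all. The 2 by 2 "images", the two labels (bright on top versus bright on the bottom) and the function names are all invented for illustration, and a single-layer model trained by gradient descent stands in for the far deeper networks described above.

```python
import numpy as np

# Tiny hypothetical "images": 2x2 grayscale, flattened to
# [top-left, top-right, bottom-left, bottom-right], values 0..1.
images = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [0.9, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.2, 0.1],
    [0.0, 0.0, 1.0, 1.0],
    [0.1, 0.0, 0.9, 0.8],
    [0.2, 0.1, 1.0, 0.9],
])
labels = np.array([1, 1, 1, 0, 0, 0])  # 1 = "bright on top", 0 = "bright on bottom"

weights = np.zeros(4)
bias = 0.0

def predict(pixels):
    """Probability that an image belongs to the 'bright on top' class."""
    z = np.dot(pixels, weights) + bias
    return 1.0 / (1.0 + np.exp(-z))

# Gradient descent on the raw pixels: nobody tells the model which pixels matter.
for _ in range(500):
    errors = predict(images) - labels
    weights -= 0.5 * images.T @ errors / len(labels)
    bias -= 0.5 * errors.mean()

# Learned weights: positive for the top pixels, negative for the bottom ones.
print(np.round(weights, 2))
print(predict(np.array([1.0, 0.9, 0.0, 0.1])) > 0.5)  # unseen top-bright image
```

After training, the weights themselves show which pixels the model found useful, a small-scale version of a network discovering its own features from labeled pictures.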
All this is truly only the beginning of deep learning. The industry has managed to make computers analyze image data roughly as a three-year-old child does. However, as experts have pointed out, the very significant challenge ahead is to take computers from learning like a three-year-old to the capabilities of a young teenager and, of course, far beyond.
Keep in mind, too, that this piece has not even touched on the ethical aspects of this trending technology, or the sensitivities and even threats that come with the computers around us becoming more and more capable. Will we reach a day when a computer can disobey an order from its human master? All this points to a very complicated and also exciting future for deep learning.