Machine Learning: What You Need to Know
Machine learning is the application of artificial intelligence (AI) which enables systems to automatically learn and improve from experience and data supplied without being explicitly programmed. It focuses on the development of computer programs that can access data and use it learn for themselves.
Machine Learning is the scientific study of algorithms and statistical models that computer systems use to effectively perform a specific task by relying on patterns and inference rather than explicit instructions. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model of sample data, known as “training data”, in order to make predictions or decisions without being explicitly programmed to perform the task. Machine learning algorithms are used in a wide variety of applications, such as email filtering, and computer vision, where it is infeasible to develop an algorithm of specific instructions for performing the task. Machine learning is closely related to computational statistics, which focuses on making predictions using computers.
In its application across business problems, machine learning is also referred to as predictive analytics. This is because mathematical optimization delivers methods, theory and application domains to the field of machine learning. Also there is the field of machine learning which focuses on exploratory data analysis through unsupervised learning. This is refered to as data mining.
History and relationships to other fields
Arthur Samuel, an American pioneer in the field of computer gaming and artificial intelligence, coined the term “Machine Learning” in 1959 while at IBM. As a scientific endeavour, machine learning grew out of the quest for artificial intelligence.Already in the early days of AI as an academic discipline, some researchers were interested in having machines learn from data. They attempted to approach the problem with various symbolic methods, as well as what were then termed “neural networks”; these were mostly perceptrons and other models that were later found to be reinventions of the generalized linear models of statistics.
Probabilistic reasoning was also employed, especially in automated medical diagnosis.
Machine learning, reorganized as a separate field, started to flourish in the 1990s. The field changed its goal from achieving artificial intelligence to tackling solvable problems of a practical nature. It shifted focus away from the symbolic approaches it had inherited from AI, and toward methods and models borrowed from statistics and probability theory. It also benefited from the increasing availability of digitized information, and the ability to distribute it via the Internet.
Machine learning tasks
Machine learning tasks are classified into several broad categories
• supervised learning: the algorithm builds a mathematical model from a set of data that contains both the inputs and the desired outputs.
• In unsupervised learning, the algorithm builds a mathematical model from a set of data which contains only inputs and no desired output labels. Unsupervised learning algorithms are used to find structure in the data, like grouping or clustering of data points. Unsupervised learning can discover patterns in the data, and can group the inputs into categories, as in feature learning. Dimensionality reduction is the process of reducing the number of “features”, or inputs, in a set of data.
• Semi-supervised learning: algorithms develop mathematical models from incomplete training data, where a portion of the sample input doesn’t have labels
• data mining: Machine learning and data mining often employ the same methods and overlap significantly, but while machine learning focuses on prediction, based on known properties learned from the training data, data mining focuses on the discovery of (previously) unknown properties in the data (this is the analysis step of knowledge discovery in databases).
• optimization: Machine learning also has intimate ties to optimization: many learning problems are formulated as minimization of some loss function on a training set of examples. Loss functions express the discrepancy between the predictions of the model being trained and the actual problem instances
• statistics: Machine learning and statistics are closely related fields. According to Michael I. Jordan, the ideas of machine learning, from methodological principles to theoretical tools, have had a long pre-history in statistics. He also suggested the term data science as a placeholder to call the overall field.Leo Breiman distinguished two statistical modelling paradigms: data model and algorithmic model, wherein “algorithmic model” means more or less the machine learning algorithms like Random forest.
Some statisticians have adopted methods from machine leading to a combined field that they call statistical learning
Advantages of Machine learning
1. Easily identifies trends and patterns
Machine Learning can review large volumes of data and discover specific trends and patterns that would not be apparent to humans. For instance, for an e-commerce website like Amazon, it serves to understand the browsing behaviors and purchase histories of its users to help cater to the right products, deals, and reminders relevant to them. It uses the results to reveal relevant advertisements to them.
2. No human intervention needed (automation)
With ML, you don’t need to babysit your project every step of the way. Since it means giving machines the ability to learn, it lets them make predictions and also improve the algorithms on their own. A common example of this is anti-virus softwares; they learn to filter new threats as they are recognized. ML is also good at recognizing spam.
3. Continuous Improvement
As ML algorithms gain experience, they keep improving in accuracy and efficiency. This lets them make better decisions. Say you need to make a weather forecast model. As the amount of data you have keeps growing, your algorithms learn to make more accurate predictions faster.
4. Handling multi-dimensional and multi-variety data
Machine Learning algorithms are good at handling data that are multi-dimensional and multi-variety, and they can do this in dynamic or uncertain environments.
5. Wide Applications
You could be an e-tailer or a healthcare provider and make ML work for you. Where it does apply, it holds the capability to help deliver a much more personal experience to customers while also targeting the right customers.
Disadvantages of Machine Learning
With all those advantages to its powerfulness and popularity, Machine Learning isn’t perfect. The following factors serve to limit it:
1. Data Acquisition
Machine Learning requires massive data sets to train on, and these should be inclusive/unbiased, and of good quality. There can also be times where they must wait for new data to be generated.
2. Time and Resources
ML needs enough time to let the algorithms learn and develop enough to fulfill their purpose with a considerable amount of accuracy and relevancy. It also needs massive resources to function. This can mean additional requirements of computer power for you.
3. Interpretation of Results
Another major challenge is the ability to accurately interpret results generated by the algorithms. You must also carefully choose the algorithms for your purpose.
4. High error-susceptibility
Machine Learning is autonomous but highly susceptible to errors. Suppose you train an algorithm with data sets small enough to not be inclusive. You end up with biased predictions coming from a biased training set. This leads to irrelevant advertisements being displayed to customers. In the case of ML, such blunders can set off a chain of errors that can go undetected for long periods of time. And when they do get noticed, it takes quite some time to recognize the source of the issue, and even longer to correct it.