Machine Learning Resources
Machine Learning at a Glance
Machine learning (ML) is a subset of Artificial Intelligence (AI). It aims to tackle a specific set of problems. In traditional programming, we feed data and instructions/rules to the model. Here the model is nothing but a well-written program: it generates the output corresponding to the input data and the instructions.
In many applications, such as facial recognition, expert systems, and prediction, explicit rules cannot be written by hand. In such cases, the traditional strategy fails. ML instead takes data and the corresponding responses as input and generates the mapping rules.
This is not black magic, however: we must define mathematical procedures that build a relationship between data and output.
Once this mathematical model is trained, it can be deployed to perform the specific task for which it was trained.
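The contrast between hand-written rules and learned rules can be sketched in a few lines. The example below is only an illustration: the temperature-conversion task, the function names, and the sample values are assumptions chosen for this sketch, and ordinary least squares is just one simple way to recover a mapping from data and responses.

```python
def rule_based(celsius):
    # Traditional programming: a person writes the rule explicitly.
    return celsius * 9 / 5 + 32


def fit_line(xs, ys):
    # ML view: estimate slope and intercept from (data, response) pairs
    # using the closed-form ordinary least squares solution.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training set: inputs and their observed responses.
data = [-10.0, 0.0, 15.0, 30.0, 40.0]
responses = [rule_based(c) for c in data]  # stand-in for measured outputs

slope, intercept = fit_line(data, responses)


def learned_model(celsius):
    # "Deployment": apply the learned mapping to unseen input.
    return slope * celsius + intercept


print(slope, intercept)
print(learned_model(100.0))
```

The learned slope and intercept come out very close to 1.8 and 32, the values that were written by hand in rule_based, even though fit_line never saw that formula, only the data and the responses.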
Deep learning is a special class of ML algorithms based on deep neural networks. It is typically used to solve computer vision problems.
Classification of ML Algorithms
ML algorithms can be broadly classified into three categories:
- Supervised machine learning
- Unsupervised machine learning
- Reinforcement learning
Supervised algorithms require labeled data. By minimizing a cost function (the prediction error), the model learns the relationship between data and labels. The core tasks in supervised ML are classification and regression.
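As a minimal sketch of "learning by minimizing a cost function", the code below fits a straight line to labeled data with gradient descent on the mean squared error. The data, the learning rate, and the number of steps are assumptions picked for illustration; real projects would normally rely on an established library rather than a hand-written loop.

```python
def mse(w, b, xs, ys):
    # Cost function: mean squared prediction error of the line w * x + b.
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, lr=0.05, steps=2000):
    # Gradient descent: repeatedly move w and b against the cost gradient.
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical labeled data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

initial_cost = mse(0.0, 0.0, xs, ys)
w, b = train(xs, ys)
final_cost = mse(w, b, xs, ys)

print(w, b)
print(initial_cost, final_cost)
```

The cost drops from its starting value to nearly zero, and w and b approach 2 and 1, which is the relationship hidden in the labels.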
Classification predicts a discrete (categorical) value, whereas regression predicts a continuous (real) value. Predicting whether a student will pass or fail an exam based on previous performance is a classification problem. Predicting the percentage of students who will take the next exam based on previous results is a regression problem. Likewise, “Will it rain tomorrow?” is a classification problem, while “How much rain will fall tomorrow?” is a regression problem.
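The difference between the two tasks shows up in the type of the output. The sketch below uses a simple k-nearest-neighbours approach for both: a majority vote gives a discrete label, and an average gives a continuous value. The student scores, the labels, and the idea of regressing the next exam score are all hypothetical values invented for this illustration.

```python
from collections import Counter


def nearest(train_x, query, k):
    # Indices of the k training points closest to the query.
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return order[:k]


def knn_classify(train_x, labels, query, k=3):
    # Classification: majority vote over neighbour labels (discrete output).
    votes = Counter(labels[i] for i in nearest(train_x, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train_x, targets, query, k=3):
    # Regression: mean of neighbour targets (continuous output).
    idx = nearest(train_x, query, k)
    return sum(targets[i] for i in idx) / len(idx)


# Hypothetical data: previous exam score, pass/fail label, next exam score.
previous_scores = [35.0, 42.0, 48.0, 55.0, 63.0, 71.0, 88.0]
passed = ["fail", "fail", "fail", "pass", "pass", "pass", "pass"]
next_scores = [30.0, 45.0, 50.0, 58.0, 60.0, 75.0, 90.0]

label = knn_classify(previous_scores, passed, 60.0)
score = knn_regress(previous_scores, next_scores, 60.0)

print(label)  # a category: "pass" or "fail"
print(score)  # a real number
```

The same inputs and the same neighbour search serve both tasks; only the way the neighbours' responses are combined changes.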
Labels are absent in unsupervised learning. As a result, models are trained to find patterns in the input data by detecting similarities or other properties of the data. The core tasks in unsupervised ML are clustering and anomaly detection. Grouping students according to their interests is a clustering problem. Detecting fraudulent transactions in a transaction dataset is an anomaly detection problem.
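Both unsupervised tasks can be sketched without any labels. The code below groups one-dimensional values with a basic k-means loop and flags outliers with a z-score test. These are only two of many possible techniques, and the feature (weekly hours spent on an interest), the transaction amounts, the centroid initialization, and the threshold of 2.0 are assumptions made for this small example.

```python
import statistics


def kmeans_1d(values, k=2, iterations=20):
    """Group 1-D values into k clusters (k >= 2) with a basic k-means loop."""
    ordered = sorted(values)
    # Assumed initialization: spread starting centroids across the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            closest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[closest].append(v)
        # Update step: move each centroid to the mean of its members.
        centroids = [
            sum(members) / len(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters


def find_anomalies(values, threshold=2.0):
    # Flag values lying more than `threshold` standard deviations from the mean.
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []
    return [v for v in values if abs(v - mean) / spread > threshold]


# Hypothetical feature: hours per week each student spends on sports.
hours = [1.0, 2.0, 1.5, 9.0, 10.0, 8.5]
centroids, clusters = kmeans_1d(hours)

# Hypothetical transaction amounts with one unusually large payment.
amounts = [20.0, 25.0, 22.0, 19.0, 24.0, 21.0, 500.0]
suspicious = find_anomalies(amounts)

print(centroids, clusters)
print(suspicious)
```

No pass/fail or fraud labels are supplied anywhere: the groups and the flagged transaction emerge purely from how the values are distributed.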
Datasets for Machine Learning
Quality data is crucial for the success of ML models. A model performs poorly if the quality of the data is poor or if there is not enough training data. A few popular ML dataset repositories are listed here:
- UCI ML Repository
- OpenML
- Registry of Open Data on AWS
- Microsoft Research Open Data
- Awesome Public Datasets
- Best Public Datasets for ML
- Analytics Vidhya
- VisualData Discovery
- Lionbridge
- Dataset Search
- Dataquest