- Machine learning, whose roots go back to the 1950s, is a field in which a computer learns without being explicitly programmed. Instead, it ingests data and finds patterns in it using statistical techniques.
- An outlier is a data point that lies far outside the rest of the values in the dataset.
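One common way to flag such points is the 1.5 × IQR rule. Below is a minimal sketch using rough quartile positions; the data, the threshold, and the `find_outliers` helper are illustrative assumptions, not a reference implementation.

```python
# Illustrative sketch: flag values outside 1.5 * IQR of the middle 50%.
def find_outliers(values):
    """Return values lying far outside the interquartile range."""
    data = sorted(values)
    n = len(data)
    q1 = data[n // 4]          # rough lower quartile
    q3 = data[(3 * n) // 4]    # rough upper quartile
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lo or v > hi]

print(find_outliers([10, 12, 11, 13, 12, 95]))  # → [95]
```

Here 95 sits far outside the tight cluster of values around 10–13, so it is the only value flagged.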
- The standard deviation measures how spread out values are around the mean (formally, the square root of the average squared deviation from the mean).
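Python's standard library computes this directly; the small dataset below is an illustrative assumption.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]     # illustrative data
mean = statistics.mean(values)        # 5.0
sd = statistics.pstdev(values)        # population standard deviation
print(mean, sd)  # → 5.0 2.0
```

The squared deviations from the mean average out to 4, so the standard deviation is 2.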
- The normal distribution, which has a bell-shaped curve, describes how the values of many variables are spread symmetrically around their mean.
- Bayes’ theorem describes how to update the probability of a hypothesis when new evidence arrives, using conditional probabilities.
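A classic illustration is a medical test: even an accurate test for a rare condition yields mostly false positives. The numbers below (prevalence, sensitivity, false-positive rate) are assumed for illustration.

```python
# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B).
p_disease = 0.01            # prior: 1% of people have the condition
p_pos_given_disease = 0.95  # test sensitivity
p_pos_given_healthy = 0.05  # false-positive rate

# Total probability of a positive test result
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Posterior: probability of the condition given a positive test
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # → 0.161
```

Despite the test being 95% sensitive, a positive result only implies about a 16% chance of actually having the condition, because the prior is so low.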
- A true positive is when a model correctly predicts the positive class. A false positive, on the other hand, is when the model predicts the positive class even though the actual value is negative.
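These counts are easy to tally from predictions. A minimal sketch, where the labels are illustrative (1 = positive, 0 = negative):

```python
actual    = [1, 0, 1, 1, 0, 0]
predicted = [1, 1, 1, 0, 0, 0]

# True positive: actual 1 predicted 1; false positive: actual 0 predicted 1
tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
print(tp, fp)  # → 2 1
```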
- The Pearson correlation measures the strength and direction of the linear relationship between two variables; it ranges from -1 to 1.
- Feature extraction or feature engineering describes the process of selecting and transforming the variables used by a model. This is critical, since even one poorly chosen variable can have a major impact on the results.
- Training data is used to fit the relationships in an algorithm. The test data, held out from training, is used to evaluate the model on examples it has not seen.
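A minimal sketch of such a split; the 80/20 ratio is a common convention, assumed here for illustration.

```python
import random

data = list(range(100))       # illustrative dataset
random.seed(0)                # reproducible shuffle
random.shuffle(data)

split = int(len(data) * 0.8)  # 80% train / 20% test
train, test = data[:split], data[split:]
print(len(train), len(test))  # → 80 20
```

Shuffling before splitting matters: if the data is ordered (say, by date or by class), a naive slice would give the model a biased view.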
- Supervised learning uses labeled data to create a model, whereas unsupervised learning works with unlabeled data. There is also semi-supervised learning, which uses a mix of both approaches.
- Reinforcement learning is a way to train a model by rewarding actions that lead to good outcomes and penalizing those that do not.
- The k-Nearest Neighbors (k-NN) algorithm is based on the notion that data points close to one another tend to be similar, so a point's nearest neighbors are good predictors for it.
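A minimal sketch of k-NN classification in one dimension; the points, labels, and `knn_predict` helper are illustrative assumptions.

```python
from collections import Counter

def knn_predict(points, labels, query, k=3):
    """Predict the majority label among the k nearest points."""
    nearest = sorted(zip(points, labels),
                     key=lambda pl: abs(pl[0] - query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

points = [1.0, 1.2, 3.8, 4.0, 4.2]
labels = ["a", "a", "b", "b", "b"]
print(knn_predict(points, labels, 3.9))  # → b
```

The query 3.9 sits among the "b" points, so its three nearest neighbors all vote "b".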
- Linear regression estimates the linear relationship between a dependent variable and one or more independent variables. The R-squared value indicates how much of the variation in the data the model explains.
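A minimal sketch of simple (one-variable) linear regression using the standard least-squares formulas; the data points and `fit_line` helper are illustrative assumptions.

```python
x = [1, 2, 3, 4, 5]
y = [2.1, 4.0, 6.2, 7.9, 10.1]   # roughly y = 2x

def fit_line(x, y):
    """Least-squares fit: slope = cov(x, y) / var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, my - slope * mx

slope, intercept = fit_line(x, y)

# R-squared: fraction of the variance in y explained by the fitted line
pred = [slope * xi + intercept for xi in x]
ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
ss_tot = sum((yi - sum(y) / len(y)) ** 2 for yi in y)
r_squared = 1 - ss_res / ss_tot
print(round(slope, 2), round(r_squared, 3))  # → 1.99 0.999
```

An R-squared near 1 means the line accounts for almost all of the variation; near 0 means it explains very little.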
- A decision tree is a model that makes predictions by following a workflow of yes/no decisions.
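In practice a tree is learned from data, but a hand-written one shows the idea. The rules, thresholds, and `classify_weather` function below are illustrative assumptions.

```python
def classify_weather(outlook, humidity):
    """Decide whether to play outside via nested yes/no questions."""
    if outlook == "sunny":
        if humidity > 70:
            return "stay in"   # sunny but muggy
        return "play"
    if outlook == "rain":
        return "stay in"
    return "play"              # overcast

print(classify_weather("sunny", 65))  # → play
print(classify_weather("rain", 50))   # → stay in
```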
- An ensemble model combines the predictions of more than one model.
- The k-Means clustering algorithm partitions unlabeled data into k groups of similar items.
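A minimal sketch of k-Means in one dimension, alternating its two steps (assign each value to the nearest center, then move each center to its cluster's mean). The data, k = 2, and the starting centers are illustrative assumptions.

```python
def k_means(values, centers, steps=10):
    """Alternate assignment and update steps for a fixed number of rounds."""
    for _ in range(steps):
        # Assignment step: attach each value to its nearest center
        clusters = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        # Update step: move each center to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

values = [1.0, 1.5, 1.2, 8.0, 8.5, 8.2]
centers, clusters = k_means(values, centers=[0.0, 10.0])
print(centers)  # roughly [1.23, 8.23]
```

The two obvious clumps in the data end up in separate clusters, with each center settling at its clump's mean.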
