- Machine learning, whose roots go back to the 1950s, is a field in which a computer learns without being explicitly programmed. Instead, it ingests data and finds patterns in it using statistical techniques.
- An outlier is a data point that lies far outside the rest of the values in the dataset.
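One common way to flag such points is the 1.5 × IQR rule. Below is a minimal sketch using rough quartile positions; the data, the threshold, and the `find_outliers` helper are illustrative assumptions, not a reference implementation.

```python
# Illustrative sketch: flag values outside 1.5 * IQR of the middle 50%.
def find_outliers(values):
    """Return values lying far outside the interquartile range."""
    data = sorted(values)
    n = len(data)
    q1 = data[n // 4]          # rough lower quartile
    q3 = data[(3 * n) // 4]    # rough upper quartile
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lo or v > hi]

print(find_outliers([10, 12, 11, 13, 12, 95]))  # → [95]
```

Here 95 sits far outside the tight cluster of values around 10–13, so it is the only value flagged.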
- The standard deviation measures how spread out values are around the mean (formally, the square root of the average squared deviation from the mean).
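Python's standard library computes this directly; the small dataset below is an illustrative assumption.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]     # illustrative data
mean = statistics.mean(values)        # 5.0
sd = statistics.pstdev(values)        # population standard deviation
print(mean, sd)  # → 5.0 2.0
```

The squared deviations from the mean average out to 4, so the standard deviation is 2.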
- The normal distribution, which has a bell-shaped curve, describes how the values of many variables are spread symmetrically around their mean.
- Bayes’ theorem describes how to update the probability of a hypothesis when new evidence arrives, using conditional probabilities.
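A classic illustration is a medical test: even an accurate test for a rare condition yields mostly false positives. The numbers below (prevalence, sensitivity, false-positive rate) are assumed for illustration.

```python
# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B).
p_disease = 0.01            # prior: 1% of people have the condition
p_pos_given_disease = 0.95  # test sensitivity
p_pos_given_healthy = 0.05  # false-positive rate

# Total probability of a positive test result
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Posterior: probability of the condition given a positive test
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # → 0.161
```

Despite the test being 95% sensitive, a positive result only implies about a 16% chance of actually having the condition, because the prior is so low.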
- A true positive is when a model correctly predicts the positive class. A false positive, on the other hand, is when the model predicts the positive class even though the actual value is negative.
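These counts are easy to tally from predictions. A minimal sketch, where the labels are illustrative (1 = positive, 0 = negative):

```python
actual    = [1, 0, 1, 1, 0, 0]
predicted = [1, 1, 1, 0, 0, 0]

# True positive: actual 1 predicted 1; false positive: actual 0 predicted 1
tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
print(tp, fp)  # → 2 1
```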
- The Pearson correlation measures the strength and direction of the linear relationship between two variables; it ranges from -1 to 1.
- Feature extraction or feature engineering describes the process of selecting and transforming the variables used by a model. This is critical, since even one poorly chosen variable can have a major impact on the results.
- Training data is used to fit the relationships in an algorithm. The test data, held out from training, is used to evaluate the model on examples it has not seen.
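A minimal sketch of such a split; the 80/20 ratio is a common convention, assumed here for illustration.

```python
import random

data = list(range(100))       # illustrative dataset
random.seed(0)                # reproducible shuffle
random.shuffle(data)

split = int(len(data) * 0.8)  # 80% train / 20% test
train, test = data[:split], data[split:]
print(len(train), len(test))  # → 80 20
```

Shuffling before splitting matters: if the data is ordered (say, by date or by class), a naive slice would give the model a biased view.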
- Supervised learning uses labeled data to create a model, whereas unsupervised learning works with unlabeled data. There is also semi-supervised learning, which uses a mix of both approaches.
- Reinforcement learning is a way to train a model by rewarding actions that lead to good outcomes and penalizing those that do not.
- The k-Nearest Neighbors (k-NN) algorithm is based on the notion that data points close to one another tend to be similar, so a point's nearest neighbors are good predictors for it.
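A minimal sketch of k-NN classification in one dimension; the points, labels, and `knn_predict` helper are illustrative assumptions.

```python
from collections import Counter

def knn_predict(points, labels, query, k=3):
    """Predict the majority label among the k nearest points."""
    nearest = sorted(zip(points, labels),
                     key=lambda pl: abs(pl[0] - query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

points = [1.0, 1.2, 3.8, 4.0, 4.2]
labels = ["a", "a", "b", "b", "b"]
print(knn_predict(points, labels, 3.9))  # → b
```

The query 3.9 sits among the "b" points, so its three nearest neighbors all vote "b".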
- Linear regression estimates the linear relationship between a dependent variable and one or more independent variables. The R-squared value indicates how much of the variation in the data the model explains.
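A minimal sketch of simple (one-variable) linear regression using the standard least-squares formulas; the data points and `fit_line` helper are illustrative assumptions.

```python
x = [1, 2, 3, 4, 5]
y = [2.1, 4.0, 6.2, 7.9, 10.1]   # roughly y = 2x

def fit_line(x, y):
    """Least-squares fit: slope = cov(x, y) / var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, my - slope * mx

slope, intercept = fit_line(x, y)

# R-squared: fraction of the variance in y explained by the fitted line
pred = [slope * xi + intercept for xi in x]
ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
ss_tot = sum((yi - sum(y) / len(y)) ** 2 for yi in y)
r_squared = 1 - ss_res / ss_tot
print(round(slope, 2), round(r_squared, 3))  # → 1.99 0.999
```

An R-squared near 1 means the line accounts for almost all of the variation; near 0 means it explains very little.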
- A decision tree is a model that makes predictions by following a workflow of yes/no decisions.
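In practice a tree is learned from data, but a hand-written one shows the idea. The rules, thresholds, and `classify_weather` function below are illustrative assumptions.

```python
def classify_weather(outlook, humidity):
    """Decide whether to play outside via nested yes/no questions."""
    if outlook == "sunny":
        if humidity > 70:
            return "stay in"   # sunny but muggy
        return "play"
    if outlook == "rain":
        return "stay in"
    return "play"              # overcast

print(classify_weather("sunny", 65))  # → play
print(classify_weather("rain", 50))   # → stay in
```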
- An ensemble model combines the predictions of more than one model.
- The k-Means clustering algorithm partitions unlabeled data into k groups of similar items.
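A minimal sketch of k-Means in one dimension, alternating its two steps (assign each value to the nearest center, then move each center to its cluster's mean). The data, k = 2, and the starting centers are illustrative assumptions.

```python
def k_means(values, centers, steps=10):
    """Alternate assignment and update steps for a fixed number of rounds."""
    for _ in range(steps):
        # Assignment step: attach each value to its nearest center
        clusters = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        # Update step: move each center to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

values = [1.0, 1.5, 1.2, 8.0, 8.5, 8.2]
centers, clusters = k_means(values, centers=[0.0, 10.0])
print(centers)  # roughly [1.23, 8.23]
```

The two obvious clumps in the data end up in separate clusters, with each center settling at its clump's mean.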
