{"id":3422,"date":"2024-09-01T13:53:39","date_gmt":"2024-09-01T13:53:39","guid":{"rendered":"https:\/\/workhouse.sweetdishy.com\/?p=3422"},"modified":"2024-09-01T13:53:39","modified_gmt":"2024-09-01T13:53:39","slug":"naive-bayes-classifier-supervised-learning-classification","status":"publish","type":"post","link":"https:\/\/workhouse.sweetdishy.com\/index.php\/2024\/09\/01\/naive-bayes-classifier-supervised-learning-classification\/","title":{"rendered":"Na\u00efve Bayes Classifier\u00a0(Supervised Learning\/Classification)"},"content":{"rendered":"\n<p id=\"Par147\">Earlier in this chapter, we looked at Bayes\u2019 theorem. In machine learning, it has been adapted into the Na\u00efve Bayes Classifier. It is \u201cna\u00efve\u201d because it assumes that the variables are independent of each other; that is, the occurrence of one variable has nothing to do with the others. True, this may seem like a drawback. But the fact is that the Na\u00efve Bayes Classifier has proven to be quite effective and fast to develop.<\/p>\n\n\n\n<p id=\"Par148\">There is another assumption to note as well: the a priori assumption. The model relies on probabilities estimated from historical data, so its predictions may be wrong if the data has changed.<\/p>\n\n\n\n<p>There are three variations on the Na\u00efve Bayes Classifier:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><em>Bernoulli<\/em>: This is if you have binary data (true\/false, yes\/no).<\/li>\n\n\n\n<li><em>Multinomial<\/em>: This is if the data consists of discrete counts, such as the number of pages of a book.<\/li>\n\n\n\n<li><em>Gaussian<\/em>: This is if you are working with continuous data that conforms to a normal distribution.<\/li>\n<\/ul>\n\n\n\n<p id=\"Par153\">A common use case for Na\u00efve Bayes Classifiers is text analysis, such as email spam detection and sentiment analysis. Other applications include customer segmentation, medical diagnosis, and weather prediction. 
The reason is that this approach is useful in classifying data based on key features and patterns.<\/p>\n\n\n\n<p id=\"Par154\">To see how this is done, let\u2019s take an example: Suppose you run an e-commerce site and have a large database of customer transactions. You want to see how variables like product review ratings, discounts, and time of year impact sales.<\/p>\n\n\n\n<p>Table\u00a03-2 shows a sample of the dataset.<\/p>\n\n\n\n<p><strong><em>Table 3-2.<\/em><\/strong><\/p>\n\n\n\n<p>Customer transactions dataset<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><thead><tr><th>Discount<\/th><th>Product Review<\/th><th>Purchase<\/th><\/tr><\/thead><tbody><tr><td>Yes<\/td><td>High<\/td><td>Yes<\/td><\/tr><tr><td>Yes<\/td><td>Low<\/td><td>Yes<\/td><\/tr><tr><td>No<\/td><td>Low<\/td><td>No<\/td><\/tr><tr><td>No<\/td><td>Low<\/td><td>No<\/td><\/tr><tr><td>No<\/td><td>Low<\/td><td>No<\/td><\/tr><tr><td>No<\/td><td>High<\/td><td>Yes<\/td><\/tr><tr><td>Yes<\/td><td>High<\/td><td>No<\/td><\/tr><tr><td>Yes<\/td><td>Low<\/td><td>Yes<\/td><\/tr><tr><td>No<\/td><td>High<\/td><td>Yes<\/td><\/tr><tr><td>Yes<\/td><td>High<\/td><td>Yes<\/td><\/tr><tr><td>No<\/td><td>High<\/td><td>No<\/td><\/tr><tr><td>No<\/td><td>Low<\/td><td>Yes<\/td><\/tr><tr><td>Yes<\/td><td>High<\/td><td>Yes<\/td><\/tr><tr><td>Yes<\/td><td>Low<\/td><td>No<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>You will then organize this data into frequency tables, as shown in Tables\u00a03-3 and\u00a03-4.<\/p>\n\n\n\n<p><strong><em>Table 3-3.<\/em><\/strong><\/p>\n\n\n\n<p>Discount frequency table<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><thead><tr><th colspan=\"2\" rowspan=\"2\">&nbsp;<\/th><th colspan=\"2\">Purchase<\/th><\/tr><tr><th>Yes<\/th><th>No<\/th><\/tr><\/thead><tbody><tr><td 
rowspan=\"2\"><strong>Discount<\/strong><\/td><td>Yes<\/td><td>19<\/td><td>1<\/td><\/tr><tr><td>No<\/td><td>5<\/td><td>5<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><strong><em>Table 3-4.<\/em><\/strong><\/p>\n\n\n\n<p>Product review frequency table<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><thead><tr><th colspan=\"2\" rowspan=\"2\">&nbsp;<\/th><th colspan=\"2\">Purchase<\/th><th>&nbsp;<\/th><\/tr><tr><th>Yes<\/th><th>No<\/th><th>Total<\/th><\/tr><\/thead><tbody><tr><td rowspan=\"2\"><strong>Product Review<\/strong><\/td><td><strong>High<\/strong><\/td><td>21<\/td><td>2<\/td><td>11<\/td><\/tr><tr><td><strong>Low<\/strong><\/td><td>3<\/td><td>4<\/td><td>8<\/td><\/tr><tr><td>&nbsp;<\/td><td><strong>Total<\/strong><\/td><td>24<\/td><td>6<\/td><td>19<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>When looking at this, we call the purchase an event and the discount and product review independent variables. Then we can make a probability table for one of the independent variables, say the product reviews. See Table\u00a03-5.<\/p>\n\n\n\n<p><strong><em>Table 3-5.<\/em><\/strong><\/p>\n\n\n\n<p>Product review probability table<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><thead><tr><th colspan=\"2\" rowspan=\"2\">&nbsp;<\/th><th colspan=\"2\">Purchase<\/th><th>&nbsp;<\/th><\/tr><tr><th>Yes<\/th><th>No<\/th><th>&nbsp;<\/th><\/tr><\/thead><tbody><tr><td rowspan=\"3\"><strong>Product Reviews<\/strong><\/td><td><strong>High<\/strong><\/td><td>9\/24<\/td><td>2\/6<\/td><td>11\/30<\/td><\/tr><tr><td><strong>Low<\/strong><\/td><td>7\/24<\/td><td>1\/6<\/td><td>8\/30<\/td><\/tr><tr><td>&nbsp;<\/td><td>24\/30<\/td><td>6\/30<\/td><td>&nbsp;<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p id=\"Par158\">Using this table, we can see that the probability of a low product review, given that a purchase was made, is 7\/24 or 29%. In this way, the Na\u00efve Bayes Classifier allows more granular predictions within a dataset. 
It is also relatively easy to train and can work well with small datasets.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Earlier in this chapter, we looked at Bayes\u2019 theorem. As for machine learning, this has been modified into something called the Na\u00efve Bayes Classifier. It is \u201cna\u00efve\u201d because the assumption is that the variables are independent from each other\u2014that is, the occurrence of one variable has nothing to do with the others. True, this may [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":3326,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[441],"tags":[],"class_list":["post-3422","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-3-machine-learning"],"jetpack_featured_media_url":"https:\/\/workhouse.sweetdishy.com\/wp-content\/uploads\/2024\/08\/images-41-1.jpeg","_links":{"self":[{"href":"https:\/\/workhouse.sweetdishy.com\/index.php\/wp-json\/wp\/v2\/posts\/3422","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/workhouse.sweetdishy.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/workhouse.sweetdishy.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/workhouse.sweetdishy.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/workhouse.sweetdishy.com\/index.php\/wp-json\/wp\/v2\/comments?post=3422"}],"version-history":[{"count":1,"href":"https:\/\/workhouse.sweetdishy.com\/index.php\/wp-json\/wp\/v2\/posts\/3422\/revisions"}],"predecessor-version":[{"id":3423,"href":"https:\/\/workhouse.sweetdishy.com\/index.php\/wp-json\/wp\/v2\/posts\/3422\/revisions\/3423"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/workhouse.sweetdishy.com\/index.php\/wp-json\/wp\/v2\/media\/3326"}],"wp:attachment":[{"href":"https:\/\/workhouse.sweetdishy.com\/index.php\/wp
-json\/wp\/v2\/media?parent=3422"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/workhouse.sweetdishy.com\/index.php\/wp-json\/wp\/v2\/categories?post=3422"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/workhouse.sweetdishy.com\/index.php\/wp-json\/wp\/v2\/tags?post=3422"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}