Category: 2. Data

  • Mining Insights from Data

    Mining Insights from Data

    "A breakthrough in machine learning would be worth ten Microsofts." (Bill Gates [1])

    Although Katrina Lake liked to shop online, she knew the experience could be much better. The main problem: it was tough to find fashions that were personalized. This was the inspiration for Stitch Fix, which Katrina launched in her Cambridge apartment while attending Harvard Business School…

  • Key Takeaways
  • More Data Terms and Concepts

    More Data Terms and Concepts

    When engaging in data analysis, you should know the basic terms. Here are some that you'll often hear: Categorical Data: This is data that does not have a numerical meaning. Rather, it has a textual meaning, like a description of a group (such as race or gender). You can, however, assign numbers to each of the elements.…
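
As a minimal sketch of assigning numbers to categorical elements, the snippet below maps each distinct label to an integer code. The function name `encode_categories` and the sample labels are hypothetical, chosen only for illustration; they do not come from the text.

```python
def encode_categories(values):
    """Map each distinct category to an integer code, in order of first appearance."""
    codes = {}      # category label -> integer code
    encoded = []    # the input sequence rewritten as codes
    for value in values:
        if value not in codes:
            codes[value] = len(codes)  # next unused integer
        encoded.append(codes[value])
    return encoded, codes


# Hypothetical sample data: the labels are illustrative only.
groups = ["red", "blue", "red", "green", "blue"]
encoded, codes = encode_categories(groups)
print(encoded)
print(codes)
```

The codes are only labels: "green" receiving 2 does not make it larger than "red" at 0, so averaging or ordering the codes has no meaning.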

  • How Much Data Do You Need for AI?

    How Much Data Do You Need for AI?

    The more data, the better, right? This is usually the case. Consider the Hughes phenomenon, which posits that as you add features to a model, performance generally increases, but only up to a point. Quantity is not the be-all and end-all: for a fixed amount of training data, there may come a point where adding more features starts to degrade performance. Keep in mind that you may…

  • Ethics and Governance

    Ethics and Governance

    You need to be mindful of any restrictions on the data. Might the vendor prohibit you from using the information for certain purposes? Could your company be on the hook if something goes wrong? To deal with these issues, it is advisable to bring in the legal department. For the most part, data must…

  • Data Process

    Data Process

    The amount of money shelled out on data is enormous. According to IDC, spending on Big Data and analytics solutions is forecast to go from $166 billion in 2018 to $260 billion by 2022 [11]. This represents an 11.9% compound annual growth rate. The biggest spenders include banks, discrete manufacturers, process manufacturers, professional service firms, and…
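
The 11.9% figure can be checked from the two IDC numbers with the standard compound annual growth rate formula, (end / start)^(1 / years) - 1. The sketch below assumes four compounding years (2018 to 2022); the function name `cagr` is illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    return (end_value / start_value) ** (1 / years) - 1


# IDC figures from the text, in billions of dollars.
rate = cagr(166, 260, 2022 - 2018)
print(f"{rate:.1%}")  # 11.9%
```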

  • Databases and Other Tools

    Databases and Other Tools

    There are myriad tools that help with data. At the core of these is the database. As should be no surprise, this critical technology has evolved over the decades. But even older technologies like relational databases are still very much in use today. When it comes to mission-critical data, companies are…

  • Velocity

    Velocity

    This is the speed at which data is being created. As seen earlier in this chapter, services like YouTube and Snapchat have extreme levels of velocity (this is often referred to as a "firehose" of data). This requires heavy investments in next-generation technologies and data centers. The data is also often processed in memory, not with disk-based…

  • Variety

    Variety

    This describes the diversity of the data, say a combination of structured, semi-structured, and unstructured data (explained above). It also covers the different sources and uses of the data. No doubt, the high growth in unstructured data has been a key driver of the variety of Big Data. Managing this can quickly become a major challenge. Yet machine learning…

  • Volume

    Volume

    This is the scale of the data, which is often unstructured. There is no hard-and-fast rule on a threshold, but it is usually in the tens of terabytes. Volume is often a major challenge when it comes to Big Data. But cloud computing and next-generation databases have been a big help, both in capacity and in lower costs.