It’s good to have an understanding of the jargon of data.
First of all, a bit (short for “binary digit”) is the smallest unit of data in a computer. Think of it as the atom. A bit can be either 0 or 1, the two values of the binary number system. Bits are also the usual unit for measuring data in transit, such as transfer speeds within a network or across the Internet.
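As a minimal sketch of these two ideas, the Python snippet below shows the bit-level (binary) representation of a number and a rough bits-per-second calculation; the 100-megabit figure is just an illustrative assumption:

```python
# A bit is a single binary digit: 0 or 1.
# Python's bin() shows the binary representation of an integer.
value = 42
print(bin(value))  # '0b101010' -- the number 42 expressed in six bits

# Transfer rates are usually quoted in bits per second (bps).
# A hypothetical 100-megabit connection moves at most:
megabits_per_second = 100
bits_per_second = megabits_per_second * 1_000_000
print(bits_per_second)  # 100000000 bits every second
```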
A byte, on the other hand, is a group of eight bits and is generally used to measure storage. Of course, the number of bytes can get large very fast. Let’s see how in Table 2-1.
Table 2-1.
Types of data levels
| Unit | Value | Use Case |
|---|---|---|
| Megabyte | 1,000 kilobytes | A small book |
| Gigabyte | 1,000 megabytes | About 230 songs |
| Terabyte | 1,000 gigabytes | 500 hours of movies |
| Petabyte | 1,000 terabytes | Five years of the Earth Observing System (EOS) |
| Exabyte | 1,000 petabytes | The entire Library of Congress 3,000 times over |
| Zettabyte | 1,000 exabytes | 36,000 years of HD-TV video |
| Yottabyte | 1,000 zettabytes | Would require a data center the size of Delaware and Rhode Island combined |
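Since each row in Table 2-1 is 1,000 times the one before it, the conversions can be sketched with a simple lookup; the `to_bytes` helper below is a hypothetical illustration using the decimal (SI) factors from the table:

```python
# Decimal (SI) storage units, as in Table 2-1: each step is 1,000x the last.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte", "yottabyte"]

def to_bytes(amount, unit):
    """Convert an amount in the given unit to a count of bytes."""
    return amount * 1000 ** UNITS.index(unit)

print(to_bytes(1, "gigabyte"))  # 1000000000 -- i.e., 1,000 megabytes
print(to_bytes(1, "petabyte"))  # 1000000000000000 -- i.e., 1,000 terabytes
```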
Data can also come from many different sources. Here is just a sampling:
- Web/social (Facebook, Twitter, Instagram, YouTube)
- Biometric data (fitness trackers, genetics tests)
- Point of sale systems (from brick-and-mortar stores and e-commerce sites)
- Internet of Things or IoT (ID tags and smart devices)
- Cloud systems (business applications like Salesforce.com)
- Corporate databases and spreadsheets