Sensor Data: Sensors such as cameras can produce a large amount of data, often streamed around the clock, so storage grows quickly. If this data is permanently archived, it can eventually represent a large recurring cost to maintain.
User Input: It is common for modern user interfaces to record detailed user input and events, such as a user hovering over an element. This data may have little value from the start and quickly loses value with time.
Systems can also generate a copious amount of data, such as diagnostic data, that is quickly worthless. Nevertheless, an organization may lack the capabilities to identify such data as obsolete or to make the decision to delete it.
Redundant Data: It is common for important data such as customer data to be copied to many databases, where it may become out of date. Such data may be difficult to clean because the authoritative source may be unknown. Administrators of various systems may view such redundant data as important and hesitate to delete it, even if it is unused.
Secondary Data: Data that is calculated from other sources such that it is essentially redundant.
Data that is tied to a person. This is an important type of dark data to delete because it represents an information security, reputational and legal risk. People may reasonably expect such information to be deleted in a timely manner. For example, footage from a security camera at a pool will typically be deleted once it is clear that no security incident has occurred.
Legacy Systems: Legacy systems may accumulate large databases that fall into disuse. Such systems may still see some use, so they need to be maintained. In many cases, nobody within an organization understands the data model of a legacy system, which makes it very difficult to retire any data from it.
Expired Knowledge: Data produced by knowledge workers that no longer has any value. For example, versioned meeting minutes from a project that went into production a decade ago, such that there is no realistic chance anyone will find them useful.
Knowledge that still has value may also go unused. An organization may lack the metadata to describe this data or may lack the processes and tools to ensure it gets used. As such, it is common for organizations to reinvent knowledge that they have already documented.
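The security camera example above amounts to a simple retention rule: footage is deleted once it is older than some window and no incident refers to it. The following is only a minimal sketch of that idea. The `Recording` type, the `incident_flagged` field, the `expired_recordings` function and the 30-day window are all hypothetical choices made for illustration, not something the text prescribes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Recording:
    """Hypothetical record describing one stored footage file."""
    name: str
    recorded_at: datetime
    incident_flagged: bool = False  # assumed flag: footage is tied to an incident


def expired_recordings(recordings, now, retention=timedelta(days=30)):
    """Return recordings older than the retention window with no incident flag.

    The 30-day default is an assumption; a real policy would set its own window.
    """
    cutoff = now - retention
    return [
        r for r in recordings
        if r.recorded_at < cutoff and not r.incident_flagged
    ]


# Example data, invented for illustration only.
now = datetime(2024, 6, 1)
footage = [
    Recording("pool-cam-april", datetime(2024, 4, 1)),
    Recording("pool-cam-april-incident", datetime(2024, 4, 2), incident_flagged=True),
    Recording("pool-cam-late-may", datetime(2024, 5, 25)),
]

to_delete = expired_recordings(footage, now)
print([r.name for r in to_delete])
```

In this sketch the function only selects candidates for deletion. Keeping selection separate from the actual delete step leaves room for a review before anything is removed.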
Overview: Dark Data
Data that goes unused.