
Sensor Data
Sensors such as cameras can produce large amounts of data, often streamed 24/7, so that volumes grow quickly. If this data is permanently archived, it can eventually represent a large recurring maintenance cost.
User Input
It is common for modern user interfaces to record detailed user input and events, such as a user hovering over an element. Such data may have little value from the start and quickly loses what value it has over time.
Machine Data
Machines can produce copious amounts of data, such as diagnostic data, that quickly become worthless. Nevertheless, an organization may lack the capability to identify such data as obsolete or to make the decision to delete it.
Redundant Data
It is common for important data such as customer data to be copied to many databases, where it may become out of date. Such data may be difficult to clean because the authoritative source may be unknown. Administrators of various systems may view redundant data as important and hesitate to delete it, even if it is unused.
Secondary Data
Data that is calculated from other sources and is therefore essentially redundant.
Personally Identifiable Data
Data that can be tied to a person. This is an important type of dark data to delete because it represents an information security, reputational and legal risk. People may have a reasonable expectation that such information be deleted in a timely manner. For example, footage from a pool's camera that is recorded for security purposes will typically be deleted once it is clear that no security incident has occurred.
Legacy Systems
Legacy systems may accumulate large databases that fall into disuse. Such systems may not be completely unused, so they still need to be maintained. In many cases, nobody within the organization understands the data model of a legacy system, which makes it very difficult to retire any data from it.
Expired Knowledge
Data produced by knowledge workers that no longer has any value. For example, versioned meeting minutes from a project that went into production a decade ago, which nobody is realistically likely to find useful.
Knowledge Waste
Knowledge workers commonly produce hundreds of documents a year. In many cases, the vast majority of this knowledge is never used but is maintained in case it is required. Organizations may have little metadata to describe this data or may lack the processes and tools to ensure it gets used. As such, it is common for organizations to reinvent knowledge that they have already documented.
Overview: Dark Data
Definition: Data that goes unused.