Data profiling is the process of analyzing a dataset. It is typically done to support data governance and data management, or to decide whether strategies and projects that depend on the data are viable. The following are common types of data profiling.
Data Quality
Gathering statistics about data quality. For example, a telecom company might determine the correctness of customer data by comparing two sources or by validating the data against a set of business rules.

Data Credibility
Analysis of the credibility of data. For example, an investor might evaluate a set of historical social media data to see if there is any useful correlation between social media chatter and stock prices.

Data Lineage
Tracing data to its sources and calculation methods.

Compliance & Risks
Analysis of data for compliance and risk purposes. For example, verifying that a dataset doesn't contain personally identifiable data.

Information Security
Analysis of data for purposes of information security, such as verifying that fields are properly encrypted.

Capacity Management
Looking at how data is growing in order to plan capacity and budget.

Retention
Evaluating data in order to determine a retention schedule. For example, a team may have mysterious pools of dark data that it would like to purge, but first needs statistics to confirm that the data isn't used.
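As a rough illustration of two of the types above, the following Python sketch profiles a handful of records for data quality (counting business rule violations) and for compliance (counting values that look like personally identifiable data). Everything in it is an assumption made for the example: the sample records, the field names, the two rules and the simple email pattern are hypothetical, and a real profile would use rules and detectors suited to the dataset.

```python
import re

# Hypothetical sample records; field names are illustrative only.
records = [
    {"customer_id": "C001", "email": "ann@example.com", "monthly_fee": 40.0},
    {"customer_id": "C002", "email": "", "monthly_fee": -5.0},
    {"customer_id": "", "email": "bob@example.com", "monthly_fee": 25.0},
]

# Assumed business rules: each maps a field to a predicate it must satisfy.
RULES = {
    "customer_id": lambda v: bool(v),                    # must not be empty
    "monthly_fee": lambda v: v is not None and v >= 0,   # must not be negative
}

# Deliberately simple pattern for values that look like email addresses,
# standing in for one kind of personally identifiable data.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def profile(rows):
    """Count rule violations and email-like values per field."""
    violations = {field: 0 for field in RULES}
    pii_hits = {}
    for row in rows:
        # Data quality: check each assumed business rule.
        for field, rule in RULES.items():
            if not rule(row.get(field)):
                violations[field] += 1
        # Compliance: flag string values that match the email pattern.
        for field, value in row.items():
            if isinstance(value, str) and EMAIL_RE.fullmatch(value):
                pii_hits[field] = pii_hits.get(field, 0) + 1
    return {"rows": len(rows), "violations": violations, "pii_hits": pii_hits}


report = profile(records)
print(report)
```

The report is a set of counts rather than a pass or fail verdict, since profiling is about gathering statistics that people then use to make decisions about the data.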
© 2010-2023 Simplicable. All Rights Reserved. Reproduction of materials found on this site, in any form, without explicit permission is prohibited.