
10 Examples of Data Cleansing

Data cleansing is the process of detecting and correcting data quality issues. It typically includes both automatic steps such as queries designed to detect broken data and manual steps such as data wrangling. The following are common examples.

Corrupt Data

Data that is corrupted due to data rot is corrected using a historical backup.
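As a rough illustration of this example, the sketch below assumes a checksum was recorded while the data was still known to be good. The recorded checksum, the function names and the sample bytes are all assumptions made for illustration; the example above only states that a historical backup is used.

```python
import hashlib


def sha256(data: bytes) -> str:
    """Return the SHA-256 hex digest of a block of bytes."""
    return hashlib.sha256(data).hexdigest()


def restore_if_corrupt(current: bytes, backup: bytes, expected_digest: str) -> bytes:
    """Return the current data if it is intact, otherwise the verified backup."""
    if sha256(current) == expected_digest:
        return current  # nothing to fix
    if sha256(backup) != expected_digest:
        # The backup is also damaged, so this needs manual attention.
        raise ValueError("backup does not match the recorded checksum either")
    return backup


# Hypothetical data: one byte of the live copy has rotted.
original = b"order,total\n1,25.00\n"
recorded_digest = sha256(original)
rotted = b"order,total\n1,2\x00.00\n"

restored = restore_if_corrupt(rotted, original, recorded_digest)
```

Verifying the backup before using it avoids replacing one corrupted copy with another.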

Inconsistent Data

A report is constructed to identify differences in customer data between several systems. The report identifies close to 500 significant inconsistencies that are manually corrected before migrating the data into a master data management tool.
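A report of this kind could be sketched as a field-by-field comparison of records that share a customer ID. The system names, fields and sample records below are hypothetical, and real systems would usually need matching rules that are more forgiving than exact equality.

```python
def find_inconsistencies(system_a, system_b, fields):
    """Return (customer_id, field, value_a, value_b) for every mismatch.

    Only customers present in both systems are compared.
    """
    issues = []
    for customer_id in sorted(system_a.keys() & system_b.keys()):
        for field in fields:
            value_a = system_a[customer_id].get(field)
            value_b = system_b[customer_id].get(field)
            if value_a != value_b:
                issues.append((customer_id, field, value_a, value_b))
    return issues


# Hypothetical extracts from two systems, keyed by customer ID.
crm = {
    1: {"email": "ann@example.com", "city": "Leeds"},
    2: {"email": "bob@example.com", "city": "York"},
}
billing = {
    1: {"email": "ann@example.com", "city": "London"},
    2: {"email": "bob@example.com", "city": "York"},
    3: {"email": "cy@example.com", "city": "Bath"},
}

report = find_inconsistencies(crm, billing, ["email", "city"])
```

Each row of the report gives a reviewer both values side by side, which suits the manual correction step described above.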

Inaccurate Data

A product catalog database was launched with accurate data but wasn't maintained over time. A team updates the information using documents such as product specifications before sharing the data with an ecommerce partner.

Irrelevant Data

A data quality initiative merges four sources of customer information into a common customer data record. A common data model is designed and irrelevant data that was captured by one of the sources is dropped.

Dirty Data

An ecommerce site imports product data from hundreds of partners on a regular basis. Because the data arrives in a variety of formats, it has often been imported incorrectly over the years, resulting in dirty data that is costly to fix. The site decides to launch a new extranet for partners to import and manage product data. It requires all partners to review their data and make fixes or risk being delisted from the site.

Typographical Errors

A hotel booking site scans for reviews that have a large number of typographical errors and drops them from its database.
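One simple way to approximate such a scan is to measure the share of words in a review that are missing from a known vocabulary. The vocabulary, the 30% threshold and the function names below are assumptions for illustration; a production system would likely use a full dictionary or a spell-checking library.

```python
import re


def typo_ratio(text, vocabulary):
    """Return the fraction of words in text that are not in vocabulary."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    unknown = sum(1 for word in words if word not in vocabulary)
    return unknown / len(words)


def drop_error_heavy(reviews, vocabulary, max_ratio=0.3):
    """Keep only reviews whose share of unknown words is at most max_ratio."""
    return [r for r in reviews if typo_ratio(r, vocabulary) <= max_ratio]


# Hypothetical vocabulary and reviews.
vocabulary = {"the", "room", "was", "clean", "and", "quiet"}
reviews = [
    "The room was clean and quiet",
    "Teh rom wsa claen adn quite",
]

kept = drop_error_heavy(reviews, vocabulary)
```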

Standardization

A telecom billing database has over 1,200 services that get listed on customer bills, although the company currently provides only 92 unique services. A quality assurance team gets complaints from customers who don't understand their bills. The team develops a mapping from the 1,200 service descriptions to the 92 current services and corrects the data.
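A mapping like this can be sketched as a lookup table from normalized legacy descriptions to current service names. The descriptions and service names below are invented for illustration; anything without a mapping is returned as None so it can be reviewed by hand.

```python
# Hypothetical mapping from legacy descriptions to current services.
SERVICE_MAP = {
    "intl calling plan v2": "International Calling",
    "int'l call pkg": "International Calling",
    "voicemail plus": "Voicemail",
}


def standardize(description):
    """Map a legacy service description to a current service name.

    Returns None when no mapping exists, flagging the row for manual review.
    """
    key = " ".join(description.lower().split())  # lowercase, collapse whitespace
    return SERVICE_MAP.get(key)


bill_lines = ["  Int'l  Call PKG ", "Voicemail Plus", "Legacy Pager Rental"]
standardized = [standardize(line) for line in bill_lines]
```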

Referential Integrity

A data migration project aims to migrate historical sales orders to a new sales analytics platform. The project finds that database constraints such as foreign key constraints weren't properly implemented. As a result, the structure of the data is broken. The migration project runs multiple scripts to identify broken references and fix them.
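A script to identify broken references often comes down to an anti-join: find child rows whose parent row does not exist. The sketch below uses an in-memory SQLite database, and the table and column names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    -- No foreign key constraint was declared, so bad references slipped in.
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 40.0), (12, 7, 15.0);
    """
)

# Orders whose customer_id has no matching row in customers.
orphans = conn.execute(
    """
    SELECT o.id, o.customer_id
    FROM orders AS o
    LEFT JOIN customers AS c ON c.id = o.customer_id
    WHERE c.id IS NULL
    ORDER BY o.id
    """
).fetchall()
```

How each orphan is fixed (re-linked, archived or deleted) is a business decision that the query itself cannot make.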

Completeness

A new feature that requires a customer postal code is launched on a customer portal. The project finds that postal codes are missing for 800 customers. They create a script to look up the postal codes by telephone number using a third-party data provider.
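The fill-in script might look roughly like the sketch below. The lookup function stands in for the third-party data provider and is purely hypothetical, as are the field names and sample records; a real provider would have its own API and error handling.

```python
def fill_missing_postal_codes(customers, lookup):
    """Fill blank postal codes in place using lookup(phone).

    Returns the IDs of customers that could not be resolved.
    """
    unresolved = []
    for customer in customers:
        if (customer.get("postal_code") or "").strip():
            continue  # already complete
        found = lookup(customer.get("phone"))
        if found:
            customer["postal_code"] = found
        else:
            unresolved.append(customer["id"])
    return unresolved


# Hypothetical stand-in for the third-party provider.
PROVIDER_DATA = {"555-0101": "90210"}


def fake_lookup(phone):
    return PROVIDER_DATA.get(phone)


customers = [
    {"id": 1, "phone": "555-0101", "postal_code": ""},
    {"id": 2, "phone": "555-0102", "postal_code": None},
    {"id": 3, "phone": "555-0103", "postal_code": "10001"},
]

unresolved = fill_missing_postal_codes(customers, fake_lookup)
```

Returning the unresolved IDs keeps the remaining gaps visible so they can be followed up manually.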

Characters

An integration project finds that scripts are failing due to special characters that a sensitive parser can't handle. A script is run to replace the special characters in the database with standard equivalents.
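One possible version of such a conversion script is sketched below, assuming the parser accepts only ASCII. Accented letters are reduced to their base letter with Unicode NFKD normalization, and anything else outside ASCII is replaced with a placeholder. The function name and the choice of "?" as the placeholder are assumptions.

```python
import unicodedata


def to_ascii(text, placeholder="?"):
    """Convert text to ASCII, keeping base letters where possible."""
    decomposed = unicodedata.normalize("NFKD", text)
    result = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop accents split off by normalization
        result.append(ch if ord(ch) < 128 else placeholder)
    return "".join(result)


cleaned = to_ascii("Café Zürich")
quoted = to_ascii("\u201cok\u201d")  # curly quotes have no ASCII decomposition
```

This conversion loses information, so it is usually safer to write the result to a new column or table than to overwrite the source values.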

Overview: Data Cleansing
Definition: The process of detecting and correcting data quality issues.

 
