HR Data Quality · Skills demonstration

Cleaning an HR dataset, rule by rule

I generate a synthetic employee dataset with realistic data-quality defects, profile it in SQL, then clean it with a documented, logged pipeline. This page summarizes the impact of that cleaning and lets you compare individual records before and after.

All data shown is synthetic, generated for demonstration. It is not real employee data and is not affiliated with any employer.
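
To make the "profile it in SQL" step concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, columns, sample rows, and the three checks (missing email, duplicate IDs, non-ISO dates) are assumptions invented for illustration; they are not the project's actual schema or queries.

```python
import sqlite3

# Hypothetical synthetic rows: (employee_id, email, hire_date).
# Invented for illustration only; not the project's real schema.
ROWS = [
    (1, "ana@example.com", "2021-03-01"),
    (2, None, "2020-07-15"),               # missing email
    (2, "ben@example.com", "2020-07-15"),  # duplicate employee_id
    (3, "cy@example.com", "15/07/2019"),   # non-ISO date format
]


def profile(rows):
    """Load rows into an in-memory table and count a few defect types in SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees (employee_id INTEGER, email TEXT, hire_date TEXT)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
    query = """
        SELECT
            COUNT(*) AS total_rows,
            SUM(email IS NULL) AS missing_email,
            COUNT(*) - COUNT(DISTINCT employee_id) AS duplicate_ids,
            SUM(hire_date NOT GLOB
                '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]') AS non_iso_dates
        FROM employees
    """
    cur = conn.execute(query)
    names = [col[0] for col in cur.description]
    result = dict(zip(names, cur.fetchone()))
    conn.close()
    return result


if __name__ == "__main__":
    print(profile(ROWS))
```

Each defect type is counted in a single aggregate query, so the profile is one row that can be compared before and after cleaning.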

Cleaning impact

Headline numbers from the most recent pipeline run.

Record-level before / after

Example records chosen to show different defects. Changed fields are highlighted. Toggle to switch every card between its raw and cleaned state.


Rules applied

Every rule is named, logged, and reports how many rows it affected.

Rule | Name | Decision | Rows affected
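
One way a pipeline can produce a table like this is sketched below. It is an assumption-laden illustration, not the project's code: the `Rule` class, the two example rules, the rule IDs, and the sample records are all hypothetical. Each rule carries a name and a decision, is applied to every record, logs its result, and reports how many rows it changed.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("hr_cleaning")

Record = Dict[str, Optional[str]]


@dataclass
class Rule:
    """Hypothetical rule: an ID, a readable name, a decision, and a transform."""
    rule_id: str
    name: str
    decision: str
    apply: Callable[[Record], Record]


def trim_name(rec: Record) -> Record:
    name = rec.get("name")
    return {**rec, "name": name.strip() if name is not None else None}


def lowercase_email(rec: Record) -> Record:
    email = rec.get("email")
    return {**rec, "email": email.lower() if email is not None else None}


# Illustrative rules only; the real pipeline's rules are not shown on this page.
RULES = [
    Rule("R1", "Trim whitespace in name", "fix", trim_name),
    Rule("R2", "Lowercase email", "fix", lowercase_email),
]

# Invented sample records with deliberate defects.
SAMPLE = [
    {"name": " Ana ", "email": "ANA@example.com"},
    {"name": "Ben", "email": "ben@example.com"},
    {"name": "Cy ", "email": None},
]


def run_pipeline(
    records: List[Record], rules: List[Rule]
) -> Tuple[List[Record], List[dict]]:
    """Apply rules in order; log and report rows changed by each rule."""
    report = []
    for rule in rules:
        cleaned = [rule.apply(rec) for rec in records]
        affected = sum(1 for old, new in zip(records, cleaned) if old != new)
        log.info("%s %s: %d rows affected", rule.rule_id, rule.name, affected)
        report.append({
            "rule": rule.rule_id,
            "name": rule.name,
            "decision": rule.decision,
            "rows_affected": affected,
        })
        records = cleaned
    return records, report


if __name__ == "__main__":
    cleaned_records, rule_report = run_pipeline(SAMPLE, RULES)
    for row in rule_report:
        print(row)
```

Counting affected rows by comparing each record before and after the rule keeps the report honest: a rule that matches a row but leaves it unchanged is not counted.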