Like war, data cleaning is tedium marked by moments of sheer terror.
The purpose of data cleaning is to ensure that the values on which you will base your analysis are correct (or at least not obviously erroneous). The process of reviewing columns of figures for stray commas or (here’s the terrifying bit) finding that a value for one observation is erroneously associated with another observation demands high focus but provides few thrills.
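The stray-comma kind of check is one of the few parts of this work that a script can take a first pass at. What follows is only a minimal sketch, not the procedure used for the data set: the function name, the column names, and the sample rows are all invented for illustration. It flags any cell in a supposedly numeric column that does not parse as a number.

```python
import csv
import io


def find_suspect_cells(csv_text, numeric_columns):
    """Return (line_number, column, raw_value) for cells that fail to parse as numbers.

    Hypothetical helper: assumes a header row and one record per line.
    Empty cells are flagged too, since float("") also fails.
    """
    suspects = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Line 1 is the header, so the first data row is line 2.
    for line_number, row in enumerate(reader, start=2):
        for column in numeric_columns:
            raw = (row.get(column) or "").strip()
            try:
                float(raw)
            except ValueError:
                suspects.append((line_number, column, raw))
    return suspects


# Invented sample data: one stray comma, one letter "l" typed in place of a "1".
sample = 'case_id,year,participants\n1,2009,1500\n2,2010,"2,300"\n3,20l1,40\n'
print(find_suspect_cells(sample, ["year", "participants"]))
# → [(3, 'participants', '2,300'), (4, 'year', '20l1')]
```

A check like this only catches values that look wrong. The terrifying case, a perfectly plausible value sitting on the wrong observation, still has to be caught by comparing against the original source.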
I’m currently cleaning data for the Global Digital Activism Data Set (yes, after three years we will very shortly have results worth using). To keep my brain turned on, I am relying on pop music (Katy Pearce also recommends this method for energized academic writing). My current favorite: Robyn. Enjoy with me:
Cross-posted from the Digital Activism Research Project