The Rule of Three
Make 3 copies (e.g. original + external/local + external/remote)
Copies should be geographically distributed (local vs. remote) if possible
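As a rough illustration of the rule above, the following Python sketch copies one file to two backup locations and checks each copy against the original with a SHA-256 checksum. The folder names, the file name, and the helper functions `sha256_of` and `back_up` are assumptions made for this example; temporary folders stand in for a real external drive and remote server.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(original, destinations):
    """Copy `original` into each destination folder and verify every copy."""
    expected = sha256_of(original)
    copies = []
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        # copy2 also preserves timestamps and returns the new file's path
        copy = Path(shutil.copy2(original, folder))
        if sha256_of(copy) != expected:
            raise IOError(f"Checksum mismatch for {copy}")
        copies.append(copy)
    return copies


# Demonstration: temporary folders stand in for real storage locations.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    original = root / "working" / "survey.csv"  # hypothetical data file
    original.parent.mkdir()
    original.write_text("id,age\n1,34\n2,29\n")

    copies = back_up(
        original,
        [root / "external_drive", root / "remote_server"],  # assumed locations
    )
    total_copies = 1 + len(copies)  # original + local backup + remote backup
    print(total_copies)  # prints 3
```

In practice the two destinations would point to physically separate storage, for example an external hard drive and a university server, so that one incident cannot destroy all three copies.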
Backup Options
Computer and workstation hard drives, external hard drives, university servers
CDs and DVDs are not recommended
Cloud Storage
Data cleaning is the process of detecting, diagnosing, and editing faulty data.
Several tools are available to facilitate data cleaning and analysis:
OpenRefine is a free, open source tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.
Tabula is a free, open source tool allowing users to extract data tables from PDFs and convert them into spreadsheets.
R is a free, open source programming language used for data analysis, from statistical modeling to data visualization.
Python is a free, open source, general-purpose programming language that is widely used for cleaning, transforming, and analyzing data.
SPSS Statistics is a proprietary statistical software platform from IBM.
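To make the three steps of data cleaning (detecting, diagnosing, and editing) more concrete, here is a minimal sketch in Python using only the standard library. The sample data, the column names, the plausible age range, and the functions `detect` and `clean` are all assumptions chosen for illustration; a real project would set its own rules and might prefer one of the tools listed above.

```python
import csv
import io

# Hypothetical raw data with three typical faults:
# stray whitespace, a missing value, and an implausible age.
RAW = """id,name,age
1, Alice ,34
2,Bob,
3,Carol,340
4,Dave,41
"""

AGE_RANGE = (0, 120)  # assumed plausible range; adjust per study


def detect(rows):
    """Detect and diagnose: return (row id, problem) pairs for suspicious ages."""
    problems = []
    for row in rows:
        age = row["age"].strip()
        if age == "":
            problems.append((row["id"], "missing age"))
        elif not age.isdigit():
            problems.append((row["id"], "non-numeric age"))
        elif not AGE_RANGE[0] <= int(age) <= AGE_RANGE[1]:
            problems.append((row["id"], "age out of range"))
    return problems


def clean(rows, flagged_ids):
    """Edit: trim whitespace and blank out flagged ages instead of guessing."""
    cleaned = []
    for row in rows:
        fixed = {key: value.strip() for key, value in row.items()}
        if fixed["id"] in flagged_ids:
            fixed["age"] = ""  # mark as missing; the raw data is left untouched
        cleaned.append(fixed)
    return cleaned


rows = list(csv.DictReader(io.StringIO(RAW)))
problems = detect(rows)
cleaned = clean(rows, {row_id for row_id, _ in problems})

for row_id, problem in problems:
    print(row_id, problem)
```

The sketch writes its corrections to a new list and leaves the raw rows unchanged, so the original data can always be rechecked against the cleaned version.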