StATS: Lost files (May 23, 2006) Category: Data management
I work on these web pages from my desktop computer and two different laptop computers. I also have an Administrative Assistant who will sometimes update my web pages from her computer. In the middle of all of this, I ended up copying an old file on top of a new file and lost several weblog entries. With a bit of effort, I did find them in a backup zip file that I had made last week.
I think this is a sign that I am too busy and my life is too complicated. I need to create a foolproof system for entering weblog items, but it helps to keep in mind the following quote.
"Every day, man is making bigger and better fool-proof things, and every day, nature is making bigger and better fools. So far, I think nature is winning." - Albert Einstein, as found at www.quotedb.com/quotes/2701
I advise people all the time on how to set up a foolproof data entry system. You can't anticipate everything that can go wrong, but here are some things that can help avert a disaster.
1. Store your data files on a network folder. If your network is like mine, it is run by people who know how to protect data, both from accidental loss and from attacks by malicious hackers.
2. If you make changes to your data, allow for a way to gracefully go back to a previous version. You may regret your changes later. I use a series of names, either with a sequence number (atopic01.txt, atopic02.txt, etc.) or with an encoded date (web060315.zip, web060401.zip). I like to use a yymmdd format or a yyyymmdd format for encoded dates because then the file names sort in date order.
3. Document your data with such details as the units of measurement and the meanings of various categories. There are ways to make a data set self-documenting, but you could also just keep an extra sheet of paper or a file that has this information.
4. Have one person who is clearly "in charge." Anyone who assists needs to be sure that what they do during data entry does not clash with the work of the person in charge. Don't have two people doing data entry at the same time unless you have set up a database that is formally designed for multiple users.
5. Some programs will only pull up a single record of data at a time, so if a power glitch strikes your computer, you never lose more than that single record. Other programs have an autosave feature (be sure you turn this on) that saves the file you are working on at predefined intervals. If your software for data entry has neither of these features, get in the habit of saving your data every five minutes or so.
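The dated file names in point 2 are easy to automate. Here is a minimal sketch in Python, assuming you want a copy stored next to the original with a yyyymmdd stamp added to the name. The function names `dated_backup_name` and `backup_file` are made up for this illustration and do not come from any particular package.

```python
import shutil
from datetime import date
from pathlib import Path


def dated_backup_name(path, when=None):
    """Return a sibling path with a yyyymmdd stamp added before the extension."""
    path = Path(path)
    when = when or date.today()
    # Example: web.zip on March 15, 2006 becomes web_20060315.zip
    return path.with_name(f"{path.stem}_{when:%Y%m%d}{path.suffix}")


def backup_file(path, when=None):
    """Copy a file to its dated backup name, refusing to overwrite an existing backup."""
    source = Path(path)
    target = dated_backup_name(source, when)
    if target.exists():
        # Never copy on top of an existing file; that is how entries get lost.
        raise FileExistsError(f"{target} already exists; not overwriting it")
    shutil.copy2(source, target)  # copy2 also keeps the modification time
    return target
```

Calling `backup_file("atopic.txt")` before you start editing leaves a dated copy you can go back to. Because the stamp puts the year first, an alphabetical listing of the backups is also a chronological one.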
I regularly counsel people who have lost data (fortunately, the loss of a LARGE amount of data is fairly rare). It's impossible to develop a system that is perfect, but if you are regularly finding yourself having to retype data, it might be worthwhile to talk about improving your data entry system.
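If your data entry tool has neither record-at-a-time saving nor autosave (point 5 above), one way to get the same protection is to write each record to disk as soon as it is entered. What follows is only a sketch in Python, assuming the data goes into a plain CSV file. The function name `append_record` and the column names in the usage note are hypothetical.

```python
import csv
import os
from pathlib import Path


def append_record(path, record, fieldnames):
    """Append one record to a CSV file and push it to disk right away."""
    path = Path(path)
    # Write the header row only when the file is new or empty.
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        if is_new:
            writer.writeheader()
        writer.writerow(record)
        handle.flush()
        os.fsync(handle.fileno())  # ask the operating system to write to disk now
```

For example, `append_record("entries.csv", {"id": 1, "weight_kg": 3.2}, ["id", "weight_kg"])` adds one row. Since the file is opened in append mode, earlier rows are never rewritten, so a power glitch should cost you at most the record you were typing.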
Related web pages:
- Stats: More lessons learned the hard way (January 31, 2006)
- Stats: Hard learned lessons (November 25, 2005)
- Stats: Non-destructive data editing (November 2, 2005)
- Stats: Another disaster averted (August 16, 2005)
- Stats: Merging in R (July 26, 2005)
- Stats: Coding race/ethnicity (February 3, 2003)
- Stats: Date calculations in SPSS
- Stats: Documenting your SPSS data sets
- Stats: General guide to data entry
- Stats: Longitudinal data
- Stats: Merging files in SPSS
- Stats: Modifying SPSS data
- Stats: Spreadsheet or database
- Stats: Inputting a two-by-two table into SPSS
This work is licensed under a Creative Commons Attribution 3.0 United States License. It was written by Steve Simon.
This page was written by Steve Simon while working at Children's Mercy Hospital. Although I do not hold the copyright for this material, I am reproducing it here as a service, as it is no longer available on the Children's Mercy Hospital website.