Data Quality and “Damn You, Auto Correct!”

This past week a Georgia high school student sent a text message that was supposed to read “gunna be at west hall today”.  Unfortunately the auto-correct feature on his smart phone changed the first word to “gunman”.  To make matters worse, he evidently fat-fingered the address he sent the text to, so the person who received it didn’t know who he was.  The recipient took the appropriate cautionary step and contacted the police.  The result?  A high school was shut down for a few hours until things were straightened out.

There are so many directions to go with this.  We could focus on the auto-correct feature and how this tool seems to have caused much frustration and laughter by automatically mistyping things for people – though I do question how many items at a site like “damnyouautocorrect.com” (DYAC) are truly accidental.  Or we could focus on the myriad interesting things I found while researching this post.  How about the site that teaches you “How to Make Siri Curse Like a Sailor” (careful if you are easily offended), or “Tips to Add Words to iPhone’s Dictionary” (for older versions of iPhone) and “iOS 5: How to add words to the auto-correct dictionary for iOS5”?  Or, if you like the way Android presents predictive text in the keyboard bar, “Enable iOS 5’s Android-Like Autocorrect Keyboard Without Jailbreak” does just what it says it does.

Or we could discuss the fat-fingering of the address.  This is actually of most interest to me.  I’ve researched keystroke error rates in the computer space before and concluded that the average typist fat-fingers a keystroke about 5% of the time, while professionals might do so 1%-2% of the time.  The thing about these percentages is that they can be very deceiving.  A 5% error rate can explode in the wrong situations.  For instance, imagine a 10-character field.  If you had to fill in that field 10 times, you would type 100 characters, and at a 5% error rate you would get 5 of them wrong.  The best field-level error rate you could achieve would be 10%.  That happens when all 5 errors land in a single field (5 of the 10 characters in that one field are wrong) and every other field is correct.  1 in 10 fields being incorrect is a 10% error rate.

At worst your effective error rate would be 50%.  This would occur if each of your 5 keystroke errors (5% of 100 characters) landed in a different field.  If 5 of 10 fields each have 1 error, your overall error rate across the fields is 50%.  Corporate data can go really bad really quickly in this scenario.  Of course most systems have found ways to limit these types of issues through check boxes, drop-down lists and other validations.  But sometimes data entry can’t be avoided.  Entering customer contact information is one such situation.  Get 5 in 10 email addresses wrong and there goes your email marketing campaign.  Even if you are careful, error rates can be significant.  I read a study some time ago that indicated that careful typists often catch their own mistakes, so professionals usually average only 1 uncaught error per 300 keystrokes.  That’s much better, but it can still translate into problems for your enterprise.  Using the same 10-character field, 1 uncaught error per 300 keystrokes would equate to roughly 1 in 30 fields being erroneous.  That’s over 3% – still pretty bad, but I know some organizations that would love to have data quality problems with only 3% of their data.
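
For anyone who wants to play with these numbers, here is a rough Python sketch of the arithmetic above.  The function names are my own invention for illustration.  The last function adds one assumption that is not part of the figures above: that each keystroke goes wrong independently of the others, which real typing may not do.

```python
import math

def field_error_bounds(fields, chars_per_field, char_error_rate):
    """Best and worst share of fields that contain at least one wrong character."""
    total_errors = round(fields * chars_per_field * char_error_rate)
    # Best case: the errors are packed into as few fields as possible.
    best = math.ceil(total_errors / chars_per_field) / fields
    # Worst case: every error lands in a different field.
    worst = min(total_errors, fields) / fields
    return best, worst

def expected_field_error_rate(chars_per_field, char_error_rate):
    """Chance that a field has at least one error.

    Assumes keystroke errors are independent (an illustrative assumption).
    """
    return 1 - (1 - char_error_rate) ** chars_per_field

# 10 fields of 10 characters at a 5% keystroke error rate
print(field_error_bounds(10, 10, 0.05))                  # (0.1, 0.5)
# Same field length, independent errors at 5% and at 1 in 300
print(f"{expected_field_error_rate(10, 0.05):.1%}")      # 40.1%
print(f"{expected_field_error_rate(10, 1 / 300):.1%}")   # 3.3%
```

Under the independence assumption, the 5% typist lands much closer to the 50% worst case than to the 10% best case, and the 1-in-300 typist comes out at about the 3% figure above.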

So this, now, brings me back to where we started.  I’ve got to believe that without autocorrect, keystroke error rates on smart phones would be significantly higher than they are.  Typing on those tiny keyboards is always a pain for me.  I’ve found on my phone that the right place to touch in order to get the letter I want is just to the left of the letter – not right on it.  I’m constantly back-spacing and retyping.  Perhaps this is why there are so many bizarre autocorrect examples in the world (no, I’m not saying I’m responsible for all of them – just that other people must have similar challenges to mine, don’t they?).  I miss the Blackberry keypad!

I think organizations should be very wary of leveraging smart phones for serious business apps that require data entry, unless those apps validate user-entered data extremely rigorously.  Because I expect few developers to be so vigilant in their app development, I believe the future could bring some very unfortunate results from using business apps on smart phones.  While DYAC entries sure can be funny (warning, they can also be quite vulgar), the same errors in business transactions could be catastrophic for your enterprise.  Imagine things going terribly wrong, facing a lawsuit, and having to use the “damnyouautocorrect” defense.  You might end up in a much worse situation than a couple of hours of high school lockdown.