Hiding in plain sight - Dirty Data and Fake Locations


Dirty data is a term for inaccurate information collected during data capture. Dirty data can be misleading, incorrect, inconsistently formatted, incorrectly spelled or punctuated, entered into the wrong field or duplicated. Dirty data can be reduced using input masks or validation rules, but completely removing such data from a source can be impossible or impractical.
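To make the idea of input masks and validation rules concrete, here is a minimal sketch in Python. The field names (name, date_of_birth, phone) and the DD/MM/YYYY mask are assumptions made up for illustration, not taken from any particular system.

```python
import re

# Hypothetical input mask: the date field must look like DD/MM/YYYY.
DATE_MASK = re.compile(r"\d{2}/\d{2}/\d{4}")

def validate_record(record):
    """Return a list of problems found in a single form submission."""
    problems = []
    # Rule 1: the name must not be blank.
    if not record.get("name", "").strip():
        problems.append("name is empty")
    # Rule 2: the date must match the mask exactly.
    if not DATE_MASK.fullmatch(record.get("date_of_birth", "")):
        problems.append("date_of_birth does not match DD/MM/YYYY")
    # Rule 3: an '@' in the phone field suggests data in the wrong field.
    if "@" in record.get("phone", ""):
        problems.append("phone looks like an email address (wrong field?)")
    return problems
```

Rules like these only check the shape of the data. A well-formed but fictional entry passes every one of them, which is why validation cannot remove dirty data completely.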

There are several causes of dirty data. In some cases, the information is deliberately distorted. A person may insert misleading or fictional personal information which appears real. Such dirty data may not be picked up by an administrator or a validation routine because it appears legitimate. Duplicate data can be caused by repeat submissions, user error or incorrect data joining. There can also be formatting issues or typographical errors. A common formatting issue comes from variation in how users prefer to enter phone numbers.
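The phone number problem, and the duplicates it can hide, can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the helper names are made up, each record is a dict with a "phone" key, and no attempt is made to reconcile national and international forms of the same number.

```python
import re

def normalise_phone(raw):
    """Reduce a phone number to its digits, keeping a leading '+' if present."""
    raw = raw.strip()
    prefix = "+" if raw.startswith("+") else ""
    # Drop spaces, brackets, dashes and anything else that is not a digit.
    return prefix + re.sub(r"\D", "", raw)

def find_duplicates(records):
    """Group records that share the same normalised phone number."""
    seen = {}
    for record in records:
        key = normalise_phone(record["phone"])
        seen.setdefault(key, []).append(record)
    # Keep only the numbers that appear more than once.
    return {key: group for key, group in seen.items() if len(group) > 1}
```

With this, "(020) 7946-0018" and "020 7946 0018" reduce to the same key, so a repeat submission typed in a different style shows up as a duplicate instead of a new record.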

I started looking at this topic because I was asked the other day for "A Fake Location". A user wanted to appear to be somewhere that did not exist, so that the data from their tweets would not lead fans to find or track them. It would allow them to hide in plain sight. The concept quickly expanded to cover location, IP address, caller ID and other data that could identify them.

However, it is worth noting this research paper, which shows that reconstructing your identity is possible even with fake data. There is no hiding...

For the rest of us, fake location apps for Android are here: http://www.androidzoom.com/android_applications/fake%20locations