Wednesday, April 13, 2011




How Can We Protect Privacy from Raw Data Distribution and Keep it Anonymous?

If we are going to data mine raw data from the Internet, smartphone communications, and online social networks, then we need to make sure that our raw data is factual. We also need to understand that we will be getting a lot of data from many different places, and it will be distributed among different networks. If the distributed raw data is completely anonymous, comes from many sources or networks, and cannot be trusted, then any data mining results you get will have a significant probability of errors.

In this case you could get too many false positives, such as putting a normal, innocent civilian into a questionable category like a terrorist watch list. Also realize that each piece of raw data will have the ISP attached to it, even if the individual user is not known. This causes challenges, and any particular ISP could end up with a national security letter being issued to them, asking for more information.
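To see why false positives pile up so quickly, here is a minimal sketch of the arithmetic. Every number in it (population size, number of real targets, detection rate, false-positive rate) is a made-up assumption for illustration, not a figure from any real program, and the function name `flagged_breakdown` is hypothetical.

```python
def flagged_breakdown(population, true_targets, sensitivity, false_positive_rate):
    """Return (true_positives, false_positives, precision) for one screening pass.

    sensitivity: fraction of real targets the algorithm flags.
    false_positive_rate: fraction of innocent users the algorithm flags anyway.
    """
    innocents = population - true_targets
    true_positives = true_targets * sensitivity
    false_positives = innocents * false_positive_rate
    flagged = true_positives + false_positives
    # Precision: of everyone flagged, what share are real targets?
    precision = true_positives / flagged if flagged else 0.0
    return true_positives, false_positives, precision


# Hypothetical inputs: one million users, 100 real targets,
# an algorithm that catches 99% of targets and wrongly flags 1% of innocents.
tp, fp, precision = flagged_breakdown(
    population=1_000_000,
    true_targets=100,
    sensitivity=0.99,
    false_positive_rate=0.01,
)
print(f"real targets flagged: {tp:.0f}")
print(f"innocent users flagged: {fp:.0f}")
print(f"share of flagged users who are real targets: {precision:.2%}")
```

Under these assumed numbers, even an algorithm that sounds very accurate flags roughly a hundred innocent users for every real target, because the innocent population is so much larger.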

If you are someone running an ISP, the first thing you'd want to do is give the authorities the information you have, and then try to find a reason to expel that particular individual from your network forever, so they won't cause you any trouble in the future. However, in this case the user, who could be an innocent civilian, has been damaged and kicked off the network for no good reason, only due to bad data, poor analytical skills in the intelligence community, a bad hunch, the wrong questions being asked, or an algorithm that is misaligned with the reality of online usage.

If every piece of data is tagged with a number representing its probable trustworthiness, then the data cannot remain anonymous. If it is tagged with a high probability of being correct, good. But if it is not tagged with the highest number, or if you don't have the actual user or location, then you cannot verify whether that information is valid.
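The tension between trust tagging and anonymity can be sketched in a few lines. This is only an illustration under stated assumptions: the `Record` fields, the `trust_score` weights, and the `is_verifiable` rule are all hypothetical and do not describe how any real system scores data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    payload: str
    isp: str                         # the ISP travels with the data either way
    user_id: Optional[str] = None    # None means the record is anonymous
    location: Optional[str] = None


def trust_score(record: Record, base: float = 0.5) -> float:
    """Hypothetical scoring: each identifying field raises the score."""
    score = base
    if record.user_id is not None:
        score += 0.3
    if record.location is not None:
        score += 0.2
    return min(score, 1.0)


def is_verifiable(record: Record) -> bool:
    """Assume a claim can only be checked when both user and location are known."""
    return record.user_id is not None and record.location is not None


anonymous = Record(payload="some message", isp="ExampleISP")
identified = Record(payload="some message", isp="ExampleISP",
                    user_id="user-123", location="Springfield")

for rec in (anonymous, identified):
    print(rec.user_id, trust_score(rec), is_verifiable(rec))
```

In this sketch the only way to push a record toward the top score is to attach the very fields that identify the user, which is the trade-off described above.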

In that case the anonymous raw data is not very useful, because it cannot be trusted. Further, if the user happens to be in a cluster, or an area where there is a concentration of hate groups, homegrown terrorists, radical religious folks, or even terrorists, they could easily be flagged, or lumped into such a group, when in reality they are completely innocent, and thus just an anomaly in the data.

The other concern is that good guys, such as hobbyists, news scanners, and other people outside the intelligence industry, are often searching for the bad guys too, and by doing so they get themselves on the same watch list as false positives. And just because an individual is clustered with other data, it should not give the authorities meaningful probable cause to harass that individual or cause them some sort of strife, such as being kicked off a network.

Please consider all this. Our intelligence authorities have already made many mistakes and ruined people's lives, and they continue to do so. I am not condemning the cause, but rather warning of the challenges to personal freedom. Yes, there will be mistakes, but it takes a lot to fix mistakes later, when we should prevent them in advance. I guess that's the difference between a genius and just another brilliant IT planner.



Article Source: http://EzineArticles.com/?expert=Lance_Winslow
