Privacy has become increasingly important now that the Internet is part of our everyday lives. Although the word has many connotations in society, the following definition can be adopted for the web: “privacy is the subjective condition a person experiences when two factors are in place, firstly he/she must have the power to control information about himself/herself and secondly he/she must exercise that control consistent with his/her interests and values” [Privacilla.org. Privacilla’s Two-Part Definition of Privacy (http://www.privacilla.org/fundamentals/privacydefinition.html)]. In other words, a web-site needs its users’ permission in order to obtain and analyze data about them.
Considerable controversy has arisen recently over the privacy issues of web personalization [Volokh, E. (2000). Personalization and Privacy. Communications of the ACM, Vol. 43, No. 8, pp. 84-88, August 2000.]. Personalization techniques require rich data from users; a login (usually the user’s name) and a password are not enough. Yet most web-users are unwilling to give more information about themselves. They want to be assured that their personal information will not be shared with anyone else without their prior explicit permission.
Personalization techniques, and therefore web mining techniques, have to overcome the privacy problem in order to deliver satisfactory results. These novel technologies retrieve and analyze data from different sources in order to extract “hidden” information and knowledge. Consequently, privacy is clearly put in jeopardy, since information about users’ online activities is recorded for constructing individual or group profiles. To maximize data gathering opportunities, web-sites collect data from every user touch point, online (registration, transactions, sign-ups, profiles, preferences, surveys, services, web log files, advertising banners, sweepstakes and other promotions requiring user’s data) and offline (services by phone, in-store transactions, paper submissions like sweepstake or promotion entries).
The question that arises from the above is how to protect people from the misuse of personal information on the web. This need for web users’ privacy has created a new market dedicated to designing and developing products that protect information privacy. There are many products for:
- Managing cookies: Burnt Cookies, Cookie Cruncher, Cookie Cutter, MagicCookie Monster, Spy Blocker, Cookie Pal, Cookie Master, Buzof, PGPcookie.cutter, NSClean & IEClean, Complete Cleanup, Cookie Terminator.
- Surfing anonymously through proxy servers: Anonymity 4 Proxy, PrivadaProxy, Internet Junkbuster Proxy, ProxyMate, Anonymizer, Naviscope.
- Encrypting e-mail: PGP, PEM, HushMail, Disappearing E-mail, ZipLip Mail.
- Blocking unwanted files: AdSubtract SE, IDcide Privacy Companion.
- Cleaning residual files: Window Washer, Internet Guard Dog.
- Managing user’s identity: Freedom, Digitalme, Personal Child Persona.
- Purchasing anonymously: ZixCharge, iPrivacy.
- Maintaining user’s firewall: Norton Internet Security 2000.
- Searching the Internet privately: TopClick Private Web Search.
A further technology, the Platform for Privacy Preferences (P3P), was developed by the World Wide Web Consortium (W3C) in 1999 [P3P. Platform for Privacy Preferences Project (http://www.w3.org/P3P).]. P3P is a standard that provides a simple, automated way for users to gain more control over their personal information when visiting web-sites. At its most basic level, P3P is a standardized set of multiple-choice questions covering all the major aspects of a web-site’s privacy policies. Taken together, the answers present a clear snapshot of how a site handles personal information about its users. P3P-enabled web-sites make this information available in a standard, machine-readable format that P3P-enabled browsers can “read” automatically and compare with the user’s own set of privacy preferences. P3P enhances user control by putting privacy policies where users can find them, in a form users can understand, and, most importantly, by enabling users to act on what they see.
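The comparison between a site's declared practices and a user's preferences can be illustrated with a minimal sketch. It is not the P3P specification itself: real P3P policies are XML documents, whereas the dictionary layout, the `Statement` class, the `find_conflicts` function, and the category, purpose and recipient labels below are all simplified assumptions chosen for illustration. The sketch also assumes, conservatively, that a data category the user has expressed no preference about permits nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Hypothetical, simplified stand-in for one policy statement."""
    purposes: frozenset    # why the data is collected
    recipients: frozenset  # who may receive the data


def find_conflicts(site_policy, user_prefs):
    """List every declared practice that the user's preferences do not allow.

    Both arguments map a data category (e.g. "email") to a Statement.
    Returns a list of (category, kind, value) tuples.
    """
    conflicts = []
    for category, declared in site_policy.items():
        # Assumption: no stated preference means nothing is allowed.
        allowed = user_prefs.get(category, Statement(frozenset(), frozenset()))
        for purpose in sorted(declared.purposes - allowed.purposes):
            conflicts.append((category, "purpose", purpose))
        for recipient in sorted(declared.recipients - allowed.recipients):
            conflicts.append((category, "recipient", recipient))
    return conflicts


# What the (hypothetical) site declares about the data it collects.
site_policy = {
    "email": Statement(frozenset({"current", "contact"}), frozenset({"ours"})),
    "clickstream": Statement(frozenset({"admin", "tailoring"}),
                             frozenset({"ours", "unrelated"})),
}

# What the (hypothetical) user is willing to accept.
user_prefs = {
    "email": Statement(frozenset({"current"}), frozenset({"ours"})),
    "clickstream": Statement(frozenset({"admin", "tailoring"}),
                             frozenset({"ours"})),
}

conflicts = find_conflicts(site_policy, user_prefs)
for category, kind, value in conflicts:
    print(f"{category}: {kind} '{value}' is not allowed by the user")
```

In this sketch, a browser-side agent would warn the user or block the request whenever the returned list is non-empty; how a real P3P user agent reacts is up to its implementation.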
Technologies and products for protecting users’ privacy on computers and networks are becoming increasingly popular. However, none can guarantee secure communications, so electronic privacy issues are likely to become even more crucial and intense in the foreseeable future.
In this framework, the full paper will first present some theoretical issues concerning the meaning and significance of privacy. We will then discuss the fine line between web personalization and personal intrusion and control when recording and modeling the user, and identify the cases in which this line is crossed. The need for privacy has also created a market of products designed to protect users’ personal information. We will investigate the solutions that technology can offer to web-site owners and visitors, as well as the most well-known domain standards. Finally, we will discuss open research issues, since, apart from the privacy threats that may emerge from the use of web mining for personalization, the same technology can also be deployed to identify privacy violations.
Furnell, S.M. and Karweni, T. (1999). Security Implications of Electronic Commerce: a Survey of Consumers and Businesses. Internet Research, Vol. 9, No. 5, pp. 372-382.
GVU – Graphics, Visualization and Usability Center of the Georgia Institute of Technology (1996). 6th WWW User Survey (http://www.gvu.gatech.edu/user_surveys/survey-10-1996/#highsum).
Liu, C. and Arnett, K. (2002). An Examination of Privacy Policies in Fortune 500 Web Sites. Mid-American Journal of Business, Vol. 17, No. 1, p. 13.
P3P. Platform for Privacy Preferences Project (http://www.w3.org/P3P).
Personalization Consortium (http://www.personalization.org).
Privacilla.org. Privacilla’s Two-Part Definition of Privacy (http://www.privacilla.org/fundamentals/privacydefinition.html).
Volokh, E. (2000). Personalization and Privacy. Communications of the ACM, Vol. 43, No. 8, pp. 84-88, August 2000.