Web Personalization and the Privacy Concern

AUTHORS
Konstantinos Markellos, Penelope Markellou, Maria Rigou, Spiros Sirmakessis and Athanasios Tsakalidis

ABSTRACT

Privacy has become increasingly important now that the Internet is part of our lives. Even though the word has many connotations in society, the following definition can be adopted for the case of the web: “privacy is the subjective condition a person experiences when two factors are in place, firstly he/she must have the power to control information about himself/herself and secondly he/she must exercise that control consistent with his/her interests and values” [Privacilla.org. Privacilla’s Two-Part Definition of Privacy (http://www.privacilla.org/fundamentals/privacydefinition.html)]. In other words, a web-site needs its users’ permission in order to obtain and analyze data about them.

Considerable controversy has arisen recently over the privacy issues of web personalization [Volokh, E. (2000). Personalization and Privacy. Communications of the ACM, Vol. 43, No. 8, pp. 84-88, August 2000.]. Personalization techniques clearly require rich data from users: a login (usually the user’s name) and a password are not enough. Most web users, however, are not willing to give more information about themselves. They want to be assured that their personal information will not be shared with anyone else without their prior explicit permission.

The 6th WWW User Survey, conducted by the Graphics, Visualization and Usability Center of the Georgia Institute of Technology, showed that the main reason for not registering on a web-site is that the terms and conditions governing how the collected information will be used are not clearly specified (70%) [GVU – Graphics, Visualization and Usability Center of the Georgia Institute of Technology (1996). 6th WWW User Survey (http://www.gvu.gatech.edu/user_surveys/survey-10-1996/#highsum)]. Another survey, conducted by the Personalization Consortium, indicated that privacy issues are important to users, but that they would share personal information in exchange for better services [Personalization Consortium (http://www.personalization.org).]. Moreover, 58% of users require a privacy statement from the web-site, and 51% even read it before registering on the site. Furnell and Karweni [Furnell, S.M. and Karweni, T. (1999). Security Implications of Electronic Commerce: a Survey of Consumers and Businesses. Internet Research, Vol. 9, No. 5, pp. 372-382.] found in their study that 87.5% of surveyed consumers expect to see comprehensive information regarding privacy policy when visiting a commerce web-site. On the other hand, the survey in [Liu, C. and Arnett, K. (2002). An Examination of Privacy Policies in Fortune 500 Web Sites. Mid-American Journal of Business. Vol. 17, No. 1, pp. 13.] examined the web-sites of the Fortune 500 and showed that slightly more than 50% of the sites provide privacy policies on their home pages.

So, personalization, and therefore web mining, techniques have to overcome the privacy problem in order to present satisfactory results. These novel technologies retrieve and analyze data from different sources in order to extract “hidden” information and knowledge. Consequently, privacy is clearly put in jeopardy, since information about online user activities is recorded for constructing individual or group profiles. To maximize data gathering opportunities, web-sites collect data from every user touch point, online (registration, transactions, sign-ups, profiles, preferences, surveys, services, web log files, advertising banners, sweepstakes and other promotions requiring the user’s data) and offline (services by phone, in-store transactions, paper submissions such as sweepstake or promotion entries).
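To make the privacy risk concrete, the following sketch shows how little is needed to turn recorded activity into per-user profiles. It is illustrative only: the log records, the category labels and the `build_profiles` function are invented for this example and do not come from any particular web mining tool.

```python
from collections import Counter, defaultdict

# Hypothetical, already-parsed log records:
# (user identifier, category of the requested page).
LOG = [
    ("user-17", "sports"),
    ("user-17", "sports"),
    ("user-17", "travel"),
    ("user-42", "finance"),
]


def build_profiles(records):
    """Count, for each user, how often each page category was requested."""
    profiles = defaultdict(Counter)
    for user_id, category in records:
        profiles[user_id][category] += 1
    return profiles


profiles = build_profiles(LOG)
# The most frequent category is a crude guess at the user's main interest.
print(profiles["user-17"].most_common(1))  # [('sports', 2)]
```

Even this simple frequency count reveals an individual's dominant interest; merging it with registration or transaction data is what turns an anonymous click trail into a personal profile.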

From the above, the question that arises is how to protect people from the misuse of personal information on the web. This need for web users’ privacy has created a new market dedicated to designing and developing products that protect information privacy. There are many products for:

  • Managing cookies: Burnt Cookies, Cookie Cruncher, Cookie Cutter, MagicCookie Monster, Spy Blocker, Cookie Pal, Cookie Master, Buzof, PGPcookie.cutter, NSClean & IEClean, Complete Cleanup, Cookie Terminator.
  • Surfing anonymously through proxy servers: Anonymity 4 Proxy, PrivadaProxy, Internet Junkbuster Proxy, ProxyMate, Anonymizer, Naviscope.
  • Encrypting e-mail: PGP, PEM, HushMail, Disappearing E-mail, ZipLip Mail.
  • Blocking unwanted files: AdSubtract SE, IDcide Privacy Companion.
  • Cleaning residual files: Window Washer, Internet Guard Dog.
  • Managing user’s identity: Freedom, Digitalme, Personal Child Persona.
  • Purchasing anonymously: ZixCharge, iPrivacy.
  • Maintaining user’s firewall: Norton Internet Security 2000.
  • Searching the Internet privately: TopClick Private Web Search.
  • Generating privacy policy: Privacy Wizard, DMA’s Privacy Policy Generator, OECD Privacy Policy Generator, Policy Editor, P3Pwriter Privacy Policy Editor.

The last of these is based on the Platform for Privacy Preferences (P3P), which was developed by the World Wide Web Consortium (W3C) in 1999 [P3P. Platform for Privacy Preferences Project (http://www.w3.org/P3P).]. P3P is a standard that provides a simple and automated way for users to gain more control over their personal information when visiting web-sites. At its most basic level, P3P is a standardized set of multiple-choice questions covering all the major aspects of a web-site’s privacy policies. Taken together, the answers present a clear snapshot of how a site handles personal information about its users. P3P-enabled web-sites make this information available in a standard, machine-readable format that P3P-enabled browsers can “read” automatically and compare to the user’s own set of privacy preferences. P3P enhances user control by putting privacy policies where users can find them, in a form users can understand, and, most importantly, by enabling users to act on what they see.

Many web-sites provide a privacy statement or a P3P policy that the user can view with a browser. P3P helps protect the privacy of users’ personal information on the Internet by simplifying the process of deciding whether, and under what circumstances, personal information is disclosed to web-sites. However, while P3P provides a standard mechanism for describing privacy practices, it does not set a privacy standard that web-sites must follow. In Internet Explorer, for instance, the user can define his/her privacy preferences for handling cookies. When he/she browses to a web-site, Internet Explorer determines whether the site provides P3P privacy information. For sites that provide this information, the browser compares the user’s privacy preferences to the site’s privacy policy information. In this manner, Internet Explorer decides whether to allow cookies or restrict them. As an example, the user can choose to block cookies that use personally identifiable information without his/her clear consent. A P3P-compliant web-site must provide a clear definition of its privacy policies.
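The comparison of user preferences with a site’s declared practices can be sketched as follows. This is a minimal illustration under stated assumptions, not Internet Explorer’s actual algorithm and not the real P3P vocabulary: the policy fields (`collects_pii`, `explicit_consent`, `shares_with_third_parties`), the preference names and the `decide_cookie` function are all hypothetical, chosen only to show the kind of matching a P3P-enabled browser might perform.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SitePolicy:
    """Hypothetical, simplified stand-in for a site's machine-readable policy."""
    collects_pii: bool               # uses personally identifiable information
    explicit_consent: bool           # asks the user before using that information
    shares_with_third_parties: bool  # passes data on to other parties


@dataclass(frozen=True)
class UserPreferences:
    """Hypothetical user settings; not the option names of any real browser."""
    block_pii_without_consent: bool = True
    block_third_party_sharing: bool = False
    block_sites_without_policy: bool = False


def decide_cookie(policy: Optional[SitePolicy], prefs: UserPreferences) -> str:
    """Return "allow" or "block" for a cookie set by a site with this policy."""
    if policy is None:
        # The site publishes no machine-readable policy at all.
        return "block" if prefs.block_sites_without_policy else "allow"
    if (prefs.block_pii_without_consent
            and policy.collects_pii
            and not policy.explicit_consent):
        return "block"
    if prefs.block_third_party_sharing and policy.shares_with_third_parties:
        return "block"
    return "allow"


prefs = UserPreferences()
print(decide_cookie(SitePolicy(True, False, False), prefs))  # block
print(decide_cookie(SitePolicy(True, True, False), prefs))   # allow
print(decide_cookie(None, prefs))                            # allow
```

Note that the decision rests entirely on what the site declares about itself, which mirrors the limitation stated above: the mechanism describes privacy practices but does not enforce them.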

Nowadays, new technologies and products for protecting users’ privacy on computers and networks are becoming increasingly popular. However, none can guarantee secure communications. Electronic privacy issues will therefore become highly crucial and intense in the foreseeable future.

In this framework, the full paper will first present some theoretical issues concerning the meaning and the significance of privacy. We will then discuss the fine line between web personalization and personal intrusion and control when recording and modeling the user. The cases in which this line is crossed will also be identified. The need for privacy has also created a market of products designed to protect users’ personal information. The solutions that technology can offer to web-site owners and visitors, and the most well-known domain standards, will be investigated. Finally, we will discuss open research issues, since, apart from the privacy threats that may emerge from the use of web mining for personalization, the same technology can also be deployed to identify privacy violations.

REFERENCES

Furnell, S.M. and Karweni, T. (1999). Security Implications of Electronic Commerce: a Survey of Consumers and Businesses. Internet Research, Vol. 9, No. 5, pp. 372-382.

GVU – Graphics, Visualization and Usability Center of the Georgia Institute of Technology (1996). 6th WWW User Survey (http://www.gvu.gatech.edu/user_surveys/survey-10-1996/#highsum).

Liu, C. and Arnett, K. (2002). An Examination of Privacy Policies in Fortune 500 Web Sites. Mid-American Journal of Business, Vol. 17, No. 1, pp. 13.

P3P. Platform for Privacy Preferences Project (http://www.w3.org/P3P).

Personalization Consortium (http://www.personalization.org).

Privacilla.org. Privacilla’s Two-Part Definition of Privacy (http://www.privacilla.org/fundamentals/privacydefinition.html).

Volokh, E. (2000). Personalization and Privacy. Communications of the ACM, Vol. 43, No. 8, pp. 84-88, August 2000.