Privacy-preserving Data Mining for Personalized Marketing

Yong Jick Lee


In electronic commerce markets, firms can easily obtain customers’ personal information such as identity, demographic information, and shopping behavior. Research on securing statistical databases has introduced several tools and methods for protecting databases that contain sensitive personal information. Although data perturbation methods secure a database very effectively, they are not applicable to analyses beyond simple statistics such as means, variances, and covariances. In this paper, I propose Multiple Staged Slice Perturbation Methods so that perturbed data can be used for RFM (recency, frequency, monetary) analysis. My study shows that a simple modification to perturbation methods makes RFM analysis possible: slicing the database into deciles and perturbing each decile separately maintains the mean and the standard deviation of each decile. I also show that current data security methods may not be applicable to business analyses that depend on more than the means, standard deviations, and covariances of variables. Since the perturbation method guarantees protection against exact disclosure, there is no threat of exact disclosure even if the data is partitioned into small pieces and each piece is perturbed individually. However, because partitioning limits the range over which values are shuffled, partial disclosure is possible. Therefore, to achieve maximum utility while preserving the maximum security level, the number of partitions should be minimized.
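The slicing idea can be illustrated with a minimal sketch. This is not the Multiple Staged Slice Perturbation Method itself, whose details the abstract does not give; it assumes a simple additive Gaussian noise step followed by rescaling, and the names `perturb_slice`, `sliced_perturbation`, and `noise_scale` are hypothetical. It shows only the property stated above: each decile keeps its own mean and standard deviation after perturbation.

```python
import random
import statistics


def perturb_slice(values, rng, noise_scale=0.5):
    """Add Gaussian noise to one slice, then rescale so that the slice
    keeps its original mean and (population) standard deviation.
    noise_scale is an illustrative assumption, not a value from the paper."""
    if len(values) < 2:
        return list(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return list(values)  # a constant slice has nothing to rescale
    noisy = [v + rng.gauss(0.0, noise_scale * std) for v in values]
    noisy_mean = statistics.fmean(noisy)
    noisy_std = statistics.pstdev(noisy)
    if noisy_std == 0:
        return list(values)
    # Standardize the noisy values, then map back to the original moments.
    return [mean + (x - noisy_mean) * (std / noisy_std) for x in noisy]


def sliced_perturbation(values, n_slices=10, seed=0):
    """Rank the records, cut them into n_slices equal-sized slices
    (deciles by default), and perturb each slice independently."""
    rng = random.Random(seed)
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    result = [0.0] * n
    for k in range(n_slices):
        idx = order[k * n // n_slices:(k + 1) * n // n_slices]
        perturbed = perturb_slice([values[i] for i in idx], rng)
        for i, p in zip(idx, perturbed):
            result[i] = p
    return result
```

Because each record is perturbed only with noise scaled to its own slice, a perturbed value stays close to its decile's range. That is the trade-off noted above: more slices preserve more of the structure needed for RFM analysis, but they also narrow the range an attacker must consider for any one record.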


Data Mining, Data Perturbation, Privacy-preserving Database, Statistical Database, Database Security





