Apply the spectral swapping algorithm to data

spectralswap.py

Author: Staal A. Vinterbo
Copyright: 2009-2012 Staal A. Vinterbo
Version: generated for 0.1
Availability: GPL
URI: spectralswap.py, dummify

Module Synopsis

Let m be a data matrix (a list of rows) in which each column represents an attribute and each row holds the attribute values of one object. Also let which be a list of the indices of the columns that contain categorical data. Then:

from dummify import maxi
from spectralswap import spectralswapping
d = spectralswapping(m, which, maxi)

produces a matrix d that is essentially m with the values within each column permuted among the rows. The function:

spectralswapping

depends on the module dummify.
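
The synopsis does not describe what dummify does with the categorical columns listed in which. The following is a minimal sketch of one common approach, assuming that categorical values are dummy coded (one 0/1 indicator column per level) before the numeric computation and decoded afterwards by picking the level with the largest indicator value. The function names dummy_encode and decode_max are hypothetical and are not part of dummify's API.

```python
def dummy_encode(column):
    """Dummy code one categorical column (hypothetical helper).

    Returns the sorted list of levels and, for each value in the
    column, a list of 0/1 indicators with a single 1.0 at the
    position of that value's level.
    """
    levels = sorted(set(column))
    index = {level: i for i, level in enumerate(levels)}
    rows = [[1.0 if index[value] == i else 0.0 for i in range(len(levels))]
            for value in column]
    return levels, rows


def decode_max(levels, rows):
    """Map indicator rows back to levels (hypothetical helper).

    After a numeric transformation the indicators are generally no
    longer exactly 0 or 1, so each row is decoded to the level whose
    indicator is largest.
    """
    return [levels[max(range(len(levels)), key=row.__getitem__)]
            for row in rows]
```

Encoding followed immediately by decoding returns the original column; the decoding step only matters once the indicator values have been perturbed.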

To generate an HTML version of this short explanation:

$ python spectralswap.py -e | rst2html > explanation.html

rst2html is part of the Python docutils package: http://docutils.sourceforge.net/docs/

Theory

In order to preserve inter-column relationships, the actual permutation is done along eigenvectors. Details can be found in:

Thomas A. Lasko, Staal A. Vinterbo, "Spectral Anonymization of Data,"
IEEE Transactions on Knowledge and Data Engineering, pp. 437-446, March, 2010

http://doi.ieeecomputersociety.org/10.1109/TKDE.2009.88
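
The idea of permuting along eigenvectors can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the code in spectralswap.py: it assumes that all columns are already numeric, it uses a singular value decomposition to obtain the spectral basis, and it skips any centering, scaling, or categorical handling. The function name spectral_swap_sketch and its seed parameter are hypothetical.

```python
import numpy as np


def spectral_swap_sketch(m, seed=None):
    """Permute a numeric data matrix along its singular vectors.

    Illustrative only: assumes every column of m is numeric and
    ignores the categorical handling done by the real module.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(m, dtype=float)

    # Thin SVD: x = u @ diag(s) @ vt. The rows of vt form the
    # spectral basis; u holds each object's coordinates in it.
    u, s, vt = np.linalg.svd(x, full_matrices=False)

    # Shuffle each column of u independently, so the objects swap
    # coordinates along one basis vector at a time.
    swapped = np.column_stack(
        [rng.permutation(u[:, j]) for j in range(u.shape[1])]
    )

    # Map the permuted coordinates back to the attribute space.
    return (swapped * s) @ vt


# Example call on a small matrix with two correlated columns.
m = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2]]
d = spectral_swap_sketch(m, seed=0)
```

Because the shuffling happens in the spectral basis rather than directly on the original columns, the result keeps the shape of m and the same overall (Frobenius) norm, while the rows no longer correspond to the original objects.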

Note

This implementation was supported by NIH NLM grant 7R01LM007273-07 and NIH Roadmap for Medical Research grant U54 HL108460.