.. -*- mode: rst; fill-column: 78 -*-
.. ex: set sts=4 ts=4 sw=4 et tw=79:
  ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ###
  #
  #   See COPYING file distributed along with the PyMVPA package for the
  #   copyright and license terms.
  #
  ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ###

.. _intro:

************
Introduction
************

.. index:: MVPA

PyMVPA is a Python_ module intended to ease pattern classification
analysis of large datasets. It provides high-level abstraction of typical
processing steps and a number of implementations of some popular algorithms.
While it is not limited to neuroimaging data it is eminently suited for such
datasets. PyMVPA is truly free software (in every respect) and additionally
requires nothing but free software to run. Theoretically PyMVPA should run
on anything that can run a Python_ interpreter, although the proof is yet to
come.

PyMVPA stands for *Multivariate Pattern Analysis* in Python_.

.. _Python: http://www.python.org


What this Manual is NOT
~~~~~~~~~~~~~~~~~~~~~~~

.. index:: textbook, review, API reference, examples

This manual does not attempt to be a comprehensive introduction to
machine learning theory or pattern recognition techniques. There is a wealth
of high-quality textbooks available in this field. A very good example is
`Pattern Recognition and Machine Learning`_ by `Christopher M. Bishop`_.

Two recent reviews by Norman et al. [1]_ and Haynes and Rees [2]_ are a good
starting point for learning about the application of machine learning
algorithms to (f)MRI data.

This manual also does not describe every bit and piece of the PyMVPA package.
For more information, please have a look at the `API documentation`_, which is a
comprehensive and up-to-date description of the whole package.

More examples and usage patterns extending the ones described here can be taken
from the examples shipped with the PyMVPA source distribution (`doc/examples/`)
or even the unit test battery, also part of the source distribution
(in the `tests/` directory).

.. _API Documentation: api/index.html
.. _Christopher M. Bishop: http://research.microsoft.com/~cmbishop/
.. _Pattern Recognition and Machine Learning: http://research.microsoft.com/~cmbishop/PRML

.. [1] Norman, K.A., Polyn, S.M., Detre, G.J. & Haxby, J.V. (2006). Beyond
       mind-reading: multi-voxel pattern analysis of fMRI data. Trends in
       Cognitive Sciences, 10, 424–430.
.. [2] Haynes, J.D. & Rees, G. (2006). Decoding mental states from brain
       activity in humans. Nature Reviews Neuroscience, 7, 523–534.


.. _history:

.. index:: history, MVPA toolbox for Matlab, license, free software

A bit of History
~~~~~~~~~~~~~~~~

The roots of PyMVPA date back to early 2005. At that time it was a C++ library
(no Python_ yet) developed by Michael Hanke and Sebastian Krüger, intended to
make it easy to apply artificial neural networks to pattern recognition
problems.

During a visit to `Princeton University`_ in spring 2005, Michael Hanke
was introduced to the `MVPA toolbox`_ for `Matlab
<http://buchholz.hs-bremen.de/aes/aes_matlab.gif>`_, which had several
advantages over a C++ library. Most importantly it was easier to use. While a
user of a C++ library is forced to write a significant amount of front-end
code, users of the MVPA toolbox could simply load their data and start
analyzing it, because the toolbox provides a common interface to functions
drawn from a variety of libraries.

.. _Princeton University: http://www.princeton.edu
.. _MVPA toolbox: http://www.csbmb.princeton.edu/mvpa/

However, there are some disadvantages to writing a toolbox in Matlab. While
users in general benefit from the power of Matlab, they are at the same time
bound to the goodwill of a commercial company. That this is indeed a problem
becomes obvious when one considers the time when the vendor of Matlab was not
willing to support the Mac platform. Therefore, even though the MVPA toolbox is
`GPL-licensed`_, it cannot fully benefit from the enormous advantages of the
free software development model (free as in free speech, not only as in free
beer).

.. _GPL-licensed: http://www.gnu.org/copyleft/gpl.html

For these reasons, Michael thought that a successor to the C++ library
should remain truly free software, remain fully object-oriented (in contrast
to the MVPA toolbox), but should be at least as easy to use and extensible
as the MVPA toolbox.

After evaluating some possibilities, Michael decided that `Python`_ was the
most promising candidate, fully capable of fulfilling the intended
development goal. Python is a very powerful language that magically combines
the possibility of writing really fast code with a simplicity that allows one
to learn the basic concepts within a few days.

.. index:: RPy, PyMatlab

One of the major advantages of Python is the availability of a huge number of
so-called *modules*. Modules can include extensions written in a hardcore
language like C (or even FORTRAN) and therefore allow one to incorporate
high-performance code without having to leave the Python
environment. Additionally, some Python modules even provide links to other
toolkits. For example, `RPy`_ allows one to use the full functionality of R_
from inside Python. Even Matlab can be used via some Python modules (see
PyMatlab_ for an example).

.. _RPy: http://rpy.sourceforge.net/
.. _R: http://www.r-project.org
.. _PyMatlab: http://code.google.com/p/pymatlab/

After the decision for Python was made, Michael started development with a
simple k-Nearest-Neighbour classifier and a cross-validation class. Using
the mighty NumPy_ package made it easy to support data of any dimensionality.
Therefore PyMVPA can easily be used with 4d fMRI datasets, but equally well
with EEG/MEG data (3d) or even non-neuroimaging datasets.
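
The point about dimensionality can be illustrated with plain NumPy. The
following is only a sketch and does not use PyMVPA's own classes: the array
shapes, the axis order (samples first) and the helper name ``as_samples`` are
assumptions made for the illustration.

```python
import numpy as np

# Hypothetical 4d fMRI-like data: 10 volumes of 4 x 5 x 6 voxels.
# The axis order (volumes first) is an assumption made for this sketch.
volumes = np.arange(10 * 4 * 5 * 6, dtype=float).reshape(10, 4, 5, 6)

# Hypothetical 3d EEG/MEG-like data: 10 trials, 8 channels, 50 time points.
trials = np.zeros((10, 8, 50))


def as_samples(data):
    """Flatten everything but the first axis into a single feature axis."""
    return data.reshape(data.shape[0], -1)


fmri_samples = as_samples(volumes)  # one row per volume
eeg_samples = as_samples(trials)    # one row per trial
print(fmri_samples.shape, eeg_samples.shape)
```

The same helper handles both inputs because it never refers to a fixed number
of axes, which is the property the paragraph above attributes to NumPy.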

.. index:: NIfTI

By September 2007 PyMVPA included support for reading and writing datasets
from and to the `NIfTI format`_, kNN and Support Vector Machine classifiers,
as well as several analysis algorithms (e.g. searchlight and incremental
feature search).

.. _NIfTI format: http://nifti.nimh.nih.gov/

During another visit to Princeton in October 2007, Michael met with `Yaroslav
Halchenko`_ and `Per B. Sederberg`_. That meeting and the following
discussions and hacking sessions of Michael and Yaroslav led to a major
refactoring of the PyMVPA codebase, making it much more flexible/extensible,
faster and easier to use than it had ever been before.

.. _Yaroslav Halchenko: http://www.onerussian.com/
.. _Per B. Sederberg: http://www.princeton.edu/~persed/


.. _requirements:
.. index:: requirements

Prerequisites
~~~~~~~~~~~~~

Like every other Python module, PyMVPA requires at least a basic knowledge of
the Python language. However, if one has no prior experience with Python, one
can benefit from the simplicity of the Python language and acquire this
knowledge within a few days by studying some of the many tutorials available
on the web.

.. links to good tutorials (numpy for matlab users, dive into python, ...)

As PyMVPA is about pattern recognition, a basic understanding of machine
learning principles is necessary to apply its methods correctly and to
ensure that the results are interpretable.

.. index:: dependencies, Python, NumPy

Dependencies
''''''''''''

The following software packages are required or PyMVPA will not work at all.

  Python_ 2.4 (or later)
    With some modifications PyMVPA could probably work with Python 2.3, but as
    it is quite old already and Python 2.4 is widely available there should be
    no need to do this.
  NumPy_
    PyMVPA makes extensive use of NumPy to store and handle data. There is no
    way around it.

.. _NumPy: http://numpy.scipy.org/


.. index:: recommendations, SciPy, PyNIfTI, Shogun, R, RPy

Strong Recommendations
''''''''''''''''''''''

While most parts of PyMVPA will work without any additional software, some
functionality makes use of additional software packages. It is strongly
recommended to install these packages as well.

  SciPy_: linear algebra, standard distributions
    SciPy_ is mainly used by the statistical testing and the logistic
    regression classifier code. However, in the long run SciPy might be used a
    lot more and could become a required dependency of PyMVPA.
  PyNIfTI_: access to NIfTI files
    PyMVPA provides a convenient wrapper for datasets stored in the NIfTI
    format. If you don't need that, PyNIfTI is not necessary, but otherwise
    it makes it really easy to read from and write to NIfTI images.
  Shogun_: various classifiers
    PyMVPA currently can make use of several SVM implementations of the
    Shogun_ toolbox. It requires the modular Python interface of Shogun to be
    installed. Any version from 0.6 on should work.
  R_ and RPy_: more classifiers
    Currently PyMVPA uses R and RPy to provide a wrapper around the LARS
    library.

.. _SciPy: http://www.scipy.org/
.. _libsvm: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
.. _PyNIfTI: http://niftilib.sourceforge.net/pynifti/
.. _Shogun: http://www.shogun-toolbox.org
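
Whether the packages listed above are available can be checked from within
Python before installing PyMVPA. The following is a minimal sketch, not part
of PyMVPA: the import names (``nifti`` for PyNIfTI, ``shogun``, ``rpy``) are
assumptions and may differ between versions and distributions, and the helper
``find_missing`` is hypothetical. It relies on ``importlib``, so it needs a
current Python interpreter rather than Python 2.4.

```python
import importlib.util

# Import names are assumptions for this sketch; they may differ between
# package versions and distributions.
REQUIRED = ("numpy",)
RECOMMENDED = ("scipy", "nifti", "shogun", "rpy")


def find_missing(module_names):
    """Return the names that cannot be located on the import path."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat lookup problems like a missing module.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("missing required packages:", find_missing(REQUIRED))
    print("missing recommended packages:", find_missing(RECOMMENDED))
```

An empty list for the required packages means the hard dependencies are in
place; missing recommended packages only disable the corresponding
functionality.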


.. index:: suggestions, IPython, FSL, AFNI, libsvm

Suggestions
''''''''''''

The following list of software is not required by PyMVPA, but it might make
life a lot easier and lead to more efficiency when using PyMVPA.

  IPython_: frontend
    If you want to use PyMVPA interactively it is strongly recommended to use
    IPython_. If you think: *"Oh no, not another one, I already have to learn
    about PyMVPA."* please invest a tiny bit of time to watch the `Five Minutes
    with IPython`_ screencasts at showmedo.com_, so at least you know what you
    are missing.
  FSL_: preprocessing and analysis of (f)MRI data
    PyMVPA provides some simple bindings to FSL output and filetypes (e.g. EV
    files and MELODIC output directories). This makes it fairly easy to e.g.
    use FSL's implementation of ICA for data reduction and proceed with
    analyzing the estimated ICs in PyMVPA.
  AFNI_: preprocessing and analysis of (f)MRI data
    Similar to FSL, AFNI is a free package for processing (f)MRI data.
    Though its primary data file format is BRIK files, it has the ability
    to read and write NIfTI files, which easily integrate with PyMVPA.
  libsvm_: fast SVM classifier
    Only the C library is required and none of the Python bindings that are
    available on the upstream website. PyMVPA provides its own Python wrapper
    for libsvm which is a fork based on the one included in the libsvm package.
    Additionally the upstream libsvm distribution causes flooding of the console
    with a huge amount of debugging messages. Please see the `Building from
    Source`_ section for information on how to build an alternative version that
    does not have this problem.

.. _IPython: http://ipython.scipy.org
.. _Five Minutes with IPython: http://showmedo.com/videos/series?name=CnluURUTV
.. _showmedo.com: http://showmedo.com
.. _FSL: http://www.fmrib.ox.ac.uk/fsl/
.. _AFNI: http://afni.nimh.nih.gov/afni/


.. _obtaining:

Obtaining PyMVPA
~~~~~~~~~~~~~~~~

.. index:: binary package

Binary packages
'''''''''''''''

The easiest way to obtain PyMVPA is to use pre-built binary packages.
Currently the Debian/Ubuntu family is the only environment for which
binary packages are available (see below). If you manage to build PyMVPA
on Windows or OS X, we would be glad to hear from you.

.. index:: Debian

Debian
^^^^^^

PyMVPA is available as an `official Debian package`_ (`python-mvpa`;
since *lenny*). The documentation is provided by the optional
`python-mvpa-doc` package.

.. _official Debian package: http://packages.debian.org/python-mvpa

.. index:: backports, Debian, Ubuntu

Debian backports and unofficial Ubuntu packages
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Backports for the current Debian stable release and binary packages for recent
Ubuntu releases are available from a `repository at the University of
Magdeburg`_. Please read the `package repository instructions`_ to learn about
how to obtain them.

.. All of the PyMVPA developers have sworn a solemn oath to name their
   first-born child 'Debian'.


.. _repository at the University of Magdeburg: http://apsy.gse.uni-magdeburg.de
.. _package repository instructions: http://apsy.gse.uni-magdeburg.de/main/index.psp?sec=1&page=hanke/debian&lang=en


.. _buildfromsource:


.. index:: building from source, source package, MacOSX

Building from Source
''''''''''''''''''''

If a binary package for your platform and operating system is provided, you do
not have to build the packages on your own -- use the corresponding pre-built
packages instead. However, if there are no binary packages for your system you
can easily build PyMVPA on your own. Any recent Linux distribution should be
capable of doing it. Additionally, we are aware of successful builds on Mac
OS X.

.. _PyMVPA project website: http://pkg-exppsy.alioth.debian.org/pymvpa/

.. index:: releases, development snapshot

The first step is obtaining the sources. The source code tarballs of all
PyMVPA releases are available from the `PyMVPA project website`_.
Alternatively, one can also download a tarball of the latest development
snapshot_ (i.e. the current state of the *master* branch of the PyMVPA source
code repository).

.. _snapshot:  http://git.debian.org/?p=pkg-exppsy/pymvpa.git;a=snapshot;h=refs/heads/master;sf=tgz
.. index:: Git repository

If you want to have access to both the full PyMVPA history and the latest
development code, you can use the PyMVPA Git_ repository, which is publicly
available. To view the repository, please point your web browser to gitweb:

  http://git.debian.org/?p=pkg-exppsy/pymvpa.git

The gitweb browser also allows you to download arbitrary development snapshots
of PyMVPA. For a full clone (aka checkout) of the PyMVPA repository simply
do:

  :command:`git clone git://git.debian.org/git/pkg-exppsy/pymvpa.git`

After a short while you will have a `pymvpa` directory below your current
working directory that contains the PyMVPA repository.

.. _Git: http://git.or.cz/

To build PyMVPA from source simply enter the root of the source tree (obtained
by either extracting the source package or cloning the repository) and run:

  :command:`python setup.py build_ext`

If you are using a Python version older than 2.5, you need to have
python-ctypes (>= 1.0.1) installed to be able to do this.

Now, you are ready to install the package. Do this by invoking:

  :command:`python setup.py install`

Most likely you need superuser privileges for this step. If you want to install
in a non-standard location, please take a look at the :command:`--prefix`
option. You also might want to consider :command:`--optimize`.

Now you should be ready to use PyMVPA on your system.

.. index:: libsvm, SWIG

Build with enabled libsvm bindings
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

From the 0.2 release of PyMVPA on, the libsvm_ classifier extension is not
built by default anymore. However, it is still shipped with PyMVPA and can
be enabled at build time. To be able to do this you need to have SWIG_ and
the development files of libsvm_ (headers and library) installed on your
system. Depending on where you installed them, it might be necessary to
specify the full path to them with the `--include-dirs`, `--library-dirs`
and `--swig` options.

PyMVPA needs a patched libsvm version, as the original distribution generates
a huge amount of debugging messages and therefore makes the console and PyMVPA
output almost unusable. Debian (since lenny: 2.84.0-1) and Ubuntu (since gutsy)
already include the patched version. For all other systems it is easy to build
patched libsvm (see `Building patched libsvm from Source`_).

The command to build all extensions including the libsvm wrapper is::

  PYMVPA_LIBSVM=1 python setup.py build_ext --swig-opts="-c++ -noproxy"

The installation procedure is the same as for a build without libsvm_.
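
When the build is driven from a Python script instead of a shell, the
environment variable and the quoted SWIG options have to be passed
differently. The following is a minimal sketch under that assumption: the
helper names ``build_ext_call`` and ``run_build`` are hypothetical, and only
the ``PYMVPA_LIBSVM`` switch and the options of the shell command above are
taken from this manual.

```python
import os
import subprocess
import sys


def build_ext_call(with_libsvm=False, environ=None):
    """Return (argv, env) for building the extensions in the source root."""
    env = dict(os.environ if environ is None else environ)
    argv = [sys.executable, "setup.py", "build_ext"]
    if with_libsvm:
        # Same switch and SWIG options as in the shell command above.
        env["PYMVPA_LIBSVM"] = "1"
        # One list item: no shell is involved, so no quoting is needed.
        argv.append("--swig-opts=-c++ -noproxy")
    return argv, env


def run_build(with_libsvm=False):
    """Run the build; must be called from the root of the source tree."""
    argv, env = build_ext_call(with_libsvm)
    subprocess.check_call(argv, env=env)
```

Calling ``run_build(with_libsvm=True)`` in the root of the source tree is then
intended to be equivalent to the shell command shown above.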


Building patched libsvm from source
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

First get the patched sources from::

  http://packages.debian.org/source/sid/libsvm

Download the `diff.gz` and the `orig.tar.gz` files offered at the bottom of the
page. Once downloaded, extract the `tar.gz` file and patch it. The following
example refers to *libsvm* version 2.85.0; please adjust the filenames and
versions if you use a later version::

  tar xvzf libsvm_2.85.0.orig.tar.gz
  cd libsvm-2.85
  zcat ../libsvm_2.85.0-1.diff.gz | patch -p1

If `zcat` does not work for you (which might happen on Mac OS X), simply
decompress the diff manually and do::

  patch -p1 < ../libsvm_2.85.0-1.diff

instead to patch the sources. Once this is done, build the library and install
it::

  make libsvm.so.2.85.0
  DESTDIR=/usr/local make install

Set `DESTDIR` to your preferred installation path. For those running Mac OS X,
there is also a `Makefile.osx`.

.. _SWIG: http://www.swig.org

.. Actually, AFAIK upstream libsvm does not easily allow for compiling a libsvm
   static or shared lib. Or am I wrong?


.. index:: alternative build procedure

Alternative build procedure
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Alternatively, if you are doing development in PyMVPA, or if you simply do
not want to install PyMVPA system-wide (or do not have sufficient permissions
to do so), you can call `make` (the same as `make build`) in the top-level
directory of the source tree to build PyMVPA. Then extend or define your
environment variable `PYTHONPATH` to point to the root of the PyMVPA sources
(i.e. where you invoked all previous commands from)::

  export PYTHONPATH=$PWD

However, please note that this procedure also always builds the libsvm_
extension and therefore also requires the patched libsvm version to be
available.
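
Note that the `export` command above defines `PYTHONPATH` from scratch and
thereby drops any value it had before. For the "extend" case, the new value
can be computed as in the following sketch; the helper name
``extended_pythonpath`` is hypothetical and not part of PyMVPA.

```python
import os


def extended_pythonpath(source_root, current=None):
    """Return a PYTHONPATH value with source_root placed in front."""
    if current is None:
        current = os.environ.get("PYTHONPATH", "")
    # Split on the platform's separator and drop empty entries.
    entries = [entry for entry in current.split(os.pathsep) if entry]
    root = os.path.abspath(source_root)
    # Avoid listing the source root twice.
    if root in entries:
        entries.remove(root)
    return os.pathsep.join([root] + entries)


# Print the value to assign to PYTHONPATH when run from the source root.
print(extended_pythonpath(os.getcwd()))
```

Putting the source tree first means that it takes precedence over any other
copy of the package found in the remaining `PYTHONPATH` entries.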


.. index:: installation

Installation
~~~~~~~~~~~~

.. Point to source and binary distribution. Preach idea of free software.
   Step by step guide to install it on difficult systems like Windows.

.. Don't forget to mention that the only reasonable way to use this piece
   of software (like every other piece) is under Debian! Also mention that
   Ubuntu is no excuse ;-)

If there are no binary packages for your operating system or platform yet, you
need to build from source. Please refer to `Building from Source`_ for more
information.

Otherwise just install the binary packages as you would do with any other
package. For example on Debian or Ubuntu simply do::

  sudo aptitude install python-mvpa


.. index:: citation, PyMVPA poster

How to cite PyMVPA
~~~~~~~~~~~~~~~~~~

The PyMVPA toolbox was first presented with a poster_ at the annual meeting of
the *German Society for Psychophysiology and its Application* in Magdeburg,
2008. This is currently the preferred way to cite PyMVPA. However, we have
submitted a paper introducing the toolbox, which should replace the poster
soon.

.. _poster: http://pkg-exppsy.alioth.debian.org/pymvpa/files/PyMVPA_PuG2008.pdf


Credits
~~~~~~~

(needs some more words, for now just a list)

  * NumPy, SciPy
  * libsvm
  * Shogun
  * IPython
  * Debian (for hosting, environment, ...)
  * FOSS community
  * Credits to individual labs if they officially donate time ;-)

.. Please add some notes when you think that you should give credits to someone
   that enables or motivates you to work on PyMVPA ;-)
