Searchmight Lab

UCLA Advanced Neuroimaging Summer Program 2011

(adapted from material written by James Xiang and Francisco Pereira)

The overall goal of this lab is to show you how to use the Searchmight toolbox, which was developed for doing information mapping with classifiers trained on small voxel neighbourhoods (a.k.a. searchlights). The toolbox is written in MATLAB, but a lot of the functionality is also available in PyMVPA thanks to the coding prowess of Yaroslav Halchenko and Michael Hanke. Hence the focus is less on the specifics of the toolbox and more on how you can explore data with it.

The lab is written around a mock dataset similar to what one might acquire in a study of the McGurk effect. Our main reason for using synthetic data is that this way you will see how the tools work at finding complex structure that we know is there, and you should be able to test any of your own ideas for how to use them. There is no shortage of real data in the world with which you can challenge yourself later...

The lab has three parts:

  1. Data handling and visualization
  2. The mock McGurk effect dataset
  3. Data analysis

Before starting, please make sure that Searchmight and the other software we will use are on your path, by typing the following commands in MATLAB:

addpath('SearchmightToolbox');
setupPathsSearchmightToolbox;
addpath('SVM');


Data handling and visualization


Vectorization

In MATLAB, any matrix can be accessed as if it were a vector. To see how this works in practice, create a small random matrix

    a = randn(3,2);

and display it

    a

and now display it as if it were a vector

    a(:)

Note the order in which the values appear: first column 1, then column 2 (MATLAB stores matrices in column-major order). Let's see what happens in 3 dimensions

    b = randn(3,2,2);

If you type

    b

you'll see first

    b(:,:,1)

and then

    b(:,:,2)

If you then access b as if it were a vector, you'll see

    b(:)

Accessing a matrix in this manner is called "vectorization". While this might seem like a disrespectful way to treat the poor matrices, downgrading them to the status of mere vectors, it does have a few uses. Create a new matrix

    c = randn(3,3,3);

and display it

    c

If I asked you to create a new matrix d that is the same as c, except that positions (1,1,1),(2,2,2) and (3,3,3) have the value 1, you would do this by

    d = c;
    d(1,1,1) = 1;
    d(2,2,2) = 1;
    d(3,3,3) = 1;

and at the end you can type

    d(:)

to make sure.

Now what positions in d have become 1? We can use the MATLAB command find to figure this out:

    find(d(:)==1)

which returns the indices 1, 14 and 27.

Now try this

    e = c;
    e([1 14 27]) = 1;
    e(:)

So this is useful not just for displaying the matrix but also for editing it! But how would one know that (2,2,2) is position 14, or (3,3,3) position 27? The positions in a vector are called "indices", and 3D coordinates "subscripts". MATLAB provides functions to convert between indices and subscripts, rather unimaginatively named sub2ind and ind2sub. If you wanted to figure out the index for (3,3,3), you would call sub2ind, so

    index = sub2ind(size(c),3,3,3)

would return 27. Similarly,

    [x,y,z] = ind2sub(size(c),index)

returns x=3,y=3,z=3.
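
This works because MATLAB stores arrays in column-major order: for an m x n x p array, the linear index of subscripts (x,y,z) is x + (y-1)*m + (z-1)*m*n. You can verify this for (3,3,3) in our 3 x 3 x 3 matrix c:

    [m,n,p] = size(c);
    index = 3 + (3-1)*m + (3-1)*m*n   % 3 + 6 + 18 = 27, matching sub2ind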

Knowing this, how would you obtain the indices for (1,1,1)/(2,2,2)/(3,3,3) with one call of sub2ind? And how would you obtain the coordinates for the three indices with one call of ind2sub? You can use

    indices = sub2ind(size(c),[1 2 3],[1 2 3],[1 2 3])
    [xs,ys,zs] = ind2sub(size(c),[1 14 27])

If you are comfortable with all of this, you can skip to the next section. Otherwise you can practice with the following exercise:

  1. Load this MAT file,
    load('lab1.mat')
    in which you will find a 16 x 16 x 16 matrix A.
  2. Using vectorization, you can treat A as a vector, and you know that the max function finds the largest element of a vector. So, what is the value of the largest element in A?
  3. How many elements in A are greater than 3? What are their indices? What are their coordinate subscripts?
  4. Now we are going to set these values (the ones greater than 3) to 1, using two methods: the first uses an index and vectorization (A(i)=1), the second uses the coordinate subscripts (A(x,y,z)=1). Use the first method to modify one value and the second method to modify the rest.
  5. Now check the maximum value in A again; what is it?
Solution:
max(A(:))

indices = find(A(:)>3)   % their indices; length(indices) is how many there are

[x,y,z] = ind2sub([16 16 16],indices);
disp([x y z])            % their coordinate subscripts, one row per element

A(indices(1)) = 1;       % method 1: linear index

A(x(2),y(2),z(2)) = 1;   % method 2: coordinate subscripts
A(x(3),y(3),z(3)) = 1;

max(A(:))

Plotting with imagesc and subplot

Let's create something to plot, a random matrix

    X = randn(100,10);

of 10 variables and 100 datapoints of each, and a correlation matrix between the variables

    C = corrcoef(X);

You can visualize X using imagesc

    imagesc(X)

and add a colorbar with the data scale

    colorbar('vert');

Now let's look at the correlation matrix

    imagesc(C); colorbar('vert')

The correlation of each variable with itself is 1, and those are the values on the diagonal. The scale does not go as far in the negative direction. We can make it do that by passing a scale argument to imagesc

    imagesc(C, [-1 1]); colorbar('vert')

and now 0 corresponds to the green in the middle of the scale. Since we know it's a square matrix, let's also make it look square

    axis square;

You can also make the colorbar horizontal

    colorbar('horiz');

Now let's see how to place multiple matrices in one figure. First create them

    X1 = randn(100,10); X2 = randn(100,10); X3 = randn(100,10); 
    X4 = randn(100,10);
    C1 = corrcoef(X1);  C2 = corrcoef(X2);  C3 = corrcoef(X3); 
    C4 = corrcoef(X4);

then we use the "subplot" command, which lays out multiple plot axes

    clf;
    subplot(1,4,1); imagesc(C1); axis square; colorbar('horiz');
    subplot(1,4,2); imagesc(C2); axis square; colorbar('horiz');
    subplot(1,4,3); imagesc(C3); axis square; colorbar('horiz');
    subplot(1,4,4); imagesc(C4); axis square; colorbar('horiz');

Note two things in each call to subplot:

  - the first two arguments indicate the grid layout (1 row and 4 columns)
  - the third argument indicates which subplot you will be plotting to

If you are comfortable with all of this, you can skip to the next section. Otherwise you can practice with the following exercise:
  1. Modify the code above so that all 4 matrices have the same scale (hint: use a variable shared by all 4).
  2. Read the help for subplot and try to lay these 4 plots in a 2 x 2 grid, rather than a 1 x 4 one.
Solution:
scale = [-1 1];

clf;
subplot(1,4,1); imagesc(C1,scale); axis square; colorbar('horiz');
subplot(1,4,2); imagesc(C2,scale); axis square; colorbar('horiz');
subplot(1,4,3); imagesc(C3,scale); axis square; colorbar('horiz');
subplot(1,4,4); imagesc(C4,scale); axis square; colorbar('horiz');

and

clf;
nrows = 2; ncols = 2;
subplot(nrows,ncols,1); imagesc(C1,scale); axis square; colorbar('horiz');
subplot(nrows,ncols,2); imagesc(C2,scale); axis square; colorbar('horiz');
subplot(nrows,ncols,3); imagesc(C3,scale); axis square; colorbar('horiz');
subplot(nrows,ncols,4); imagesc(C4,scale); axis square; colorbar('horiz');

The CMU data format and the "meta" structure

The goal of this part is to acquaint you with the "meta" structure which accompanies datasets in the CMU format. This is used by both Searchmight and Simitar to speed up spatial computations, but is also handy in other ways. By now you've seen data formats such as nifti that store entire 3D volumes over time. This is somewhat wasteful, as we are often interested in a fraction of those voxels (e.g. those in a mask covering cortex).

The CMU format stores data as an array with #examples x #voxels in a mask, similarly to what is done in the MVPA toolbox for each mask in a dataset, together with a "meta" structure encoding the 3D positions of voxels and their neighbours. The format was developed independently, and the goal was to make it as easy to manipulate data as MATLAB matrices as possible. Both Searchmight and Simitar expect data in this format, but below you will see how easy it is to transform nifti files or similar formats.
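
For instance, here is a minimal sketch of such a conversion, assuming you have a NIfTI reader on your path (e.g. load_nii from the "Tools for NIfTI and ANALYZE image" toolbox) and hypothetical file names data.nii and mask.nii:

    % load a 4D functional image and a 3D binary mask (hypothetical file names)
    nii  = load_nii('data.nii'); data = double(nii.img);     % x by y by z by time
    nii  = load_nii('mask.nii'); mask = double(nii.img) > 0;

    % build the meta structure and the #examples x #voxels matrix
    meta = createMetaFromMask(mask);
    ntimepoints = size(data,4);
    examples = zeros(ntimepoints,length(meta.indicesIn3D));
    for t = 1:ntimepoints
        volume = data(:,:,:,t);
        examples(t,:) = volume(meta.indicesIn3D);
    end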

The best way of introducing this is with an example. Let's say we have a 4D dataset with a 3D volume for each of 10 time points, where each 3D volume has 4 x 4 x 4 voxels

    dataset = randn(4,4,4,10);

Even though these are little volumes, we would like to look at the tiny 2x2x2 brain inside, as defined by this mask

    mask = zeros(4,4,4);
    mask(2:3,2:3,2:3) = 1;

If you want to see what it looks like, you can disp(mask) or

    clf;
    for i = 1:4
        subplot(1,4,i); imagesc(mask(:,:,i),[0 1]); axis square;
    end

The "meta" structure is produced for a mask over the 3D volume that we would like to use. It will contain mappings between 3D and vectorized coordinates, as well as information about which voxels are spatially adjacent to each voxel. Let's start by creating one

    meta = createMetaFromMask(mask);

If you type

    meta

you'll see various fields. For now, I'd like you to look at

    meta.dimensions
    meta.dimx
    meta.dimy
    meta.dimz

The first field has the dimensions of the imaging volume (3D), whereas the other three correspond to dimensions(1), dimensions(2) and dimensions(3), respectively, and are provided for quick reference. Then we have

    meta.indicesIn3D - #voxels x 1 vector

These are just the linear indices in a 4 x 4 x 4 matrix of the voxels that have value 1 in the mask. If you type

    mask(meta.indicesIn3D)

you should thus get a vector with all 1s (since we are only picking the voxels which are marked with 1 in the mask). This can also be used to get the values of voxels (inside the brain mask) from any volume in the dataset, e.g.

    volume = dataset(:,:,:,1);
    values = volume(meta.indicesIn3D);

At this point, I would like you to create a matrix of examples containing only the mask (brain) voxels in our small dataset, which is the kind of input that both Searchmight and Simitar use. This matrix has dimensions # volumes (or examples) x # voxels inside the mask. To do this, you can use

    nvoxels  = length(meta.indicesIn3D);
    examples = zeros(10,nvoxels);
    for i = 1:10
       volume = dataset(:,:,:,i);
       examples(i,:) = volume(meta.indicesIn3D);
    end

As a sanity check, you can look at

    examples(1,:)

and make sure the appropriate voxels in dataset(:,:,:,1) are picked up.
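
As an aside, the same examples matrix can be built without a loop, by reshaping the dataset so that each volume becomes a row and then keeping only the in-mask columns; this is just a vectorized version of the loop above:

    tmp = reshape(dataset,[],10)';            % 10 x 64, one row per volume
    examplesNoLoop = tmp(:,meta.indicesIn3D); % keep only in-mask voxels
    isequal(examples,examplesNoLoop)          % should be 1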

Now let's go over the other fields in meta, which are used to relate a voxel's position in 3D to the corresponding column number in the examples matrix

    meta.colToCoord - #voxels x 3 matrix

[vector] = meta.colToCoord(i,:) places the 3D coordinates of voxel i into [vector]

    meta.coordToCol - dimx x dimy x dimz matrix (here, 4 x 4 x 4)

i = meta.coordToCol(x,y,z) gives the column (i) of the voxel with 3D coordinates x, y and z
If i = 0 then the voxel is not in the data matrix.

To extract from examples the vector of 10 values (one per time point) taken by the voxel with coordinates (2,2,2), you can use

    voxel = meta.coordToCol(2,2,2);
    values = examples(:,voxel)

Finally, the two remaining fields are meant to speed up the process of determining which voxels are immediately adjacent to a given voxel (and thus in its "searchlight" of radius 1). It's possible to create meta structures for larger neighbourhoods (with more than just the immediately adjacent voxels), but that won't be necessary today.

    meta.numberOfNeighbours - #voxels x 1

meta.numberOfNeighbours(i) gives the number of non-zero neighbours voxel i has. (Non-zero voxels are those included in the brain mask.)

    meta.voxelsToNeighbours - #voxels x [(2*radius+1)^3 - 1]

meta.voxelsToNeighbours(i,1:meta.numberOfNeighbours(i)) gives the column indices of the non-zero neighbours of voxel i. Since radius = 1, the searchlight is a cube of 3 x 3 x 3 = 27 voxels, including the voxel i. Therefore, it will have 26 possible neighbours. However, if voxel i is on the edge of the brain (or selected region), some of its neighbours will not be included in the mask and will have index 0.

If you feel comfortable with this, skip to the next section; otherwise try the following exercise.

  1. How many neighbours does voxel (2,2,2) have?
  2. Which voxels are neighbours of voxel (2,2,2)? (Give their column numbers in examples.)
Solution:
voxel  = meta.coordToCol(2,2,2);
number = meta.numberOfNeighbours(voxel);
neighbours = meta.voxelsToNeighbours(voxel,1:number)
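
If you want to see where those neighbours sit in 3D, you can map their column indices back to coordinates with colToCoord:

    meta.colToCoord(neighbours,:)   % one (x,y,z) row per neighbour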


The mock McGurk effect dataset


The searchlight analyses in this lab will be carried out on synthetic data where the ground truth is known. In this way you will know what structure you would like to find in the data and gain a better appreciation of what the tools can and cannot do for you. The following slide describes the task in this experiment. (MC is the McGurk effect.)

The mock brain in this dataset is a single 2D slice, with 4 regions of interest:

  1. an area that is always activated whenever the subject is performing a task
  2. a visual area
  3. an auditory area
  4. a "perception" area

The visual/auditory/"perception" areas contain different patterns of activation, rather than a uniform blob. The experiment has 4 conditions, BA/DA/GA and MC (McGurk), and each of these will give rise to different patterns of activation in the three areas. Note that we created the dataset in such a way that the following hypothesized structure is true: the McGurk condition shares the pattern over one area (visual) with BA, over another (auditory) with DA and over another (perception) with GA.

The dataset is created by generating one template brain image per condition -- you will be given these below -- and then adding spatially correlated noise to each template to generate multiple examples of that condition.
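
To give you a rough idea of how spatially correlated noise can be produced -- this is a sketch of the general technique, not necessarily the exact recipe used for this dataset -- one can smooth white noise with a small averaging kernel:

    % illustrative only: a hypothetical 3x3 activation patch plus smoothed noise
    template = zeros(32,32); template(5:7,5:7) = 3;
    kernel   = ones(3,3)/9;                       % small averaging kernel
    noise    = conv2(randn(32,32),kernel,'same'); % spatially correlated noise
    example  = template + noise;
    clf; imagesc(example); axis square;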

Each example is a 2D image but it can be turned into a 1D vector. This slide shows one example from class BA being transformed into a vector, which ends up as part of an array of examples (10 examples per condition).

In the following section, you will be training searchlight classifiers on every small patch of "cortex" to distinguish all 4 conditions.

Load the mock dataset in this MAT file

    load('lab3.mat')

and do the following exercise:

  1. How many examples, voxels and conditions are there? (Hint: look at size(examples) and unique(labels).)
  2. How many examples of each condition are there within each group in labelsGroup?
  3. What values does meta.numberOfNeighbours take, and why?

Solution:
160 examples, 1024 voxels, 4 conditions (using size(examples) and unique(labels))

unique(labelsGroup)  % 2 groups

% number of examples per condition within each group
for ig = 1:2
  for ic = 1:4
    length(find(labels == ic & labelsGroup == ig))
  end
end

unique(meta.numberOfNeighbours)

3, 5 or 8, depending on whether a voxel is in a corner of the slice, on an edge, or surrounded by voxels on all sides

Finally, we will place examples back into the imaging volume and plot them, so that you also have a means of plotting information maps later. Try the following code to compute the average example in each condition and plot it.

% compute the mean example in each condition
for i = 1:4
  indices = find(labels == i); % find examples with that condition
  meanExample{i} = mean(examples(indices,:),1); % average them
end

% place it in a 3D volume
volume = repmat(NaN,meta.dimensions);
for i = 1:4
  volume(meta.indicesIn3D) = meanExample{i};
  meanVolume{i} = volume;
end

% plot
clf; nrows = 1; ncols = 4;
idx = 1;
for i = 1:4
  subplot(nrows,ncols,idx);
  imagesc(meanVolume{i},[-5 5]);
  axis square;
  xlabel(sprintf('%s\n',meta.conditions{i}));
  idx = idx + 1;
end


Data Analysis


Now that you are acquainted with the synthetic data and are comfortable visualizing it, we can proceed to analysing it. The goal of this section is to give you a sense of the information that GLM, a whole-brain classifier and a searchlight accuracy map can extract from the same dataset.

GLM

In this dataset we don't have time series but just one image for each trial in a condition (think of it as if we had averaged all the images in a trial into a single one). Hence the closest approximation to a GLM is a linear regression of each voxel on binary regressors indicating what condition each example belongs to (the regressors are already included in the mat file loaded earlier).
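
The regressors are already provided, but if you ever needed to build such a design matrix yourself, it is just an indicator matrix with one column per condition; a minimal sketch (myregressors is a hypothetical name, to avoid clobbering the provided variable):

  % binary indicator regressors: one column per condition
  nexamples = length(labels);
  myregressors = zeros(nexamples,4);
  for ic = 1:4
    myregressors(labels == ic,ic) = 1;
  end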

The following code uses the MATLAB function "regress" to regress the values of each voxel across examples on the "regressors" matrix; it yields "betas", a 4 x 1024 matrix of regression coefficients, one row per condition.

  betas = zeros(4,1024);
  for v = 1:1024
    betas(:,v) = regress(examples(:,v),regressors);
  end

Whole-brain classifier

A 4-way classifier -- to distinguish our 4 conditions -- can be constructed out of 4 one-condition-versus-the-rest classifiers. The following code trains 4 one-condition-versus-the-rest linear SVM classifiers on the entire dataset and stores the weights assigned to each voxel in "weights", a 4 x 1024 matrix with one row per condition.

  weights = zeros(4,1024);
  for ic = 1:4
    mask = (labels ~= ic);
    labelsHere = mask + 1; % 1 if condition ic, 2 otherwise
    % train a linear SVM on condition ic versus the rest
    [model{ic}] = classifierLIBSVMwrapper(examples,labelsHere,'kernel','linear');
    weights(ic,:) = model{ic}.Wfeatures(:,1);
  end
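
Although we only use the weights for visualization here, it is worth seeing how the 4 one-versus-rest classifiers combine into a single 4-way decision: each example is assigned the condition whose classifier scores it highest. A rough sketch using the weights alone (this ignores the bias terms and the wrapper's label-sign convention, so treat it as illustrative rather than exact):

  % score each example with each one-vs-rest classifier and take the best;
  % depending on the wrapper's sign convention you may need to negate scores
  scores = examples * weights';           % #examples x 4
  [discard,predicted] = max(scores,[],2);
  mean(predicted == labels(:))            % training accuracy (optimistic!)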

Putting them together

We will now create a 3 x 4 plot where the first row has the average activation pattern for each condition, row 2 has the GLM betas for each condition, and row 3 has the linear SVM weights.

volume = repmat(NaN,meta.dimensions);

clf; nrows = 3; ncols = 4;
idx = 1;
for i = 1:4
  volume(meta.indicesIn3D) = meanExample{i};
  subplot(nrows,ncols,idx); imagesc(volume,[-5 5]); axis square;
  title(sprintf('activation %s',meta.conditions{i}));
  idx = idx + 1;
end
for i = 1:4
  volume(meta.indicesIn3D) = betas(i,:);
  subplot(nrows,ncols,idx); imagesc(volume,[-5 5]); axis square;
  title(sprintf('GLM betas %s',meta.conditions{i}));
  idx = idx + 1;
end
for i = 1:4
  volume(meta.indicesIn3D) = weights(i,:);
  subplot(nrows,ncols,idx); imagesc(volume,[-0.05 0.05]); axis square;
  title(sprintf('SVM weights %s',meta.conditions{i}));
  idx = idx + 1;
end

Note how only the always-on top-left ROI has uniform weights in the GLM map; that ROI disappears in the SVM weight map, as it does not help distinguish any condition from the others. Note also how the weights in one of the 3 remaining ROIs appear subdued in each one-vs-rest SVM weight map: that is the ROI where that condition and MC share the same activation pattern. For MC-vs-rest, all 3 ROIs are used.

Searchmight

Information mapping with an accuracy map is different from the GLM or a whole-brain classifier, in that we will be considering the accuracy of a classifier trained on each small voxel searchlight, rather than regression coefficients or classifier weights.

Multi-way accuracy map

The main function in Searchmight is computeInformationMap. The following command outputs an accuracy map and a p-value map for a 4-way discrimination in each 3x3x3 voxel searchlight, using a linear SVM classifier

[accuracyMap,pvalueMap] = computeInformationMap(examples,labels,labelsGroup,'svm_linear',...
                                                'searchlight',meta.voxelsToNeighbours,meta.numberOfNeighbours);

The Searchmight documentation describes all the arguments and options that can be provided.

Now let's place accuracyMap back into the imaging volume and plot it

clf;
volume = repmat(NaN,meta.dimensions);
volume(meta.indicesIn3D) = accuracyMap;
imagesc(volume,[0 1]); axis square; colorbar('vert')

Note how the accuracy is never higher than ~0.75; this is because there are always at least two conditions with the same patterns of activation in any searchlight (which two depends on the ROI). Note also how the high accuracy spots are larger (5x5) than the activation patches (3x3) in each ROI; this is because the searchlights for voxels surrounding a patch still contain voxels that are inside the patch.

Now let's see how we can make use of the p-value map. The call to computeInformationMap outputs both an accuracy map and a corresponding p-value map; we can threshold the latter to determine which locations have significant accuracy under the null hypothesis that the classifier is performing at chance level (the p-value is the probability that the accuracy could have been equal to or greater than the one observed, if the classifier were indeed performing at chance).

% plot the map of voxels where p-values are less than 0.01
significantMap = (pvalueMap<=0.01);
volume = repmat(NaN,meta.dimensions);
volume(meta.indicesIn3D) = significantMap;
clf; imagesc(volume,[0 1]); axis square;

% and now do it using the False Discovery Rate
[discard,thresholdN] = computeFDR(pvalueMap,0.01);
significantMap = (pvalueMap<=thresholdN);
volume(meta.indicesIn3D) = significantMap;
clf; imagesc(volume,[0 1]); axis square;

Searchmight has a function (computeFDR) to threshold maps using the False Discovery Rate (FDR) with a given q. The interpretation of a p-value map thresholded this way is that, in expectation, at most a fraction q of the voxels flagged as significant are false discoveries.
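
In case you are curious about what such a procedure does, here is a minimal sketch of the classic Benjamini-Hochberg FDR threshold (Searchmight's computeFDR may differ in its details):

% Benjamini-Hochberg: find the largest p-value threshold such that, in
% expectation, at most a fraction q of the flagged voxels are false discoveries
q = 0.01;
sorted = sort(pvalueMap(:));
m = length(sorted);
below = find(sorted <= (1:m)'*q/m);
if isempty(below)
  threshold = 0;
else
  threshold = sorted(max(below));
end
mySignificantMap = (pvalueMap <= threshold);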

Pairwise accuracy map

Searchmight can also be used to generate pairwise accuracy maps. The main idea is to consider each pair of classes -- 6 in total, in this case -- and compute an accuracy map for each pairwise distinction. This provides a finer-grained view of what can be distinguished where. The computeInformationMap function can also be used for this purpose, with the 'computePairmaps' argument

[pairMaps,pvalueMaps,optionalReturns] = computeInformationMap(examples,labels,labelsGroup,'svm_linear', ...
                                                'searchlight',meta.voxelsToNeighbours,meta.numberOfNeighbours,'computePairmaps');

If you do a "whos", you'll see that pairMaps contains 6 rows, rather than one, and so does pvalueMaps. But what can we do with this? We'll start with a coarse thresholding of pvalueMaps.

significantMaps = (pvalueMaps <= 0.001);
clf; imagesc(significantMaps);

You can look at each row as a map of significantly accurate locations for each of the 6 pairwise distinctions, with as many columns as voxels. You can also look at each column as a profile of the searchlight centered around a voxel, indicating which pairwise distinctions it can or cannot support; note that there aren't that many different profiles. You can turn a 6-number "profile" into a 4x4 matrix using the MATLAB function squareform, and then plot it, e.g. for the searchlight in voxel 429

matrix = squareform(significantMaps(:,429));
clf; imagesc(matrix), axis square;
set(gca,'YTick',[1 2 3 4]); set(gca,'YTickLabel',{'BA','DA','GA','MC'});
set(gca,'XTick',[1 2 3 4]); set(gca,'XTickLabel',{'BA','DA','GA','MC'});

Entry (i,j) in this matrix is 1 if conditions i and j can be distinguished in this searchlight, and 0 otherwise. We can look at this systematically by considering all the profiles, sorted by how often they appear.

We will start by finding all the different profiles in the dataset and counting them; these are all the unique columns in matrix significantMaps.

% get all the different columns there are

uniqueProfiles = unique(significantMaps','rows'); % note that it is transposed
[npairs,nvoxels] = size(significantMaps);

% find all unique profiles and count how many voxels have each

nprofiles = size(uniqueProfiles,1);
profileCount = zeros(1,nprofiles);
for ip = 1:nprofiles
   % find profile
   profile = uniqueProfiles(ip,:);

   % count by replicating it to all voxels and finding voxels where it matches
   tmp = repmat(profile',[1 nvoxels]);
   profileCount(ip) = sum(sum(significantMaps == tmp,1) == npairs);
end
And now we can plot these and see, at a glance, all the different local relationships between patterns of activation across conditions:

clf; nrows = 3; ncols = 5;

% sort profiles by count
[discard,order] = sort(profileCount,'descend');

for ip = 1:size(uniqueProfiles,1)
  subplot(nrows,ncols,ip);

  % turn the profile into a 4 x 4 matrix
  profile = order(ip);
  matrix = squareform(uniqueProfiles(profile,:));

  imagesc(matrix,[0 1]); axis square;
  set(gca,'XTick',[]); set(gca,'YTick',[]);
  xlabel(sprintf('%d',profileCount(profile)));
end

Note the first four, most frequent profiles (the counts are under them): one corresponds to searchlights where nothing can be reliably distinguished; the other three correspond to searchlights where every pair of conditions is distinguishable except MC versus one of the other three (which one depends on the ROI the searchlight is in).
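
As a final visualization, you can map the profiles back onto the brain, colouring each voxel by which profile its searchlight has (1 being the most frequent); a sketch using the variables computed above:

% label each voxel with the rank of its profile and plot
profileIndex = zeros(1,nvoxels);
for ip = 1:size(uniqueProfiles,1)
  tmp = repmat(uniqueProfiles(order(ip),:)',[1 nvoxels]);
  profileIndex(sum(significantMaps == tmp,1) == npairs) = ip;
end
volume = repmat(NaN,meta.dimensions);
volume(meta.indicesIn3D) = profileIndex;
clf; imagesc(volume); axis square; colorbar('vert');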