Exercises rating:

★☆☆ - You should be able to solve it based on Python knowledge plus the text.

★★☆ - You will need to do extra thinking and some extra reading/searching.

★★★ - The answer is difficult to find by a simple search and requires a considerable amount of extra work on your own (feel free to ignore these exercises if you're short on time).

In [ ]:

```
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('seaborn-talk')
from sklearn.cluster import AgglomerativeClustering, MiniBatchKMeans
from sklearn.datasets import load_digits
from sklearn.metrics import v_measure_score
from sklearn.manifold import TSNE
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
digits = load_digits()
```

Make a pipeline that joins PCA and k-means into a single model. Does the v-measure improve after the linear preprocessing?
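As a starting point, the pipeline could be sketched as below. This is only one possible setup, not the reference solution: `n_components=30`, `n_init=3` and `random_state=0` are arbitrary illustrative choices, and the names `pca_kmeans`, `v_raw` and `v_pca` are made up for this sketch.

```python
from sklearn.cluster import MiniBatchKMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.metrics import v_measure_score
from sklearn.pipeline import make_pipeline

digits = load_digits()

# Baseline: k-means on the raw 64-pixel features.
kmeans = MiniBatchKMeans(n_clusters=10, n_init=3, random_state=0)
raw_labels = kmeans.fit_predict(digits.data)

# PCA followed by k-means in a single model.
# n_components=30 is an assumption for illustration; try other values.
pca_kmeans = make_pipeline(
    PCA(n_components=30, random_state=0),
    MiniBatchKMeans(n_clusters=10, n_init=3, random_state=0),
)
pca_labels = pca_kmeans.fit_predict(digits.data)

# Compare both clusterings against the true digit labels.
v_raw = v_measure_score(digits.target, raw_labels)
v_pca = v_measure_score(digits.target, pca_labels)
print(f"v-measure raw: {v_raw:.3f}, with PCA: {v_pca:.3f}")
```

Varying `n_components` is a quick way to see how much of the result depends on the linear projection.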

In [ ]:

```
```

Now use t-SNE as the preprocessing step. Does the v-measure improve after the non-linear preprocessing?

Note that the t-SNE implementation of `sklearn` is incomplete. It does not have a plain `transform` method and is not applicable beyond the data for which it is `fit`. This is not a problem for us, who are only exploring the non-linearity of the digits dataset. Instead of using plain `TSNE` in your pipeline, use the class defined below (remember to execute this cell).

In [ ]:

```
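class PipeTSNE(TSNE):
    # Pipelines require a `transform` method on intermediate steps;
    # here it simply re-fits t-SNE on whatever data it is given.
    def transform(self, X):
        return self.fit_transform(X)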
```
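A minimal sketch of how such a class might be used in a pipeline with k-means follows. The class is repeated so that the cell runs on its own. To keep the run short, the sketch works on the first 300 digits only; that subset size, `random_state=0` and the name `tsne_kmeans` are assumptions made for illustration, not part of the exercise.

```python
from sklearn.cluster import MiniBatchKMeans
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.metrics import v_measure_score
from sklearn.pipeline import make_pipeline


class PipeTSNE(TSNE):
    # `transform` just re-fits t-SNE, so it is only meaningful
    # on the same data the pipeline was fitted on.
    def transform(self, X):
        return self.fit_transform(X)


digits = load_digits()
# Small subset (an assumption for speed); use the full data in the exercise.
X_small, y_small = digits.data[:300], digits.target[:300]

tsne_kmeans = make_pipeline(
    PipeTSNE(n_components=2, random_state=0),
    MiniBatchKMeans(n_clusters=10, n_init=3, random_state=0),
)
# fit_predict embeds the data with t-SNE, then clusters the embedding.
tsne_labels = tsne_kmeans.fit_predict(X_small)

v_tsne = v_measure_score(y_small, tsne_labels)
print(f"v-measure with t-SNE: {v_tsne:.3f}")
```

Because t-SNE is stochastic, the score can change with the seed and with the subset, so a single run should not be over-interpreted.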

In [ ]:

```
```

Use `linkage='ward'` for the time being.

In [ ]:

```
```

Remember to use the `PipeTSNE` defined above. Keep `linkage='ward'` in this exercise.

In [ ]:

```
```

Remember to use the `PipeTSNE` defined above. Now it is time to use `linkage='single'` in the agglomerative clustering. Does single linkage perform better on the non-linearly preprocessed dataset than it did on the raw data of the digits dataset?
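One way to set up this comparison is sketched below, running the same t-SNE plus agglomerative clustering pipeline once with Ward linkage and once with single linkage. It is a hedged sketch rather than the expected answer: the 300-sample subset, `random_state=0`, `n_clusters=10` and the `scores` dictionary are illustrative assumptions.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.metrics import v_measure_score
from sklearn.pipeline import make_pipeline


class PipeTSNE(TSNE):
    # Same idea as the class defined earlier: transform re-fits t-SNE.
    def transform(self, X):
        return self.fit_transform(X)


digits = load_digits()
# Small subset (an assumption for speed); use the full data in the exercise.
X_small, y_small = digits.data[:300], digits.target[:300]

scores = {}
for linkage in ("ward", "single"):
    model = make_pipeline(
        PipeTSNE(n_components=2, random_state=0),
        AgglomerativeClustering(n_clusters=10, linkage=linkage),
    )
    # Embed with t-SNE, then cluster the 2D embedding.
    labels = model.fit_predict(X_small)
    scores[linkage] = v_measure_score(y_small, labels)

for linkage, score in scores.items():
    print(f"linkage={linkage}: v-measure {score:.3f}")
```

Running the same loop with `AgglomerativeClustering` alone on the raw features gives the reference numbers needed to answer the question.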

In [ ]:

```
```