Evaluating algorithms and systems is a complex task. This chapter addresses four related questions of practical and methodological importance: what constitutes a good response from a template matching system; how the available data can be used both to train and to evaluate a classification system; how the performance of a classification system can be described compactly yet informatively; and, finally, how multiple classification systems for the same task can be compared to assess the state of the art of a technology.
keywords: ROC analysis, technology evaluation, classifier training, cross-validation, leave-one-out, bootstrap.