Stein estimation

Fondazione Bruno Kessler - Technologies of Vision

contains material from
Template Matching Techniques in Computer Vision: Theory and Practice
Roberto Brunelli © 2009 John Wiley & Sons, Ltd


3.3 Stein estimation

As discussed at length in Chapter TM:3, an accurate estimate of probability distributions is key to successful hypothesis testing in general and template matching in particular. Even in the simple case of normally distributed patterns, estimating the correct parameters of the probability distribution from experimentally available data poses some challenges. The key quantity to be estimated in this case is the covariance matrix, which, together with the mean, completely characterizes the distribution.

Codelet 4 On covariance estimation errors (R/tm.covarianceImpact.R)
____________________________________________________________________________________________________________

We want to visualize the impact of errors in the estimation of the covariance matrix on P_D, the detection probability, at different operating conditions P_F, as is typical in the Neyman-Pearson paradigm. We refer to a one-dimensional case, assuming that the distance between the class means is 1: the difficulty of the problem is changed by changing the (common) standard deviation σ describing the two distributions. As is easily checked from the results of Section TM:3.3, we have that σ₀ = σ⁻¹. We want to compute the impact on P_D(σ₀) given that we set the operating condition using P_F(σ′), where σ′ is our estimate of σ₀. We first need to define a few functions, corresponding to Equation TM:3.15,

tm.Q <- function(x, sd = 1.0) {
  1 - pnorm(x/sd)
}

to Equation TM:3.57,

tm.Pf <- function(nu, sd0 = 1.0) {
  tm.Q(nu / sd0 + sd0/2)
}

and to Equation TM:3.58,

tm.Pd <- function(nu, sd0 = 1.0) {
  tm.Q(nu / sd0 - sd0/2)
}
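As a quick sanity check of the three definitions, the sketch below tabulates P_F and P_D at a few thresholds. The threshold grid and the name roc are illustrative choices, not part of the original codelet; the functions are repeated only so that the snippet runs on its own.

```r
# Definitions repeated from the codelet so the snippet is self-contained.
tm.Q  <- function(x, sd = 1.0) 1 - pnorm(x / sd)
tm.Pf <- function(nu, sd0 = 1.0) tm.Q(nu / sd0 + sd0 / 2)
tm.Pd <- function(nu, sd0 = 1.0) tm.Q(nu / sd0 - sd0 / 2)

sd0 <- 3                       # same value as the test case chosen below
nu  <- seq(-6, 6, by = 3)      # illustrative thresholds (arbitrary grid)

# One row per threshold: false alarm and detection probabilities.
roc <- data.frame(nu = nu, Pf = tm.Pf(nu, sd0), Pd = tm.Pd(nu, sd0))
print(roc)
```

Raising the threshold ν lowers both probabilities, and at ν = 0 the two error types balance, so that P_F + P_D = 1.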

By choosing σ = 1∕3, from which σ₀ = 3, we get a reasonable testing case:

tm.covarianceImpact <- function(sigma0 = 3,

We then consider P_F ∈ [0.01, 0.3]

                                pfRange = c(0.01, 0.3, 0.01),

and a moderately large range for σ′ = ασ₀, with α ∈ [0.8, 1.2]:

                                pcRange = c(0.8, 1.2, 0.025)) {

We generate the sampling sequences:

  ats <- seq(-3*sigma0, 3*sigma0, 0.1)
  pfs <- do.call(seq, as.list(pfRange))
  pcs <- do.call(seq, as.list(pcRange)) * sigma0

and determine their lengths

  npfs <- length(pfs)
  npcs <- length(pcs)

from which we appropriately size the map:

  ci <- array(0, dim=c(npfs, npcs))

We now hypothesize several estimated values σ′,

  for(c in 1:npcs) {

and, for each of them, we build the function nu, that is ν = ν(x) such that P_F(ν) = x:

    nu <- splinefun(tm.Pf(ats, sd0=pcs[c]), ats)
    for(f in 1:npfs) {

We can now compute, for a selected subset of operating conditions P_F(σ′) based on our estimated standard deviation, the relative difference between the miss probability 1 − P_D(σ′) predicted from the estimate and the correct one, 1 − P_D(σ₀):

      ci[f,c] <- (1-tm.Pd(nu(pfs[f]), pcs[c])) /(1-tm.Pd(nu(pfs[f]), sigma0))-1
    }
  }
  list(pfs, pcs, ci)
}

______________________________________________________________________________________
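The codelet inverts P_F numerically with splinefun. Because P_F is a simple function of the normal distribution function, the inversion can also be cross-checked in closed form: from P_F = 1 − Φ(ν/σ₀ + σ₀/2) it follows that ν = σ₀ (Φ⁻¹(1 − P_F) − σ₀/2). The sketch below compares the two; the names nu.exact, nu.spline and max.abs.diff are hypothetical and introduced only for this check.

```r
# Definitions repeated from the codelet so the snippet is self-contained.
tm.Q  <- function(x, sd = 1.0) 1 - pnorm(x / sd)
tm.Pf <- function(nu, sd0 = 1.0) tm.Q(nu / sd0 + sd0 / 2)

sd0 <- 3
ats <- seq(-3 * sd0, 3 * sd0, 0.1)

# Numerical inverse, built exactly as in the codelet.
nu.spline <- splinefun(tm.Pf(ats, sd0 = sd0), ats)

# Closed-form inverse (hypothetical helper, not in the original code).
nu.exact <- function(pf, sd0) sd0 * (qnorm(1 - pf) - sd0 / 2)

pfs <- seq(0.01, 0.3, 0.01)
max.abs.diff <- max(abs(nu.spline(pfs) - nu.exact(pfs, sd0)))
print(max.abs.diff)
```

The spline version generalizes to cases where no closed form is available, which may be why the codelet uses it; for this Gaussian case either route should give the same thresholds up to interpolation error.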

source("R/tm.covarianceImpact.R")
sigma0 <- 3
ci     <- tm.covarianceImpact(sigma0)
tm.dev("figures/covarianceImpact")
persp(ci[[1]], ci[[2]]/sigma0, ci[[3]], theta = 150, phi = 10,
      shade = 0.9, expand = 0.75, r = 3, lwd=0.1,
      ticktype="detailed",cex=0.5, tcl=-0.5,
      xlab="false alarm rate", ylab="relative std. dev.",
      main="Relative change of miss probabilities", zlab="")
dev.off()

Figure 3.3:

Small errors in the estimation of the parameters of the probability distributions can have a significant impact on classification performance. The plot shows how an error in the estimation of the standard deviation of normally distributed data results in an amplified detection error.
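To read single values off the surface, the quantity plotted can be evaluated at one operating point. The sketch below does this at P_F = 0.1 for a few values of α, using the closed-form threshold instead of the spline; the helpers nu.at and rel.miss.change, and the chosen operating point, are assumptions made for illustration.

```r
# Definitions repeated from the codelet so the snippet is self-contained.
tm.Q  <- function(x, sd = 1.0) 1 - pnorm(x / sd)
tm.Pd <- function(nu, sd0 = 1.0) tm.Q(nu / sd0 - sd0 / 2)

# Threshold giving false alarm rate pf under standard deviation sd
# (closed-form inverse of tm.Pf; hypothetical helper).
nu.at <- function(pf, sd) sd * (qnorm(1 - pf) - sd / 2)

# Relative change of the miss probability, as in the codelet:
# predicted miss rate under the estimate over the true miss rate, minus 1.
rel.miss.change <- function(pf, alpha, sigma0 = 3) {
  s  <- alpha * sigma0            # estimated standard deviation
  nu <- nu.at(pf, s)              # threshold set from the estimate
  (1 - tm.Pd(nu, s)) / (1 - tm.Pd(nu, sigma0)) - 1
}

alphas <- c(0.8, 0.9, 1.0, 1.1, 1.2)
change <- sapply(alphas, function(a) rel.miss.change(0.1, a))
print(cbind(alpha = alphas, change = change))
```

At α = 1 the change is zero by construction. At this operating point an underestimate (α < 1) gives a positive value and an overestimate (α > 1) a negative one.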

# load support for shrinkage covariance estimation
#
require(corpcor)
#
source("R/tm.shrinkageAdvantage.R")
# the number of samples
ns <- c(4,8,16,32,64,128,256,512)
# the sample space dimension
ps <- c(4,8,16,32,64,128,256,512)
#
shr <- tm.shrinkageAdvantage(ps, ns, ne = 10, ss = 10)
#
tm.dev("figures/shrinkageAdvantage")
#
persp(ns, ps, log(shr[[1]]/shr[[2]]),
      theta = 50,  phi = -10, shade = 0.5,
      ticktype="detailed", xlab="n", ylab="p",
      zlab="log(error Frobenius norm)", d=2)
#
dev.off()

Figure 3.4:

Shrinkage covariance estimation, based on the James-Stein insight, significantly outperforms the ordinary maximum likelihood estimator. The advantage grows with the pattern space dimensionality p and shrinks as the number of samples n increases.
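The idea behind the figure can be illustrated in base R with a deliberately simplified estimator. The sketch below is not the estimator used by tm.shrinkageAdvantage or by corpcor: it shrinks the sample covariance toward a scaled identity with a fixed, hypothetical intensity lambda = 0.5, whereas corpcor estimates its shrinkage intensities from the data. The true covariance is taken to be the identity, which is the most favorable case for this target, and cov() uses the n − 1 denominator rather than the maximum likelihood n.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
p <- 32                            # pattern space dimension
n <- 8                             # number of samples (fewer than p)
x <- matrix(rnorm(n * p), n, p)    # rows are samples, true covariance is I

S      <- cov(x)                   # sample covariance (n - 1 denominator)
lambda <- 0.5                      # fixed shrinkage intensity (assumption)
target <- diag(mean(diag(S)), p)   # scaled identity target
S.shr  <- (1 - lambda) * S + lambda * target

frob <- function(a) sqrt(sum(a^2)) # Frobenius norm
err.sample <- frob(S - diag(p))
err.shrunk <- frob(S.shr - diag(p))
print(c(sample = err.sample, shrunk = err.shrunk))

# With n < p the sample covariance is singular; the shrunk one is not.
min.eig.shrunk <- min(eigen(S.shr, symmetric = TRUE)$values)
```

In this setting the shrunk estimate is closer to the truth in Frobenius norm and, unlike the sample covariance, it is invertible even though n < p.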
