Selective Tuning with Fixation Control


The elegant aspect of this framework is that it cleanly shows the extent of Neisser's (1967), and later Wolfe's (2000), pre-attentive, attentive and post-attentive vision stages; Posner's (1980) orienting, alerting and search functions of attention; and the classic saliency map/master map of locations of Koch & Ullman (1985) and Treisman & Gelade (1980).



References


  1. Bruce, N., Tsotsos, J.K. (2005). Saliency Based on Information Maximization, Proc. NIPS 2005, Vancouver, BC.

  2. Bruce, N., Tsotsos, J.K. (2008). Spatiotemporal Saliency: Towards a Hierarchical Representation of Visual Saliency, 5th Int. Workshop on Attention in Cognitive Systems, Santorini, Greece, May 12, 2008.

  3. Bruce, N.D.B. (2008). Saliency, Attention and Visual Search: An Information Theoretic Approach, PhD Thesis, Dept. of Computer Science and Engineering, York University, Canada, July 2008.

  4. Itti, L., Koch, C., Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis, IEEE Trans. PAMI, Vol. 20, pp. 1254–1259.

  5. Koch, C., Ullman, S. (1985). Shifts in selective visual attention: towards the underlying neural circuitry, Hum. Neurobiol. 4, 219–227.

  6. Posner, M.I. (1980). Orienting of Attention, Quarterly Journal of Experimental Psychology 32(1), 3–25.

  7. Treisman, A., Gelade, G. (1980). A feature integration theory of attention, Cognitive Psychology 12, 97–136.

  8. Neisser, U. (1967). Cognitive Psychology, Appleton-Century-Crofts, New York.

  9. Wolfe, J., Klempen, N., Dahlen, K. (2000). Postattentive Vision, J. Experimental Psychology: Human Perception and Performance, Vol. 26, No. 2, 693–716.

  10. Zaharescu, A. (2004). A Neurally-based Model of Active Visual Search, MSc Thesis, Dept. of Computer Science and Engineering, York University, Canada, July 2004.

  11. Zaharescu, A., Rothenstein, A., Tsotsos, J.K. (2004). Towards a Biologically Plausible Active Visual Search Model, Proc. ECCV WAPCV2004, Prague, May 15, 2004.

On the previous pages, ST has been demonstrated on images without eye movements. Bruce (2008) and Zaharescu (2004) provide a foundation for extensions that include eye movements. Almost all models that deal with fixations follow Koch & Ullman (1985) or Itti et al. (1998): they employ a single saliency map that encodes the visual conspicuity of each image location. Although these works have proved useful and popular, newer efforts show better agreement with actual human eye movement patterns (Bruce & Tsotsos 2005; 2008). This scheme has a sound foundation in information theory and proposes that fixation is directed to the image locations of maximum information value. Further, it uses a hierarchical decomposition of visual processing, as is present in biological systems and in ST.
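To make the information-maximization idea concrete, here is a toy sketch (in Python, with illustrative names and parameters): each pixel's saliency is taken to be the self-information of its quantized intensity, and the next fixation target is simply the most informative location. The actual AIM model (Bruce & Tsotsos 2005) estimates these likelihoods over learned local features rather than raw intensities; this sketch only illustrates the principle.

    import numpy as np

    def self_information_saliency(image, bins=32):
        # Toy saliency map in the information-maximization spirit:
        # rarer intensity values carry more self-information (-log p)
        # and are therefore more salient.  A stand-in for the learned
        # feature likelihoods used by Bruce & Tsotsos (2005).
        img = np.asarray(image, dtype=float)
        levels = np.clip(
            (img - img.min()) / (np.ptp(img) + 1e-12) * (bins - 1),
            0, bins - 1).astype(int)
        p = np.bincount(levels.ravel(), minlength=bins) / levels.size
        return -np.log(p[levels] + 1e-12)

    def next_fixation(saliency):
        # The next fixation target is the most informative location.
        return np.unravel_index(np.argmax(saliency), saliency.shape)

    # Example: a uniform image with one rare bright spot attracts fixation.
    img = np.zeros((64, 64))
    img[40, 20] = 1.0
    print(next_fixation(self_information_saliency(img)))   # (40, 20)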


However, it does not generate eye movement commands, nor does it determine when to move (and when not to), nor maintain a history of fixations. Zaharescu (2004) and Zaharescu et al. (2004) developed such a system; it directs a robotic camera in an active visual search task and performs well in comparison with primate subjects on the same experiment. Interestingly, this addition leads to a novel explanation of how the common conception of a saliency map fits within the overall ST framework.
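As a rough sketch of that additional machinery, and not of the Zaharescu et al. implementation itself, the following controller keeps a history of fixations, suppresses already-visited locations (a simple form of inhibition of return), and issues a movement command only when some unvisited location is salient enough. The class name, threshold and suppression radius are illustrative assumptions.

    import numpy as np

    class FixationController:
        # Minimal fixation-control loop: remember past fixations,
        # suppress them (inhibition of return), and decide whether a
        # saccade / camera movement is worthwhile at all.
        def __init__(self, move_threshold=0.5, suppress_radius=5):
            self.move_threshold = move_threshold    # illustrative value
            self.suppress_radius = suppress_radius  # illustrative value
            self.history = []                       # past (row, col) fixations

        def _inhibit(self, saliency):
            # Zero out saliency in a disc around each previous fixation.
            sal = saliency.copy()
            rows, cols = np.indices(sal.shape)
            for r, c in self.history:
                mask = (rows - r) ** 2 + (cols - c) ** 2 <= self.suppress_radius ** 2
                sal[mask] = 0.0
            return sal

        def step(self, saliency):
            # Return the next fixation (row, col), or None to hold the
            # current fixation because nothing new is worth a movement.
            sal = self._inhibit(saliency)
            target = np.unravel_index(np.argmax(sal), sal.shape)
            if sal[target] < self.move_threshold:
                return None
            self.history.append(target)
            return target   # command: move gaze / camera to this location

Driven by the toy saliency map from the earlier sketch, the first call to step() would return the bright spot, and the second would return None: once that location is suppressed, nothing else exceeds the threshold.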


The animation below shows the major steps of how the ST model is combined with fixation control. The roots of this approach appear in Tsotsos et al. 1995 (section 2.3) and in the Zaharescu work cited above; see those papers for the required background.