In Memoriam: Mark Everingham

Mark Everingham was a brilliant colleague. You may have been aware of him at conferences where he asked penetrating questions that could crystallize a key aspect of a paper. In conversation with him you might have been stunned by a new connection he made between areas of research or to a crucial related work. These questions and observations were a reflection of his very broad knowledge and deep understanding of computer vision and machine learning. To us, they demonstrated his intellect and insight, but to him they were just a way of being helpful, a way of ensuring that the field made progress.
Mark was incredibly generous with his time. Those of us who worked with him are aware of how much he contributed behind the scenes, without expecting any recognition. Nowhere is this more apparent than in the organization of the PASCAL Visual Object Classes (VOC) challenge, to which he devoted colossal amounts of time and effort. There are also the more visible contributions to the community: area chair duties at both CVPR and ECCV, service as a program co-chair for BMVC, and membership of the TPAMI editorial board. Everything he did, whether research, experimentation, software, paper writing, or talks, was of the highest standard and a testament to his intellectual stamina. He was kind and demonstrated a gentle, dry wit that made time spent with Mark both stimulating and enjoyable.
Mark was born in Bristol in 1973. He won a scholarship to Clifton College and completed his A levels at Filton College in 1991. Directly after school he worked on a research project for the Bristol Eye Hospital, developing software for remote electrodiagnosis. He continued his involvement with the project after heading to the University of Manchester to study computer science, where he won prizes for top achievement every year and was duly awarded the BSc with 1st class honors and the Williams-Kilburn medal for exceptional achievement in 1995.
Returning to Bristol, he completed work on the electrodiagnosis project, leading to his first publication, in the journal Electroencephalography and Clinical Neurophysiology in 1996. In 1997 he began his doctoral studies at the University of Bristol, supervised by Barry Thomas and Tom Troscianko, on mobile augmented reality aids for people with severe visual impairments. Presenting an enhanced image to the wearer’s visual system would free users with low vision from the need for external assistance in many tasks. The approach Mark took was, as usual, based on a deep rethinking of the problem. Rather than attempting to enhance the image by emphasizing edges, he proposed to identify the semantic content of an image so that images could be enhanced in a content-driven manner, and to enhance regions rather than edges.
This work led him to look at region segmentation algorithms, and there he discovered the difficulties of evaluating computer vision algorithms, leading to some of the most significant papers from his PhD. In particular, “Evaluating Image Segmentation Algorithms Using the Pareto Front,” presented at ECCV 2002, showed the importance of the choice of evaluation metric in a compelling way, and is notable for its inclusion of the “embarrassingly simple” baseline method of dividing the image into blocks, which sometimes appears on the Pareto front. In presentations of this work, Mark would draw out the humor in this fact, but also use it as a point of reference to illustrate the behavior of metrics, to give insight into the criteria, and ultimately to convince you that you had learned something.
Graduating with his PhD in 2002, Mark moved to Andrew Zisserman’s group at Oxford University’s Department of Engineering Science, where he worked on three projects that explored the level of supervision required for visual classification and detection tasks.
The first aimed to detect and identify actors in relatively low-resolution video footage, such as TV material from the 1970s. It was demonstrated on the situation comedy Fawlty Towers. The method involved quite strong supervision, in which a 3D head and face model was built for each character (from images). These 3D models were then used to render images to train a discriminative tree-structured classifier, which was in turn used as a sliding window detector. This person-specific
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 34, NO. 11, NOVEMBER 2012 2081