Eliciting judgements of value with visual judgement cues

19 August 2013

Derek Thomson

WIRED has an interesting review of an MIT project attempting to elicit judgements of safety from visual cues of the urban environment.  From the paired comparisons, it looks as though they’re taking randomly paired images from Google Maps Street View and asking random participants to judge which is more desirable.  From this, they’re calculating heatmaps of ‘safety’ by crunching the extensive dataset they’ve gathered.  The problem is that they’re not acknowledging the weaknesses of the elicitation prompt itself.
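To make the mechanics concrete, here is a minimal sketch of how paired judgements could be turned into a per-image score.  It is an illustration only, not the MIT team's actual method: it uses a simple win rate (the share of comparisons an image won), and the image names and the `win_rates` helper are hypothetical.

```python
from collections import defaultdict


def win_rates(comparisons):
    """Score each image by the share of its comparisons that it won.

    `comparisons` is an iterable of (preferred_image, other_image) pairs,
    one pair per participant judgement.
    """
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for preferred, other in comparisons:
        wins[preferred] += 1
        appearances[preferred] += 1
        appearances[other] += 1
    # Every image that appeared at least once gets a score between 0 and 1.
    return {image: wins[image] / appearances[image] for image in appearances}


# Hypothetical judgements: the first image in each pair was preferred.
votes = [
    ("street_a", "street_b"),
    ("street_a", "street_c"),
    ("street_c", "street_b"),
]
scores = win_rates(votes)
```

A score like this says nothing about why a participant preferred one image over the other, which is where the weakness of the prompt comes in.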


The elicitation prompt of this work is remarkably similar to “VALiD: Value in Design” – a past study of ours at Loughborough that sought to provide a practical (rather than necessarily perfectly reliable) way of helping construction project stakeholders evaluate architectural options against the criteria that they considered aspects of “value” for their project.  These judgements were structured as evaluations of “benefits” or “sacrifices” – getting to the core of the nature of value: the trade-off between desired and undesired outcomes.  This operationalised expression of such an abstruse concept was one of our main contributions from this work.  You can read the development of this understanding of value here.

When attempting to make this work in workshop settings (rather than in an en masse online tool such as MIT have constructed), the problem we found was one of calibration.  A picture contains a vast amount of rich information, all of which is interpreted by each individual with reference to their unique, tacit construct frames.  We found the data generated useful in the sense that it stimulated the debate among stakeholders from which sensemaking could be structured, but not very reliable due to these biases in stakeholder judgements.  The resulting quantifications were workable, but they were meaningful only to those stakeholders who had participated in the debate through which an agreed interpretation and meaning was assigned to each judged image.

This whole calibration problem seems to be overlooked by the MIT work, which is somewhat odd given its prominence in the literature.  The MIT work does, however, appear to have successfully gained a large number of evaluations.  Perhaps they have been able to account for irrationality through sheer dint of having so much data.  They don’t mention it in the supporting paper, though; they mention only controlling for demographics.