Such applications would benefit from several types of information, such as which cameras provide the best views according to specified quality measures, or where other cameras are positioned relative to a given recording camera. In this work, we perform multimodal analysis of videos recorded by multiple users at a public happening in order to extract infor