In order to encode, with a still lower level of accuracy, the facial regions of any non-speakers in the scene when there is a person speaking, a quantizer 254, Q3, is provided for quantizing the image data that lies inside the modelled ellipses representing the facial regions of those non-speakers.
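By way of illustration only, the role of a coarse quantizer such as Q3 can be sketched as follows. The sketch is a minimal one and makes several assumptions that are not taken from the description: the ellipses are axis-aligned and given as (cx, cy, a, b), the quantizer is a uniform one applied directly to 8-bit pixel values, and the function names and the step size are hypothetical. An actual embodiment may instead quantize transform coefficients.

```python
def inside_ellipse(x, y, cx, cy, a, b):
    """Return True if pixel (x, y) lies inside or on the axis-aligned
    ellipse centred at (cx, cy) with semi-axes a (horizontal), b (vertical)."""
    return ((x - cx) / a) ** 2 + ((y - cy) / b) ** 2 <= 1.0


def quantize(value, step):
    """Uniform quantizer: map an 8-bit value to the nearest multiple of
    `step` (ties round up), clamped to the 0..255 range."""
    level = (value + step // 2) // step
    return max(0, min(255, level * step))


def quantize_non_speaker_faces(image, ellipses, step):
    """Return a copy of `image` (a list of rows of 8-bit values) in which
    every pixel inside any of the given ellipses is quantized with `step`.
    Pixels outside all ellipses are left unchanged.

    `ellipses` is a list of (cx, cy, a, b) tuples; this representation is
    an assumption made for the sketch."""
    out = [row[:] for row in image]  # do not modify the input image
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if any(inside_ellipse(x, y, *e) for e in ellipses):
                out[y][x] = quantize(value, step)
    return out


if __name__ == "__main__":
    # Hypothetical 5x5 frame with one small non-speaker ellipse in the centre.
    frame = [[100] * 5 for _ in range(5)]
    coarse_step = 32  # assumed step size for the coarse quantizer
    result = quantize_non_speaker_faces(frame, [(2, 2, 1, 1)], coarse_step)
    for row in result:
        print(row)
```

A larger step yields fewer reconstruction levels, so fewer bits are needed for the non-speaker facial regions, at the cost of the lower accuracy referred to above.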