ABSTRACT

We carried out a number of subjective experiments for audiovisual, audio-only, and video-only quality assessment. We selected content and encoding parameters at very low bitrates that are typical of mobile applications. Using these data, we explore the influence of video codecs and frame rate as well as audio channels and sampling rate on quality. Finally, the optimal trade-off between bits allocated to audio and video inside a bitstream is investigated.

1. INTRODUCTION

Video quality (VQ) assessment [15, 16] has become rather well established by now, as evidenced by the number of research publications and products available, as well as the collaborative efforts of the Video Quality Experts Group (VQEG) and recent standards for TV [8]. Speech and audio quality (AQ) assessment techniques have an even longer history. Speech and audio quality metrics have been standardized as PESQ [11] and PEAQ [7], respectively.

Audiovisual quality (AVQ), however, is an entirely different matter. There have been a few studies in the past [2, 13], but little work has been done at low bitrates. Therefore, we designed a number of subjective experiments using content, codecs, and bitrates typical of emerging mobile applications. Based on the results of these tests, an analysis of AV coding parameters is presented in this paper. More details on the experiments, the interactions between audio and video quality, and an evaluation of quality metric predictions on the data can be found in [17].

The paper is organized as follows. Section 2 describes the experimental setup in terms of source material, test conditions, and subjective assessment. The influence of video codecs and frame rate on video quality is discussed in Section 3. The effect of the number of audio channels (mono or two-channel stereo) and sampling rate on audio quality is discussed in Section 4.
Finally, the optimal bit budget allocation trade-off between audio and video is investigated in Section 5.

2. EXPERIMENTAL SETUP

2.1. AV Source Clips

The content of the source clips and the range of coding complexity were chosen to be representative of a typical scenario for watching video on a mobile device. The source material comprises 6 short clips of about 8 seconds each. The video and audio content of these scenes is summarized in Table 1. The video source material was originally in TV format; for our tests we de-interlaced and downsampled it to QCIF frame size (176x144). The audio source material was 16-bit PCM stereo sampled at 48 kHz.

2.2. Test Material
[1] ITU-T. Video coding for low bitrate communication, 1996.
[2] John G. Beerends, et al. The Influence of Video Quality on Perceived Audio Quality and Vice Versa, 1999.
[3] Kristofer Kjörling, et al. Spectral Band Replication, a Novel Approach in Audio Coding, 2002.
[4] Sangwook Lee, et al. Comparison of subjective video quality assessment methods for multimedia applications, 2007.
[5] Stefan Winkler, et al. Digital Video Quality: Vision Models and Metrics, 2005.
[6] Steven van de Par, et al. Auditory-visual interaction: from fundamental research in cognitive psychology to (possible) applications. Electronic Imaging, 1999.
[7] Stefan Winkler, et al. Audiovisual quality evaluation of low-bitrate video. IS&T/SPIE Electronic Imaging, 2005.
[8] Stefan Winkler, et al. Video quality evaluation for mobile streaming applications. Visual Communications and Image Processing, 2003.
[9] K. Rijkse, et al. H.263: video coding for low-bit-rate communication. IEEE Commun. Mag., 1996.
[10] ISO/IEC 14496-2: Information Technology - Coding of Audio-visual Objects - Part 2: Visual, 2022.