Report submitted to the Canadian Radio-television and Telecommunications Commission in relation to the Broadcast Notice of Consultation CRTC 2011-102
By Scott Norcross, Michel Lavoie and Louis Thibault
Advanced Audio Systems, Communications Research Centre Canada
CRC Report CRC-RP-2011-001
Ottawa, 18 March 2011
The views expressed in this report are those of the authors. The CRTC was not involved in formulating any of the report’s recommendations nor did they participate in the authors’ considerations of the issues.
Viewer complaints over loud television content, particularly advertisements, have been around since the early days of television. The causes of this problem include the inconsistent use of the available dynamic range of audio programmes, and industry guidelines that were largely limited to preserving signal integrity. Where guidelines did refer to loudness, the lack of an industry-recognized method for measuring loudness made efforts to control it difficult to put into practice.
The ATSC digital television system, for its part, includes a mechanism known as metadata for managing loudness, but unless it is used correctly throughout the broadcast chain, the problem will persist.
In 2006, the ITU-R adopted a proven method to allow broadcasters and content creators to measure the loudness of their programmes. As a consequence, key organizations in the broadcast industry, such as the Advanced Television Systems Committee in North America, the European Broadcasting Union in Europe, the Broadcast Committee of Advertising Practice in the UK and FreeTV Australia, have recently updated their regulations or recommended practices to incorporate the use of the ITU-R meter with the goal of providing consistent loudness levels to the viewer. Supporting these efforts has been the increasing availability of commercial products that measure loudness using the method adopted by the ITU-R.
Throughout this document, a number of key terms describing different attributes of sound will be used. Those are:
A sound signal at a given (rms) level will produce, when played through loudspeakers, an intensity at the ear which in turn will be perceived with a given loudness by the listener. Increasing or decreasing the level will in turn increase or decrease intensity and loudness. The relationship between the three parameters is proportional but not straightforward.
The human ear is a remarkable instrument. The difference between the faintest sound it can detect and the most intense sound it can sustain without being damaged is about 120 dB. This corresponds to an intensity ratio of 1,000,000,000,000:1! While it is beyond the scope of this report to get into any detailed description of the human auditory system, this section provides a brief summary of the main characteristics of the human ear as they relate to loudness.
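As a brief worked check of the ratio quoted above, assuming the 120 dB figure is a ratio of sound intensities (powers): the symbols I_max, I_min, p_max and p_min are introduced here for illustration and do not appear elsewhere in this report.

```latex
\[
10 \log_{10}\!\left(\frac{I_{\max}}{I_{\min}}\right) = 120\ \text{dB}
\quad\Longrightarrow\quad
\frac{I_{\max}}{I_{\min}} = 10^{120/10} = 10^{12}
\]
\[
20 \log_{10}\!\left(\frac{p_{\max}}{p_{\min}}\right) = 120\ \text{dB}
\quad\Longrightarrow\quad
\frac{p_{\max}}{p_{\min}} = 10^{120/20} = 10^{6}
\]
```

Because intensity varies with the square of sound pressure, the same 120 dB range corresponds to a pressure ratio of one million to one.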
It is well established that the sensitivity of the human ear varies with the frequency of a sound. The ear is frequency selective. It is most sensitive in the mid-frequency range 1000 – 5000 Hz, while sounds at low and high frequencies need to be gradually more intense in order to be perceived with the same loudness as sounds at mid-frequencies. CRC studies have shown that the ear can detect loudness differences as small as 0.5 to 1.0 dB for real-life sounds (speech, drama, music).
Factors that influence the perception of loudness include the intensity and frequency content of the sound – whether narrow or spread over a given range of frequencies. Loudness is also a subjective quantity. As such, reliable methods to measure loudness must be based on some modeling of the human auditory system.
The problem of loudness variations between television stations and of excessive loudness changes in television programming has been around for many years. In the United States, inquiries by the FCC date back to as early as the 1960s. Over the past decades, broadcasters have integrated a variety of devices and practices in an attempt to measure and to mitigate loudness variations, with varying levels of success. Yet the problem persists.
The difficulty in measuring loudness is attributable to its subjective nature. A number of loudness measurement algorithms and methods have been developed over the years. The most popular ones include Zwicker’s loudness model (ISO 532-1975), the CBS Loudness Summation Method, Leq(m), Leq(A) and Leq(C). None of these techniques has been standardized or widely accepted for use in the broadcast industry.
Lacking any accepted method for measuring loudness levels, the focus of radio and television broadcast industry guidelines has been to recommend limits on peak or quasi-peak audio signal levels, leaving the broadcasters a measure of freedom to use the available audio dynamic range as they saw fit. While such guidelines are effective in minimizing undesired distortion and interference, they do not address loudness. In such cases where different broadcasters have different dynamic range requirements, loudness variations can occur when the listener switches channels, forcing him/her to manually adjust the volume. Annoying loudness variations can also occur between programmes of a given channel since the existing guidelines and recommended practices do not sufficiently account for loudness.
The audio signals accompanying digital television are characterized by a lower noise floor and wider dynamic range as compared to analog television. Used effectively, these properties can result in improved audio quality and a higher level of listener enjoyment. An inconsistent use of digital television’s wider audio dynamic range, whether unintentional or for perceived competitive advantage, can result in even greater loudness variations than has been the case for analog television.
Dynamic range compression is a process that narrows the difference between high and low audio signal levels.
Devices for controlling audio dynamic range can be found throughout the broadcast chain. In the recording studio, dynamic range compression is used for artistic effect (i.e. to achieve a desired balance between soft and loud programme parts) and to ensure that audio signal levels remain within the dynamic range of the recording medium or transmission channel. At the transmission end, audio compressors are critical as a mechanism to ensure that over-modulation of the signal carrier or sub-carrier does not occur.
Dynamic range compression can also be used to raise the loudness of one signal over that of another. The human hearing mechanism perceives loudness in relation to the power of the audio signal. Dynamic range compressors can be adjusted to raise the average power of a signal while keeping the signal peaks below desired limits. This frees the programme creator to raise the overall average level without fear of overloading the system. In contrast, a programme that requires more headroom must maintain lower average power levels. In this manner, broadcast stations that maintain higher average power levels will sound louder than stations that maintain lower average levels, even when peak levels are equal. Similarly within a broadcaster’s programme schedule, programmes that maintain higher average levels will be louder than adjacent programming recorded at lower average levels.
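The effect described above, two signals with equal peaks but different average power, can be illustrated with a minimal Python sketch. It assumes a very simplified memoryless compressor (real broadcast compressors use attack and release time constants), and the threshold, ratio, test tone and function names are all hypothetical choices made for illustration.

```python
import math

def peak(x):
    """Largest absolute sample value."""
    return max(abs(s) for s in x)

def rms_db(x):
    """Average power of the signal in dB relative to full scale."""
    return 10 * math.log10(sum(s * s for s in x) / len(x))

def compress(x, threshold=0.25, ratio=4.0):
    """Simplified memoryless compressor (illustrative assumption):
    above the threshold, level growth in dB is divided by `ratio`."""
    out = []
    for s in x:
        a = abs(s)
        if a > threshold:
            a = threshold * (a / threshold) ** (1.0 / ratio)
        out.append(math.copysign(a, s))
    return out

def normalise_peak(x, target=1.0):
    """Scale the signal so that its peak equals `target`."""
    g = target / peak(x)
    return [g * s for s in x]

# Hypothetical test signal: a 1 kHz tone that is quiet for the first
# half second and loud for the second half second.
fs = 48000
tone = [(0.1 if n < fs // 2 else 1.0) * math.sin(2 * math.pi * 1000 * n / fs)
        for n in range(fs)]

original = normalise_peak(tone)
processed = normalise_peak(compress(tone))

print("peak  original / processed:", peak(original), peak(processed))
print("power original / processed (dB):", rms_db(original), rms_db(processed))
```

Both versions reach the same peak level, yet the compressed version carries more average power and would therefore be expected to sound louder.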
Digital television in the USA and Canada includes a mechanism known as metadata that can be used to correct loudness differences between programmes. Metadata is information that describes various characteristics of the audio signal, and travels along with the audio bitstream of a given programme. In the ATSC system, the metadata parameter that indicates loudness is called dialnorm. It informs devices of the loudness level of the incoming signal, thus allowing these devices (such as the home audio decoder) to make the necessary corrections to ensure that all programmes are played at the same preset loudness level.
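The correction a decoder makes from dialnorm can be sketched as follows. This is a minimal illustration, not a description of any particular decoder: it assumes the decoder attenuates each programme so that the level indicated by dialnorm lands at a preset reference of -31 dBFS (a figure that is not stated in this report), and the function names are hypothetical.

```python
# Assumption (not stated in this report): the decoder normalises the level
# indicated by dialnorm to a preset reference of -31 dBFS.
DECODER_REFERENCE_DB = -31.0

def decoder_attenuation_db(dialnorm):
    """Attenuation the decoder applies for a given dialnorm value."""
    return dialnorm - DECODER_REFERENCE_DB

def playback_loudness(programme_lkfs, dialnorm):
    """Loudness heard at home after the decoder's correction."""
    return programme_lkfs - decoder_attenuation_db(dialnorm)

# Correct metadata: programmes at different loudness play back at the same level.
print(playback_loudness(-24.0, dialnorm=-24.0))
print(playback_loudness(-20.0, dialnorm=-20.0))

# Incorrect metadata: a -24 LKFS programme labelled as -31 receives no
# attenuation and plays back 7 dB louder than a correctly labelled one.
print(playback_loudness(-24.0, dialnorm=-31.0))
```

The last call shows why the accuracy of the transmitted value matters: the decoder can only correct for the loudness it is told about.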
This mechanism allows programme providers to operate at loudness levels that are appropriate to their practices. However, it also relies on the accuracy of the transmitted loudness metadata and on their correct handling throughout the broadcast chain so that loudness correction can operate accurately in the home. The incorrect or inconsistent setting of dialnorm can lead to significant differences in loudness between programmes, even when the programme loudness levels are the same. Examples of inconsistent use of metadata include the following:
The most commonly used instruments for monitoring and measuring time-varying audio signal levels are volume-unit (VU) and peak-programme (PPM) meters. The fast-reacting nature of the PPM meter makes it particularly useful to monitor peak signal levels and the usage of available headroom. For this reason, PPM measurement specifications have been an important component in broadcast guidelines. However, neither instrument indicates loudness. Rather, the operator must infer loudness from the continuously changing meter readings, a process requiring interpretation skills which may introduce errors and inconsistencies. A further shortcoming of the VU and PPM meters is that neither accounts for the frequency selectivity of the human ear.
The effort to adopt a universal method for measuring subjective loudness was initiated at the International Telecommunication Union Radiocommunication Sector (ITU-R) in September 2000 when Working Party 6P and Study Group 6 approved Question ITU-R 2/6 titled “Audio metering characteristics suitable for use in digital sound production”. In this Question, the ITU-R recognized that “current knowledge of human psychoacoustics may make it possible to create a metering algorithm that would provide for indication of perceived loudness” and that “the state of digital signal processing makes it practical to implement complex algorithms into cost-effective devices”. The main applications contemplated by the ITU-R for such a loudness meter were to facilitate programme delivery and programme exchange between broadcasters as well as loudness measurement in programme production and post-production.
A Special Rapporteur Group (SRG) was formed with the task of identifying an objective method to measure the perceived loudness of typical broadcast material. The initial phase of the SRG study was to examine the perceived loudness of typical broadcast material. Audio material was collected from the study participants and a subjective test method was devised. A general description of the 96 selected test sequences is provided in Table 1.
Table 1: Breakdown and description of audio items used in ITU-R subjective tests.
| Audio content description | Number of items |
|---|---|
| Drama (dialogue with background sounds) | 4 |
| Speech with background music | 22 |
| Speech with background sounds | 28 |
| Music with lead singer | 6 |
| Singing voice with no instruments | 4 |
| Sound effects with no speech | 2 |
A series of formal subjective tests were conducted at five separate locations in Canada, Australia and the United Kingdom. For the test, subjects were tasked with matching the loudness of the various audio test clips to a reference speech clip. A total of 97 subjects participated in the subjective tests, and the results were used to create a database of perceived loudness values for the 96 monophonic audio clips of Table 1.
Based on performance criteria defined by the SRG members and meter proponents, the meter that best predicted the subjective loudness of the test items shown in Table 1 was chosen to become the basis for the ITU-R recommendation.
To further validate the performance of the meter selected by the SRG, two additional rounds of subjective tests were conducted at the CRC with this meter only. The first of these two additional tests used 96 new monophonic audio sequences while the second used 144 mono, stereo and multichannel sequences. The performance of the ITU-R loudness meter is shown in Figure 1 where objective versus subjective loudness is plotted for all three subjective tests, totaling 336 audio sequences (mono, stereo and multichannel). A correlation of 0.977 indicates a very good agreement between objective and subjective loudness. Subsequent subjective testing by other researchers confirmed the performance of the ITU-R loudness algorithm relative to more complex psychoacoustic models.
Figure 1 : Comparison of objective versus subjective loudness for the ITU-R Loudness Meter
The loudness algorithm, supplemented with input from other administrations, was formally adopted by the ITU-R in 2006 with the publication of ITU-R Recommendation BS.1770 “Algorithms to measure audio programme loudness and true-peak audio level”. Additional clarifications were added in a 2007 re-issue of the Recommendation (BS.1770-1) without making any changes to the measurement algorithm.
In October 2010, a “gating” method was added to the loudness algorithm to further ensure accurate loudness readings when measuring audio containing extended silence or quiet passages. These changes should appear in a revised version of ITU-R Rec. BS.1770 expected in spring 2011.
One of the distinctive features of the ITU-R BS.1770 loudness metering algorithm is its relatively low implementation complexity. A simplified description of the algorithm is shown in Figure 2.
Figure 2 : Simplified diagram of the ITU-R BS.1770 loudness algorithm
The loudness metering algorithm can be used to measure monophonic, stereo, and multichannel audio signals. It works by averaging (Mean square) the frequency-weighted (K filter) signal power for each input audio channel (Left, Right, Centre, Left surround, Right surround) over a desired measurement interval. The average power of each channel is then summed using appropriate channel gains G. The K filter mimics the frequency selectivity of the human ear at low frequencies. The gain G accounts for the angle of arrival of the sound relative to the head. The integration time interval is not specified, so programmes of different lengths can be measured and compared. In practice, the algorithm is typically used to measure a “long term” average loudness over the entire duration of the programme (e.g. TV programmes, commercials, etc.).
The ITU-R has designated the loudness measurement units as LKFS, where L means Loudness, K means K-weighted and FS means relative to Full Scale. A change in loudness of 1 LKFS is equivalent to a 1 dB change in signal level.
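The steps described above (K filter, mean square, weighted sum over channels) can be sketched in Python as follows. This is an illustrative, ungated sketch rather than a reference implementation: the filter coefficients, the surround gain of 1.41 and the -0.691 offset are taken from the published ITU-R BS.1770 text for 48 kHz sampling and are not given in this report, while the function names and the test tone are hypothetical.

```python
import math

# K filter as two cascaded second-order sections, using coefficients
# tabulated in ITU-R BS.1770 for 48 kHz sampling (not listed in this report).
PRE_FILTER = ((1.53512485958697, -2.69169618940638, 1.19839281085285),
              (1.0, -1.69065929318241, 0.73248077421585))
RLB_FILTER = ((1.0, -2.0, 1.0),
              (1.0, -1.99004745483398, 0.99007225036621))

# Channel gains G: unity for front channels, about +1.5 dB for surrounds.
CHANNEL_GAIN = {"L": 1.0, "R": 1.0, "C": 1.0, "Ls": 1.41, "Rs": 1.41}

def biquad(x, coeffs):
    """Apply one second-order IIR section (direct form I)."""
    (b0, b1, b2), (_, a1, a2) = coeffs
    x1 = x2 = y1 = y2 = 0.0
    y = []
    for s in x:
        out = b0 * s + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, s
        y2, y1 = y1, out
        y.append(out)
    return y

def loudness_lkfs(channels):
    """Ungated loudness of a dict {channel name: samples at 48 kHz}.
    The input must not be pure silence (log of zero is undefined)."""
    total = 0.0
    for name, samples in channels.items():
        weighted = biquad(biquad(samples, PRE_FILTER), RLB_FILTER)
        mean_square = sum(v * v for v in weighted) / len(weighted)
        total += CHANNEL_GAIN[name] * mean_square
    return -0.691 + 10 * math.log10(total)

# Hypothetical test: one second of a full-scale 1 kHz tone in the left channel.
fs = 48000
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]
full = loudness_lkfs({"L": tone})
half = loudness_lkfs({"L": [0.5 * s for s in tone]})
print(full, full - half)
```

Under these assumptions a full-scale 1 kHz tone in a single front channel should read close to -3 LKFS, and halving the signal amplitude (about -6 dB) lowers the reading by the same number of LKFS, consistent with the 1 LKFS per 1 dB relationship stated above.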
In March 2010, the ITU-R published Recommendation BS.1864 “Operational practices for loudness in the international exchange of digital television programmes”. This document recommends a target loudness level of -24 LKFS for the international exchange of digital television programmes.
In the United States and Canada, digital television conforms to the ATSC A/53 standard. The publication by the ITU-R in 2006 of an internationally recognized method for measuring the nominal loudness of a programme led the television broadcast industry to update operational practices to address the problem of undesired loudness variations.
In 2008, the Advanced Television Systems Committee (ATSC) began drafting guidelines to assist the television industry in providing consistent loudness levels to their listeners. This effort culminated with the publication in November 2009 of Recommended Practice A/85 “Techniques for Establishing and Maintaining Audio Loudness for Digital Television”. This document provides a detailed guideline for loudness management in the creation, distribution and emission of television content using devices that implement the technique of ITU-R Recommendation BS.1770 to measure loudness. A key component of this guideline is the correct use of the loudness metadata (dialnorm) that accompanies the digital audio bitstream (see section 2.2). In the absence of dialnorm, the target loudness level of the audio should be -24 LKFS. When dialnorm is used, the A/53 standard requires that the dialnorm value indicate the loudness level of the “audio content (typically of the average spoken dialogue)”.
In December 2010 the United States government enacted the Commercial Advertisement Loudness Mitigation (CALM) Act, directing the FCC to prescribe regulations incorporating the loudness management practices described in ATSC A/85 to control and limit the loudness of commercials. As of March 2011, the ATSC is in the process of revising document A/85 to make it a regulation rather than a recommended practice. Once the regulations are adopted, the television industry will have one year to implement them.
In parallel with the efforts of the ATSC, members of the European Broadcasting Union (EBU) established a study group (P/LOUD) to investigate new practices with the goal of providing more consistent loudness levels to the listener. The scope of their work was to include radio as well as television systems that did not include audio metadata. Their work also included investigating new forms of loudness and peak signal metering to complement or to replace existing VU and PPM meters.
In August 2010 the EBU published Recommendation R128 “Loudness normalization and permitted maximum level of audio signals”. EBU R128 recommends that the loudness of audio be measured using the ITU-R BS.1770 method supplemented by a “gating” method to account for silent and soft passages in the audio signal. Moreover, programmes should be normalized to a target loudness level of -23 LKFS with a tolerance of ±1 LU (loudness unit, equivalent to 1 dB). This is in contrast to the -24 LKFS target level recommended by the ITU-R and by the ATSC for programmes not accompanied by metadata.
Supporting documents include EBU Tech. 3341, 3342, 3343 and 3344, which describe in greater detail such aspects as measurement methods, loudness range, loudness metering characteristics, and production, distribution and transmission guidelines.
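Since a change of 1 LKFS corresponds to a 1 dB change in signal level, normalizing a programme to a target is a matter of applying a fixed gain. The following minimal sketch shows the arithmetic for the two targets mentioned above; the function names and the example measurement of -18.5 LKFS are hypothetical.

```python
EBU_TARGET_LKFS = -23.0    # EBU R128 target
ATSC_TARGET_LKFS = -24.0   # ITU-R / ATSC target for programmes without metadata

def normalisation_gain_db(measured_lkfs, target_lkfs):
    """Gain in dB that brings a programme from its measured loudness to the target."""
    return target_lkfs - measured_lkfs

def within_tolerance(measured_lkfs, target_lkfs=EBU_TARGET_LKFS, tolerance=1.0):
    """True if the measured loudness lies within the permitted deviation."""
    return abs(measured_lkfs - target_lkfs) <= tolerance

# Hypothetical programme measured at -18.5 LKFS.
print(normalisation_gain_db(-18.5, EBU_TARGET_LKFS))   # gain for the EBU target
print(normalisation_gain_db(-18.5, ATSC_TARGET_LKFS))  # gain for the ATSC target
print(within_tolerance(-23.8), within_tolerance(-24.5))
```

The two targets differ by 1 dB, so material normalized for one region needs only a 1 dB gain offset to meet the other.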
In the United Kingdom, radio and television broadcasters are licensed by the independent regulator Ofcom. With regard to the loudness of television advertisements, broadcasters must adhere to the rules set out in the Broadcast Committee of Advertising Practice (BCAP) Code. Section 4.7 of the BCAP Code stipulates that:
“Advertisements must not be excessively noisy or strident. The maximum subjective loudness of advertisements must be consistent and in line with the maximum loudness of programmes and junction material.
Broadcasters must endeavour to minimise the annoyance that perceived imbalances could cause, with the aim that the audience need not adjust the volume of their television sets during programme breaks. For editorial reasons, however, commercial breaks sometimes occur during especially quiet parts of a programme, with the result that advertisements at normally acceptable levels seem loud in comparison.
Measurement and balancing of subjective loudness levels should preferably be carried out using a loudness-level meter, ideally conforming to ITU recommendations². If a peak-reading meter is used instead, the maximum level of the advertisements must be at least 6dB less than the maximum level of the programmes to take account of the limited dynamic range exhibited by most advertisements.
² The relevant ITU recommendations are ITU-R BS1770 Algorithms to measure audio programme loudness and true-peak audio level and ITU-R BS1771 Requirements for loudness and true-peak indicating meters.”
It is worth mentioning that the BCAP Code does not specify any target loudness level for programmes or advertisements measured with the ITU-R BS.1770 meter.
In Australia, FreeTV Australia (FTVA) is the industry body representing free-to-air television broadcasters. In July 2010 the organization published Operational Practice OP-59, laying out a set of recommendations describing how television broadcasters are to manage and control the loudness of programmes, promotional spots and commercial advertisements. It is intended as an aid to avoiding the excessive loudness contrasts that can be annoying to consumers.
FTVA receives its mandate in this area from Australia’s television licensing authority, the Australian Communications and Media Authority (ACMA). While adherence to the guidelines of OP-59 is on a voluntary basis, the broadcasters have signaled their desire to abide by them.
OP-59 recommends that television programmes not employing metadata be normalized to -24 LKFS as measured with the ITU-R BS.1770 algorithm. Where metadata is in use, the metadata value that indicates loudness should correspond to the loudness measured as per ITU-R BS.1770.
Table 2 shows a partial list of available loudness meters that implement the ITU-R BS.1770 loudness measurement algorithm. The meters are offered as hardware or software products. In this list, a total of 60 loudness meters made by 24 manufacturers are identified. The increasing variety and availability of loudness metering devices is an indication of a growing demand within the industry for such capabilities and of the widespread acceptance of the ITU-R loudness measurement algorithm.
Table 2: Partial list of available loudness meters that use ITU-R BS.1770.
| Manufacturer | Product | Type (H = hardware, S = software) | Website |
|---|---|---|---|
| Broadcast Project Research | LDB ITU | H | http://www.bpr.org.uk/ |
| Grimm Audio | Level One | S | http://www.grimmaudio.com |
| IK Multimedia | T-RackS 3 | S | http://www.ikmultimedia.com |
| Image Line Software | Wave-Candy | S | http://www.image-line.com |
| Junger Audio | T*AP Television Processor | H | http://www.junger-audio.com |
With the adoption by the ITU-R in 2006 of an objective method to measure the average loudness of an audio programme, the broadcast industry has begun the transition from normalizing audio based on maximum signal levels to normalizing audio to average loudness while maintaining maximum levels below the technical limits of the audio system.
The efforts to modify practices and to update specifications are evidenced by the release in 2009 of Recommended Practice A/85 “Techniques for Establishing and Maintaining Audio Loudness for Digital Television” by the ATSC, and in 2010 of Recommendation R128 “Loudness normalization and permitted maximum level of audio signals” by the EBU. Supporting these efforts has been the increasing availability of commercial products that measure loudness using the method described in ITU-R BS.1770.
The key motivation behind ATSC A/85 and EBU R128 has been to address the problem of undesired loudness changes. While traditionally advertisements have been seen as the main source of loudness problems, the recommendations in these documents apply to all programming.
1. A modified version of the gating method described in the EBU documents has been adopted by the ITU-R and will be incorporated into a revised version of ITU-R Rec. BS.1770 slated for release in spring 2011. The EBU is expected to modify their measurement system to conform to that of the ITU-R once the revised ITU-R Rec. BS.1770 is released.
1. William A. Yost, “Fundamentals of Hearing – An Introduction”, Academic Press, San Diego, USA, 1994.
2. Brian C. J. Moore, “An Introduction to the Psychology of Hearing”, Academic Press, San Diego, USA, 1989.
9. Seefeldt, Alan and Lyman, Steve, “A Comparison of Various Multichannel Loudness Measurement Techniques”, 121st Convention of the Audio Engineering Society, 2006
15. EBU Technical Recommendation R128, “Loudness Normalisation and Permitted Maximum Level of Audio Signals”, Geneva, 2010. [http://tech.ebu.ch/docs/r/r128.pdf]
16. EBU Tech Doc 3341, “Loudness metering: ‘EBU Mode’ metering to Supplement Loudness Normalisation in Accordance with EBU R128”, Geneva, 2010. [http://tech.ebu.ch/docs/tech/tech3341.pdf]
17. EBU Tech Doc 3342, “Loudness Range: A Descriptor to Supplement Loudness Normalisation in Accordance with EBU R128”, Geneva, 2010. [http://tech.ebu.ch/docs/tech/tech3342.pdf]
18. EBU Tech Doc 3343, “Practical Guidelines for Production and Implementation in Accordance with EBU R128”, Geneva, 2011. [http://tech.ebu.ch/docs/tech/tech3343.pdf]