Year : 2007  |  Volume : 9  |  Issue : 34  |  Page : 8--14

The reliability of the noise sensitivity questionnaire in a cross-national analysis

Stephan Sandrock, Martin Schutte, Barbara Griefahn 
 Institut für Arbeitsphysiologie an der Universität Dortmund, Ardeystraße, Dortmund, Germany

Correspondence Address:
Stephan Sandrock
Institut für Arbeitsphysiologie an der Universität Dortmund, Ardeystraße 67, 44139 Dortmund


Noise sensitivity is regarded as a relevant predictor for annoyance reactions. Since many studies have focused on noise sensitivity at an international level, the present analysis was conducted to detect national peculiarities concerning noise sensitivity. Using the approach of the generalizability theory, reliability of the noise sensitivity questionnaire was analyzed taking into consideration relevant facets assumed to contribute to the measurement error. A total of 126 individuals from seven European countries participated in this study. The reliability coefficients for the global noise sensitivity score ranged from 0.90 to 0.91. It was determined that the translated questionnaires are comparable.

How to cite this article:
Sandrock S, Schutte M, Griefahn B. The reliability of the noise sensitivity questionnaire in a cross-national analysis. Noise Health 2007;9:8-14

Full Text

Environmental noise, especially traffic noise, is known to cause annoyance in many people. [1] According to Miedema and Vos, [2] who reviewed 28 studies with respect to noise sensitivity, the percentage of individuals annoyed is mainly related to the noise exposure level. The respective dose-response relations nevertheless show considerable variance due to individual and situational factors. Many studies examining annoyance have dealt with noise sensitivity as one predictor of human responses to noise. [3],[4] Noise sensitivity is considered to be a stable personality trait, which is reflected in an individual's attitude toward noise sources [5],[6] and constitutes a decisive predictor of the degree of annoyance in relation to environmental noises. [7],[8]

Various questionnaires for the assessment of noise sensitivity have been developed. The Weinstein noise sensitivity scale [5] was developed for assessing global noise sensitivity. Since the measurement properties and the applicability of this questionnaire were tested in college students, it has been criticized as inappropriate for other populations. [9] Because of this limited applicability of the Weinstein scale, Zimmer and Ellermeier [5],[9] developed another questionnaire, the "Lärmempfindlichkeitsfragebogen" (LEF), to measure global noise sensitivity. In this questionnaire, the subjects are asked to rate their affective, cognitive, and behavioral reactions to noises by means of 52 items. Further, Zimmer and Ellermeier reported a four-factorial structure explaining 37.5% of the variance, suggesting that noise sensitivity depends on different daily situations. Consequently, the noise sensitivity questionnaire (NoiSeQ) was developed, [10],[11] which assesses total noise sensitivity with respect to five different daily situations, namely, work, communication, leisure, sleep, and habitation. Accordingly, this questionnaire consists of five subscales with seven items each. Each item is a statement with which the participants indicate their degree of agreement.

The Weinstein scale has been translated into different languages and used in different countries. [6],[12],[13],[14],[15],[16],[17] A major drawback is that the reliability of the different versions of this questionnaire was determined by calculating Cronbach's alpha coefficient, [18],[19] which is a measure of internal consistency. [12],[20] Since only coefficients of internal consistency are available, there is no information on potential systematic differences between the various language versions. However, such data would be particularly helpful when comparing the results of studies conducted in different countries. Several studies analyzing noise annoyance in relation to noise sensitivity differ in their results with regard to the influence of noise sensitivity. [2],[21] One possible reason for these dissimilarities might be differences in the measurement properties of the translated versions. When a noise sensitivity measuring instrument is tested simultaneously in several countries, it is possible to obtain information on the comparability of the various translated versions. Well-elaborated, [22] translated, [23],[24] and interculturally tested instruments for measuring noise annoyance are already available. [25]

Since NoiSeQ permits differentiated and reliable measuring of noise sensitivity, the questionnaire written in the German language was translated into several other languages to allow consistent recordings of noise sensitivity in various countries. The present study investigates the reliability and the comparability of several translated versions of NoiSeQ.


First, NoiSeQ was translated into English with the help of an expert panel. This expert panel consisted of scientists with long-term experience in research on the effects of noise. All of these scientists spoke English, and they were expected to further translate the questionnaire into their own languages. Each item of the German version was explained and translated into English. After the preparation of the English version, the scientists translated the questionnaire into Dutch, French, Italian, Hungarian, and Swedish. This approach follows Harkness's [26] guidelines and is one of the most commonly used in translating questionnaires. Five basic procedures, namely, translating, reviewing, adjudicating (deciding on a version), pretesting, and finally, documenting, were followed to develop the final version of the translated questionnaires; these procedures had to be repeated several times before the final versions were reached. The expert committee translation, review, and adjudication were performed in accordance with Harkness's guidelines. [26] When questionnaires are translated, several problems with regard to the intended meanings may occur; [27] therefore, special attention was paid to national idioms and adequate translation of the answer scale.

Reliability analysis

Generalizability theory (G-theory)

Since the examination of the reliability of NoiSeQ is based on G-theory, [28] further potential facets contributing to error variance were analyzed. According to Cronbach et al. [28] and Webb and Shavelson, [29] G-theory is a statistical theory concerning the dependability of behavioral measurements. In comparison to classical test theory, G-theory has the advantage that it allows the simultaneous estimation of several facets that contribute to measurement error and thus affect the reliability of an instrument. [28],[30]

G-theory is based on the statistical model of analysis of variance (ANOVA). By means of ANOVA, the total variance of the data is partitioned according to the independent variables in the design. [31],[32] A distinction is made between the generalizability (G) study and the decision (D) study. In the G-study, the effect of the sources of variance that are assumed to contribute to measurement error (also called facets) is quantified by estimating variance components on the basis of the mean squares obtained from the corresponding ANOVA. Facets can be assumed to be either fixed or random, in correspondence with the factors in ANOVA. Different parameter values within a facet are defined as conditions, which are analogous to levels in ANOVA. If the conditions within a facet are a random sample from the universe that defines the facet itself, it is legitimate to generalize to other but similar conditions of that facet. On the other hand, a fixed facet implies that either the whole population of interest is encompassed or the generalization is limited to the conditions included in the design. Since the individual level of noise sensitivity is assessed, the matter of interest is the reliability with which NoiSeQ is able to differentiate between individuals, and these individuals are therefore the object of measurement. In the subsequent D-study, the coefficient of generalizability for the intended measurement purpose is determined. This coefficient (ρ²) varies between 0 and 1 and indicates (for relative decisions) how well a person can be located relative to other individuals of the population on the basis of the measurement. The absolute coefficient phi (φ) needs to be calculated to draw conclusions about the absolute position of a person.
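In the notation commonly used in G-theory textbooks, the two coefficients described above can be written as below. The symbols are assumptions introduced here for illustration and are not taken from this article: \sigma^2_\tau denotes the universe-score variance of the object of measurement (here, persons), \sigma^2_\delta the relative error variance, and \sigma^2_\Delta the absolute error variance.

```latex
\rho^2 = \frac{\sigma^2_\tau}{\sigma^2_\tau + \sigma^2_\delta},
\qquad
\varphi = \frac{\sigma^2_\tau}{\sigma^2_\tau + \sigma^2_\Delta},
\qquad
\sigma^2_\Delta \ge \sigma^2_\delta
```

Because the absolute error variance contains every term of the relative error variance plus the main effects of the random facets, \varphi cannot exceed \rho^2.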

Reliability of the German version

The reliability results for the global and subscale scores of the German version of NoiSeQ are shown in [Table 1]. According to ISO 10075-3:2004, the instrument should satisfy certain specifications depending on the purpose of measurement. An instrument is applicable for precision measurements if the relative or absolute G-coefficient is ≥0.9. For screening and orienting measurements, reliabilities of ≥0.8 and ≥0.7, respectively, are recommended. According to this classification scheme, the global score satisfies the requirements for precision measurements. The subscale leisure fails to reach the minimal required value of 0.7. The subscale sleep allows screening measurements. The remaining scales can be used for orienting measurements.
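The classification scheme described above can be sketched as a small lookup. This is only an illustration of the three thresholds quoted in the text (0.9, 0.8, 0.7); the function name and the label "insufficient" are hypothetical, and apart from 0.92 the example values are made up.

```python
def iso_measurement_level(g_coefficient: float) -> str:
    """Map a relative or absolute G-coefficient to the level named in the text."""
    if not 0.0 <= g_coefficient <= 1.0:
        raise ValueError("a G-coefficient must lie between 0 and 1")
    if g_coefficient >= 0.9:
        return "precision"
    if g_coefficient >= 0.8:
        return "screening"
    if g_coefficient >= 0.7:
        return "orienting"
    return "insufficient"  # hypothetical label for values below 0.7


# 0.92 is the global relative coefficient reported later in the article;
# the other three values are hypothetical.
for value in (0.92, 0.85, 0.72, 0.60):
    print(value, iso_measurement_level(value))
```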

Reliability of the translated versions

The aim of the present study was to determine whether the different translations of NoiSeQ were comparable to each other. Similarity amongst the different translations can be assessed if no cross-cultural effects occur. In the analysis, the individuals represented the objects of measurement. Furthermore, the facets translations (country), subscales, and items are taken into consideration for the ANOVA model. The countries represent a random sample from the European Community, and therefore, this facet is treated as a random effect. Furthermore, the items are considered as a random sample from the population of all possible items describing noise sensitivity. In contrast, five subscales are assumed to reflect the relevant daily situations. Therefore, this facet is regarded as a fixed effect for ANOVA. Facets cannot be combined completely, since one participant belonged to only one country and an item represented only one subscale. Therefore, the ANOVA was based on a hierarchical model with individuals nested within countries, and items nested within subscales.

 Materials and Methods


A total of 126 participants, 18 from each of seven countries (Belgium, France, Germany, Hungary, Italy, Sweden, United Kingdom), comprised the study sample. The participants were grouped into age bands of 10 years. Their age range was 18-70 years. Women comprised 41.3% of the study sample. The participants were asked to complete the questionnaire on a PC prior to participating in an experiment. They were paid for their participation.

G-study - global assessment of noise sensitivity

First, a four-factorial ANOVA (factor 1: person (P) nested within country, factor 2: country (C), factor 3: subscale (S), and factor 4: item (I) nested within subscale) was performed. Based on the calculated mean squares the components of variance were estimated.


Based on the estimated variance components [Table 2], it was observed that the difference between the countries was small; only 0.8% of variance was explained by the facet country. Furthermore, the interactions of the facet country involved small proportions of variance; 0.8% of variance can be traced back to the interaction C*S and 3.12% of variance is explained by the interaction C*I:S.

Inter-individual differences explained 18% of variance. Due to the hierarchical design, the extent to which the P*C interaction accounts for variance cannot be determined.

The subscales do not differ systematically and therefore seem to be equivalent with respect to their mean values. The facet item accounts for 13.8% of variance, indicating inconsistencies in the content of the items. Furthermore, the interaction S*P:C accounts for 9.2% of variance, which leads to the assumption that the level of individual noise sensitivity varies according to the subscale and therefore depends on the situation considered. The large proportion of residual variance (54.9%) is due to the interaction P*I:CS and unsystematic error. Since no negative estimates of variance components were observed, the chosen model can be regarded as appropriate. In conclusion, the results show that the facet country need not be considered as relevant in the following D-study.

D-study - global assessment of noise sensitivity

The D-study aimed at determining the reliability of NoiSeQ taking into consideration only the relevant facets that contributed to error, namely, scales and items. Data from G-study were used in the analysis.

[Table 3] shows the results of the ANOVA. In addition, the variance components and the corresponding standard errors are given. Since the latter are small, it can be concluded that the estimates of the variance components are stable. The variance component for individuals accounts for more than 18% of variance. The proportion of variance related to the facet item is about 14%, which supports the assumption that the level of the ratings depends on the description of the situations. Furthermore, individuals differ across the different daily situations, as can be concluded from the variance component due to the interaction P*S. The residual term accounts for about 58% of variance and is related to the interaction P*I:S and further systematic or unsystematic error.

The estimated variance components are then used to calculate the G-coefficient, whereby in terms of G-theory a distinction is made between the relative and the absolute G-coefficient. [31],[32],[33] The relative G-coefficient corresponds to an intra-class correlation coefficient. It should be noted that, if the global score is of interest, reliability refers to the average value for one person based on the mean of seven items in each of five subscales.


Equation 1: relative G-coefficient


Equation 2: absolute G-coefficient

According to these equations, the relative G-coefficient amounts to 0.92 and the absolute G-coefficient amounts to 0.90. Equation 2 shows that the absolute coefficient is slightly lower because the variance component of the facet item is also counted as measurement error. Nonetheless, both coefficients reach the precision measurement level recommended by ISO 10075-3:2004. [34] In addition, confidence intervals have been calculated for the global score based on the mean of all 35 rating values. The confidence intervals were ± 0.27 (α = 0.05) and ± 0.36 (α = 0.01), respectively.
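Since Equations 1 and 2 did not survive in this copy, the following is a minimal sketch rather than a reproduction of the authors' formulas. It assumes the textbook form for persons crossed with items nested in fixed subscales, in which the person-by-subscale interaction is averaged into the universe score. The inputs are the rounded percentages quoted in the text (18, 14, 58); the value 10 for P*S is not reported and is inferred here as the remainder. The function and parameter names are hypothetical.

```python
def g_coefficients(var_p, var_ps, var_item, var_res, n_s=5, n_i=7):
    """Return (relative, absolute) G-coefficients for the global mean score.

    Assumed design: persons crossed with items nested in FIXED subscales.
    """
    n_total = n_s * n_i                          # 35 ratings per person
    universe = var_p + var_ps / n_s              # fixed facet: P*S joins the universe score
    rel_error = var_res / n_total                # residual (P*I:S plus error)
    abs_error = (var_item + var_res) / n_total   # item effect added for absolute decisions
    relative = universe / (universe + rel_error)
    absolute = universe / (universe + abs_error)
    return relative, absolute


# Rounded shares of variance in percent; 10 for P*S is an inferred remainder.
relative, absolute = g_coefficients(var_p=18, var_ps=10, var_item=14, var_res=58)
print(round(relative, 3), round(absolute, 3))
```

With these rounded inputs the sketch gives values close to the reported 0.92 and 0.90; small deviations are to be expected because the exact variance components are only available in [Table 3].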

D-study - assessment of reliability for the subscales

For the determination of the reliability of each subscale, two-factorial ANOVA was performed. [Table 4] presents the variance components and their corresponding standard errors. In total, these standard errors are relatively small, and therefore, the estimated components of variance show no instabilities. The proportions of variance due to the factor person range between 13.4% (leisure) and 39.1% (sleep).

The proportion of variance explained by the facet item varies between 6.6% (habitation) and 23% (leisure). The residual varies between 51.5% (communication) and 74.2% (habitation).
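How variance components follow from ANOVA mean squares can be sketched for a single subscale. The sketch assumes a fully crossed person-by-item design with one rating per cell and uses the standard expected-mean-squares estimators; the article does not spell out its estimation procedure, and the ratings below are hypothetical.

```python
def variance_components(scores):
    """Estimate person, item and residual variance components.

    scores: list of rows, one row per person, one column per item.
    """
    n_p, n_i = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n_p * n_i)
    person_means = [sum(row) / n_i for row in scores]
    item_means = [sum(row[j] for row in scores) / n_p for j in range(n_i)]

    # Mean squares of the two-way ANOVA without replication
    ms_p = n_i * sum((m - grand) ** 2 for m in person_means) / (n_p - 1)
    ms_i = n_p * sum((m - grand) ** 2 for m in item_means) / (n_i - 1)
    ss_res = sum(
        (scores[p][j] - person_means[p] - item_means[j] + grand) ** 2
        for p in range(n_p)
        for j in range(n_i)
    )
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))

    return {
        "person": (ms_p - ms_res) / n_i,
        "item": (ms_i - ms_res) / n_p,
        "residual": ms_res,  # person-by-item interaction confounded with error
    }


# Hypothetical ratings: 3 persons x 2 items, purely additive (no interaction).
demo = [[1, 2], [2, 3], [3, 4]]
components = variance_components(demo)
total = sum(components.values())
for name, value in components.items():
    print(name, round(value, 3), f"{100 * value / total:.1f}%")
```

Expressing each component as a share of the total, as in the last loop, gives percentages of the kind reported for the subscales.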

With regard to the subscale leisure, the estimated variance component for the facet item is larger than the component for the object of measurement (person). It can be assumed that this finding results from the fact that items 6 and 7 are rated much lower, on average, than the other items of this scale, as shown in [Figure 1].

According to the coefficients calculated using the equations shown in [Table 5], the relative G-coefficients of the scales sleep, communication, and work exceeded the value of 0.8 recommended for screening measurements by ISO 10075-3:2004. The scales leisure and habitation did not reach the lower limit of 0.7. Regarding the absolute coefficients, the scale sleep satisfies the level for screening measurements, whereas the scales communication and work allow for orienting purposes. The subscales habitation and leisure were below the lower limit of 0.7.

To raise the reliability, each scale can be lengthened; the number of items needed to meet a required criterion can be estimated with the Spearman-Brown formula. [31] Therefore, the number of items needed to reach a coefficient of 0.7 for the insufficient subscales was calculated. With regard to the subscale leisure, 12 items would be necessary for the relative G-coefficient to take a value above 0.7 (orienting measurements), and 15 items would be required for the absolute coefficient. Concerning the subscale habitation, a G-coefficient of ≥ 0.7 would require 9 items for relative decisions and 10 items for absolute decisions.
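The projection described above can be sketched as a search for the smallest scale length that reaches a target coefficient. The sketch assumes a single subscale in which the error variance of the mean shrinks in proportion to the number of items; the function name is hypothetical. The leisure inputs are the rounded percentages from the text, with the residual share (63.6) derived here as 100 - 13.4 - 23 rather than reported.

```python
def items_needed(var_person, var_error_single, target=0.7, max_items=100):
    """Smallest number of items whose mean score reaches the target coefficient.

    var_error_single: error variance for a single item (residual only for
    relative decisions; item + residual for absolute decisions).
    """
    for n_items in range(1, max_items + 1):
        coefficient = var_person / (var_person + var_error_single / n_items)
        if coefficient >= target:
            return n_items
    raise ValueError("target not reached within max_items")


# Leisure, relative decisions: person 13.4 %, derived residual 63.6 %.
print(items_needed(13.4, 63.6))
```

For the relative coefficient of the leisure scale this returns 12, in line with the text. Because the inputs are rounded percentages, cases that sit very close to the 0.7 threshold can come out one item higher or lower than the figures the authors obtained from the unrounded variance components.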


The purpose of this study was to ascertain whether the different translations of NoiSeQ are comparable to each other. If biases due to translation can be ruled out, [35] the instrument can be assumed to be applicable in comparative cross-cultural studies, for example, studies of noise annoyance in relation to noise sensitivity.

The results of this study suggest that not only global noise sensitivity but also noise sensitivity related to work, sleep, and communication can be measured with sufficient reliability using this questionnaire.

However, the subscales habitation and leisure do not allow reliable measurements. The results for the subscale leisure were striking, since the variance due to the facet item is much larger than the variance explained by the individuals. This may be due to the items "When I am at home I find it uncomfortable if the radio or TV is left on in the background" (no. 6) and "I avoid leisure activities which are loud" (no. 7), since these items are rated much lower than the other items on this scale. It can be assumed that it is a common habit amongst most people to switch on the radio to hear the news, and therefore, item 6 is not adequate for measuring noise sensitivity. The low ratings of item 7 may be due to the fact that people are more concerned about the content of leisure activities than about the environmental noise.

Both items seem to provide only marginal information about the individuals' noise sensitivity during leisure. It could be assumed that replacing these items would reduce the variability of the facet item. Another solution is to increase the number of items, although 12 items would be necessary to reach a relative G-coefficient of 0.7. Based on the Spearman-Brown formula, 15 items would be required for an absolute decision. [31] Considering the comparability of the scales and economic reasons, it seems more profitable to replace these items with others of more suitable content.

The high amount of residual variance within the subscale habitation could be due to the P*I interaction, which means that every individual evaluates each item differently, or due to unsystematic disturbances that contribute to the measurement error; however, these two sources cannot be separated from each other. To raise the reliability of this subscale to the level required for orienting measurements, 2 to 3 additional items would be required. Although this number of additional items appears to be acceptable, it would increase the weight of this scale in the global noise sensitivity score, which is unfavorable. Hence, a revision of the entire scale should be considered in further analyses.

According to van de Vijver, [36] ANOVA-based approaches are common techniques for analyzing cross-cultural data, which supports the choice of analysis made here. One of the major contributions of the approach used in this study is its focus on the multiple sources of measurement error. [33]

Although the instrument appears to allow precise measurements of noise sensitivity, its validity still has to be confirmed; so far, this has been accomplished for the scales work and habitation of the German version. [11] Noise sensitivity as a stable personality trait should not be related to current states, as shown by Leue et al., [37] who analyzed the relationship between NoiSeQ and a questionnaire (Mehrdimensionaler Befindlichkeitsfragebogen (MDBF)) developed for the assessment of the current mood of the individual. [38] Further examinations are therefore required to verify the validity of NoiSeQ in cross-cultural contexts.


In the present study, the reliability of NoiSeQ in a cross-national design was analyzed on the basis of G-theory. Applying G-theory instead of classical test theory provides information about the possible facets contributing to measurement error. Since the potential differences between the translated versions had to be determined, calculating reliability coefficients such as Cronbach's alpha would not have been informative. [19] The estimated variance components showed that there were no systematic differences between the translations. Moreover, the calculated G-coefficients support the assumption that individuals can be classified precisely according to ISO 10075-3:2004 if the global noise sensitivity is taken into consideration. The three subscales work, sleep, and communication also allow reliable classifications. It is assumed that NoiSeQ will help in further analyses that contribute to the clarification of unexplained variance concerning noise annoyance.


The authors' gratitude is due to Ian Flindell, Giorgio Irato, Karl Janssens, Catherine Lavandier, Ferenc Marki, and Shafiquzzaman Khan who supported the translation and application of NoiSeQ in the participating countries.


1. Guski R. Personal and social variables as co-determinants of noise annoyance. Noise Health 1999;3:45-56.
2. Miedema HM, Vos H. Noise sensitivity and reactions to noise and other environmental conditions. J Acoust Soc Am 2003;113:1492-504.
3. Griefahn B, Di Nisi J. Mood and cardiovascular functions during noise, related to sensitivity, type of noise and sound pressure level. J Sound Vibration 1992;155:111-23.
4. Ouis D. Annoyance caused by exposure to road traffic noise: An update. Noise Health 2002;4:69-79.
5. Weinstein N. Individual differences in reactions to noise: A longitudinal study in a college dormitory. J Appl Psychol 1978;63:458-66.
6. Stansfeld SA. Noise, noise sensitivity and psychological studies. vol. 22. In: Psychol Med Monograph Suppl, Cambridge: Cambridge U.P; 1992.
7. Job RF. Community response to noise: A review of factors influencing the relationship between noise exposure and reaction. J Acoust Soc Am 1988;83:991-1001.
8. Job RF. Noise sensitivity as a factor influencing human reactions to noise. Noise Health 1999;3:57-68.
9. Zimmer K, Ellermeier W. Konstruktion und Evaluation eines Fragebogens zur Erfassung der individuellen Lärmempfindlichkeit. Diagnostica 1998;44:11-20.
10. Schütte M, Marks A. Entwicklung des Dortmunder Lärmempfindlichkeits-Fragebogens (DoLe). In: Gesellschaft für Arbeitswissenschaft (Ed.), 50. Kongress der Gesellschaft für Arbeitswissenschaft. GfA Press: Dortmund; 2004. p. 387-90.
11. Schütte M, Marks A, Wenning E, Griefahn B. The development of the noise sensitivity questionnaire. Noise Health 2007;9:15-24.
12. Ekehammar B, Dornic S. Weinstein's noise sensitivity scale: Reliability and construct validity. Percept Mot Skills 1990;70:129-30.
13. Dornic S, Ekehammar B. Extraversion, neuroticism and noise sensitivity. Personality Ind Diff 1990;11:989-92.
14. Zimmer K, Ellermeier W. Eine deutsche Version der Lärmempfindlichkeitsskala von Weinstein. Zeitschrift für Lärmbekämpfung 1997;44:107-10.
15. Belojevic G, Jakovljevic B. Subjective reactions to traffic noise with regard to some personality traits. Environ Int 1997;23:221-6.
16. Belojevic G, Jakovljevic B, Slepcevic V. Noise and mental performance: Personality attributes and noise sensitivity. Noise Health 2003;6:77-89.
17. Västfjäll D. Influences of current mood and noise sensitivity on judgments of noise annoyance. J Psychol 2002;136:357-70.
18. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika 1951;16:297-334.
19. Cronbach LJ, Shavelson RJ. My current thoughts on coefficient alpha and successor procedures. Educ Psychol Measurement 2004;64:391-418.
20. Zimmer K, Ellermeier W. Psychometric properties of four measures of noise sensitivity: A comparison. J Environ Psychol 1999;19:295-302.
21. Van Kamp I, Job RF, Hatfield J, Haines M, Stellato RK, Stansfeld SA. The role of noise sensitivity in the noise-response relation: A comparison of three international airport studies. J Acoust Soc Am 2004;116:3471-9.
22. Fields JM, de Jong RG, Gjestland T, Flindell IH, Job RF, Kurra S, et al. Standardized general-purpose noise reaction questions for community noise surveys: Research and a recommendation. J Sound Vibration 2001;242:641-79.
23. Preis A, Kaczmarek T, Wojciechowska H, Zera J, Fields JM. Polish version of standardized noise reaction questions for community noise surveys. Int J Occup Med Environ Health 2003;16:155-9.
24. Yano T, Ma H. Standardized noise annoyance scales in Chinese, Korean and Vietnamese. J Sound Vibration 2004;277:583-8.
25. ISO/TS 15666. Acoustics - Assessment of noise annoyance by means of social and socio-acoustic surveys. International Organization for Standardization: Geneva; 2003.
26. Harkness J. Questionnaire translation. In: Harkness J, van de Vijver F, Mohler P, editors, Cross-cultural survey methods. John Wiley and Sons Inc: New Jersey; 2003.
27. Harkness JA, Schoua-Glusberg A. Questionnaires in translation. In: Harkness JA, editor. ZUMA-Nachrichten Spezial Vol. 3, ZUMA: Mannheim; 1998. p. 87-124.
28. Cronbach LJ, Gleser GC, Nanda H, Rajaratnam N. The dependability of behavioral measurements: Theory of generalizability for scores and profiles. John Wiley and Sons: New York; 1972.
29. Webb NM, Shavelson RJ. Generalizability theory. In: Encyclopaedia of statistical sciences, update. Wiley: New York; 1999. p. 258-66.
30. Brennan RL. Generalizability theory. Springer: New York; 2001.
31. Shavelson RJ, Webb N. Generalizability theory - A Primer. Sage Publications: California; 1991.
32. Di Nocera F, Ferlazzo F, Borghi V. G Theory and the reliability of psychophysiological measures: A tutorial. Psychophysiology 2001;38:796-806.
33. Shavelson RJ, Webb N. Generalizability theory: 1973 - 1980. Br J Mathematical Statistical Psychol 1981;34:133-66.
34. ISO 10075-3. Ergonomic principles related to mental workload - Part 3: Principles and requirements concerning methods for measuring and assessing mental workload. International Organization for Standardization: Geneva; 2004.
35. Van de Vijver FJ. Bias and equivalence: Cross-cultural perspectives. In: Harkness JA, Van de Vijver FJ, Mohler PM, editors. Cross-cultural survey methods. John Wiley and Sons: New Jersey; 2003a. p. 143-56.
36. Van de Vijver FJ. Bias and substantive analyses. In: Harkness JA, Van de Vijver FJ, Mohler PM, editors. Cross-cultural survey methods. John Wiley and Sons: New Jersey; 2003b. p. 207-33.
37. Leue E, Schütte M, Griefahn B. Individuelle Lärmempfindlichkeit und momentane psychische Befindlichkeit. In: Gesellschaft für Arbeitswissenschaft. 51. Kongress der Gesellschaft für Arbeitswissenschaft. GfA Press: Dortmund; 2005. p. 621-4.
38. Steyer R, Schwenkmezger P, Notz P, Eid M. Der Mehrdimensionale Befindlichkeitsfragebogen. Göttingen: Hogrefe; 1997.