Geriatric Depression Scale (GDS)

The Geriatric Depression Scale was developed by Yesavage, Brink, Rose, Lum, Huang, Adey, and Leirer (1982). It was designed specifically for the aged, as a screening instrument for depression. The scale

  • Originally contained 100 items, but was condensed to 30
  • Is a self-administered test, but can also be used in observer-administered formats
  • Consists of yes/no questions for all 30 items

Later, Sheikh & Yesavage (1986) created a short form of the GDS (GDS-SF), which contains 15 items. The original can be referred to as the GDS Long Form (GDS-LF). The literature is divided on whether the short form is a suitable substitute (Aikman & Oehlert, 2001; Holroyd & Clayton, 2000).

Cut-off scores for different severities of depression are as follows:

For the long form: Normal 0 – 10, Mild 11 – 20, Moderate to Severe 21 – 30.

For the short form: Normal 0 – 4, Mild 5 – 9, Moderate to Severe 10 – 15.
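The cut-offs above can be expressed as a small lookup. The following is a minimal sketch, assuming a total score has already been computed; the function name `gds_severity` and the `form` parameter are illustrative and not part of the published scale.

```python
def gds_severity(total_score, form="long"):
    """Map a GDS total score to the severity band quoted in the text.

    `form` is "long" (30 items) or "short" (15 items). Hypothetical helper
    for illustration only; it is a screening aid, not a diagnosis.
    """
    # (inclusive upper bound, label) pairs taken from the cut-offs above
    bands = {
        "long": [(10, "Normal"), (20, "Mild"), (30, "Moderate to Severe")],
        "short": [(4, "Normal"), (9, "Mild"), (15, "Moderate to Severe")],
    }
    if form not in bands:
        raise ValueError("form must be 'long' or 'short'")
    max_score = bands[form][-1][0]
    if not 0 <= total_score <= max_score:
        raise ValueError(f"score must be between 0 and {max_score}")
    for upper_bound, label in bands[form]:
        if total_score <= upper_bound:
            return label


print(gds_severity(12))           # long form
print(gds_severity(3, "short"))   # short form
```

Keeping the bands in one table makes it easy to see that the two forms differ only in their boundaries.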

Validation and Psychometric properties

The scale has a high degree of internal consistency, with a Cronbach’s alpha coefficient of .94 and a split-half reliability of .94. Test-retest reliabilities of .85 (p < .001) at one week (Yesavage et al., 1982) and .85 (p < .001) at one month (Parmelee, Lawton & Katz, 1989) show that, within these time limits, scores reflect stable individual differences.

The GDS is a valid tool for discriminating symptom severity, and presence vs absence based on DSM-IV criteria, but not among different diagnostic groups. It should not be used as a single diagnostic measure (Watson, Zimmerman, Cohen, & Dominik, 2009).

The GDS has high correlations with the Zung Self-Rating Depression Scale (SDS) and the Hamilton Rating Scale for Depression (HRS-D) (.84 and .83 respectively), which provides further evidence of validity.

Sensitivity (the true positive rate) and specificity (the true negative rate) were 84% and 95% respectively at a cutoff of 11, and 80% and 100% respectively at a cutoff of 14. This provides evidence for scores of 11+ to be considered a possible indicator of depression.
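For readers less familiar with these two rates, the sketch below shows how they are conventionally computed from the four cells of a screening-versus-diagnosis table. The function name and argument names are illustrative; no counts from the cited studies are reproduced here.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from confusion-table counts.

    sensitivity = proportion of truly depressed people who screen positive
    specificity = proportion of non-depressed people who screen negative
    """
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("each group must contain at least one person")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity
```

Raising the cutoff generally trades sensitivity for specificity, which matches the pattern reported above for cutoffs of 11 and 14.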

Validity and reliability are unaffected by pertinent individual difference factors such as age, education, gender, race, and culture (Marc, Raue & Bruce, 2008; Rait et al., 1999; Harralson et al., 2002).

Critical Analysis

Overall, the GDS-LF is a reliable and valid measure of depression in aged individuals. It is…

  • Easy to administer (self-administered or observer)
  • A simple scale to complete (yes/no responses), especially for older adults
  • Useful in a variety of settings: nursing homes and the community, with medical inpatients, medical outpatients, and day-treatment clients
  • Shown to maintain its reliability and validity when administered by phone (Burke, Roccaforte, Wengel, Conley & Potter, 1995)
  • Adequate in screening mildly demented subjects (McGivney, Mulvihill & Taylor, 1994)

Its few weaknesses include the possibility of over-diagnosing depression (Lesher & Berryhill, 1994), the inclusion of items/terms that could be seen as Western value judgments (Sansoni et al., 2007), and the finding that it is not a useful or valid tool for screening cognitively impaired patients (Holroyd & Clayton, 2000).

It is a useful screening tool for depression in older adults.


Original Source, and Various Translations:

Online Version:


Aikman, G. G., & Oehlert, M. E. (2001). Geriatric Depression Scale: long form versus short form. Clinical Gerontologist, 22(3-4), 63-70.

Burke, W. J., Roccaforte, W. H., Wengel, S. P., Conley, D. M., & Potter, J. F. (1995). The reliability and validity of the Geriatric Depression Rating Scale administered by telephone. Journal of the American Geriatrics Society, 43(6), 674-679.

Harralson, T. L., White, T. M., Regenberg, A. C., Kallan, M. J., Ten Have, T., Parmelee, P. A., & Johnson, J. C. (2002). Similarities and differences in depression among black and white nursing home residents. The American Journal of Geriatric Psychiatry, 10(2), 175-184.

Holroyd & Clayton (2000). Measuring Depression in the Elderly: Which Scale is Best? Medscape. Retrieved 20/08/2017 from

Lesher, E. L., & Berryhill, J. S. (1994). Validation of the geriatric depression scale‐short form among inpatients. Journal of clinical psychology, 50(2), 256-260.

Marc, L. G., Raue, P. J., & Bruce, M. L. (2008). Screening performance of the 15-item geriatric depression scale in a diverse elderly home care population. The American Journal of Geriatric Psychiatry, 16(11), 914-921.

McGivney, S. A., Mulvihill, M., & Taylor, B. (1994). Validating the GDS depression screen in the nursing home. Journal of the American Geriatrics Society, 42(5), 490-492.

Parmelee, P. A., Lawton, M. P., & Katz, I. R. (1989). Psychometric properties of the Geriatric Depression Scale among the institutionalized aged. Psychological Assessment: A Journal of Consulting and Clinical Psychology, 1(4), 331.

Rait, G., Burns, A., Baldwin, R., Morley, M., Chew-Graham, C., St Leger, A. S., & Abas, M. (1999). Screening for depression in African-Caribbean elders. Family Practice, 16(6), 591-595.

Sheikh, J.I., & Yesavage, J.A. (1986). Geriatric Depression Scale (GDS): Recent evidence and development of a shorter version. Clinical Gerontologist, 5, 165-173.

Watson, L. C., Zimmerman, S., Cohen, L. W., & Dominik, R. (2009). Practical depression screening in residential care/assisted living: five methods compared with gold standard diagnoses. The American Journal of Geriatric Psychiatry, 17(7), 556-564.

Yesavage, J. A., Brink, T. L., Rose, T. L., Lum, O., Huang, V., Adey, M., & Leirer, V. O. (1982). Development and validation of a geriatric depression screening scale: a preliminary report. Journal of psychiatric research, 17(1), 37-49.

General Behavior Inventory (GBI)

The General Behavior Inventory (GBI), first developed by Depue et al. (1981), was designed to identify the presence and severity of depressive and manic/hypomanic symptoms, as well as to assess for cyclothymia in adults. In their attempts to explore predisposition to bipolar disorder, the authors created a behavioural paradigm to identify persons at risk. Though intended for use in an adult population, a slightly modified version of the GBI has demonstrated potential as a parent-report measure of mood symptomatology amongst children and adolescents (Youngstrom, Findling, Danielson, & Calabrese, 2001). In addition, a short version has been developed via factor analysis that allows for it to be a screening tool in both adult and adolescent populations (Youngstrom, Murray, Johnson, & Findling, 2016).

The original self-report includes three dimensions, or subscales, that comprise 73 items on which respondents use a 4-point Likert-type scale (0 = never or hardly ever; 3 = very often/almost constantly) to indicate the frequency with which they have experienced a behaviour over the past year. The Depression scale sums 45 of the items whilst the Hypomanic/Biphasic scales combined sum 28 items. Questions include: “Have you become sad, depressed, or irritable for several days or more without really understanding why?” and “Has your mood or energy shifted rapidly back and forth from happy to sad or high to low?” As suggested by Depue, Krauss, and Spoont (1987), the items may be scored using a dichotomous model. This involves dividing the population into cases and non-cases, where those individuals responding 0 or 1 to an item receive 0 points and those responding 2 or 3 to an item receive 1 point. The scale may also be scored in the traditional Likert fashion, where the responses are simply summed. Whilst higher scores reflect increased psychopathology, it is important to note that the GBI is not a diagnostic tool. Research has indicated that the scales can discriminate between bipolar and disruptive behaviour disorders, unipolar and bipolar depression, and mood and disruptive behaviour disorders or no diagnosis (Danielson, Youngstrom, Findling, & Calabrese, 2003).
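The two scoring approaches described above can be sketched as follows. This is a minimal illustration, assuming the responses for one subscale are supplied as a list of integers from 0 to 3; the function name `score_gbi` and its `dichotomous` flag are hypothetical and do not come from the GBI materials.

```python
def score_gbi(responses, dichotomous=False):
    """Score one GBI subscale from a list of item responses (0-3).

    dichotomous=False: traditional Likert scoring, responses are summed.
    dichotomous=True:  responses of 0 or 1 score 0 points and responses
                       of 2 or 3 score 1 point, then points are summed.
    """
    for response in responses:
        if response not in (0, 1, 2, 3):
            raise ValueError("each response must be 0, 1, 2 or 3")
    if dichotomous:
        return sum(1 for response in responses if response >= 2)
    return sum(responses)


answers = [0, 1, 2, 3]                      # four made-up item responses
print(score_gbi(answers))                   # Likert total
print(score_gbi(answers, dichotomous=True)) # dichotomous total
```

Either total is only a screening indicator; as noted above, the GBI is not a diagnostic tool.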

The GBI has strong psychometric properties. In a recent evaluation study, it demonstrated excellent internal consistency (Cronbach’s α over .93 for both subscales; Pendergast et al., 2014). Results from the original validation study suggest the tool has good test-retest reliability (r = .73 over 15 weeks), excellent content validity, excellent construct validity, and excellent discriminative validity (Depue et al., 1981). More recent studies have found the GBI to have excellent discriminant validity (Youngstrom, Genzlinger, Egerton, & Van Meter, 2015) and good treatment sensitivity (Youngstrom et al., 2013).

Evidence has shown that gender differences have not compromised the overall psychometric properties of the GBI (Depue & Klein, 1988). However, Chmielewski and colleagues (1995) compared GBI data for African American, Asian American, Caucasian, and Latino samples, and discovered significant cultural differences: Caucasians scored lower than all other groups. Two decades later, in a combined Caucasian and African American sample, Pendergast et al. (2015) nevertheless found that GBI scores were largely invariant across racial groups.

Free access to the GBI:

Chmielewski, P. M., Fernandes, L. O., Yee, C. M., & Miller, G. A. (1995). Ethnicity and gender in scales of psychosis proneness and mood disorders. Journal of Abnormal Psychology, 104(3), 464-470.

Danielson, C. K., Youngstrom, E. A., Findling, R. L., & Calabrese, J. R. (2003). Discriminative validity of the General Behavior Inventory using youth report. Journal of Abnormal Child Psychology, 31(1), 29-39.

Depue, R. A., & Klein, D. N. (1988). Identification of unipolar and bipolar affective conditions in nonclinical and clinical populations by the General Behavior Inventory. In D. L. Dunner, E. S. Gershon, & J. E. Barrett (Eds.), Relatives at risk for mental disorders (pp. 179- 202). New York: Raven Press.

Depue, R. A., Krauss, S., & Spoont, M. R. (1987). A two-dimensional threshold model of seasonal bipolar affective disorder. In D. Magnusson & A. Ohman (Eds.), Psychopathology: An interactional perspective (pp. 95-123). New York: Academic Press.

Depue, R. A., Slater, J. F., Wolfstetter-Kausch, H., Klein, D. N., Goplerud, E., & Farr, D. A. (1981). A behavioral paradigm for identifying persons at risk for bipolar depressive disorder: A conceptual framework and five validation studies. Journal of Abnormal Psychology, 90, 381-437.

Pendergast, L. L., Youngstrom, E. A., Brown, C., Jensen, D., Abramson, L. Y., & Alloy, L. B. (2015). Structural invariance of General Behavior Inventory (GBI) scores in Black and White young adults. Psychological Assessment, 27(1), 21-30.

Pendergast, L. L., Youngstrom, E. A., Merkitch, K. G., Moore, K. A., Black, C. L., Abramson, L. Y., & Alloy, L. B. (2014). Differentiating bipolar disorder from unipolar depression and ADHD: The utility of the General Behavior Inventory. Psychological Assessment, 26(1), 195-206.

Youngstrom, E. A., Findling, R. L., Danielson, C. K., & Calabrese, J. R. (2001). Discriminative validity of parent report of hypomanic and depressive symptoms on the General Behavior Inventory. Psychological Assessment, 13(2), 267-276.

Youngstrom, E. A., Genzlinger, J. E., Egerton, G. A., & Van Meter, A. R. (2015). Multivariate meta-analysis of the discriminative validity of caregiver, youth, and teacher rating scales for pediatric bipolar disorder: Mother knows best about mania. Archives of Scientific Psychology, 3(1), 112-137.

Youngstrom, E. A., Murray, G., Johnson, S. L., & Findling, R. L. (2016). The 7 Up 7 Down Inventory: A 14-item measure of manic and depressive tendencies carved from the General Behavior Inventory. Psychological Assessment, 25(4), 1377-1383.

Youngstrom, E. A., Zhao, J., Mankoski, R., Forbes, R. A., Marcus, R. M., Carson, W., … Findling, R. L. (2013). Clinical significance of treatment effects with aripiprazole versus placebo in a study of manic or mixed episodes associated with pediatric bipolar I disorder. Journal of Child and Adolescent Psychopharmacology, 23(2), 72-79.

Calgary Depression Scale for Schizophrenia (CDSS)

Depression is reported to be prevalent in 7–75% of patients with schizophrenia, with an average of 25% (Kim et al., 2006; Müller et al., 2005). During the late 1980s, depression in schizophrenia generated substantial research attention because of its importance in the diagnosis, treatment and long-term outcomes of the disorder. Scales for assessing depression in non-psychotic populations have been criticised as inappropriate for assessing depression in individuals with schizophrenia.

The Calgary Depression Scale for Schizophrenia (CDSS) is a nine-item structured interview scale that was designed in 1990 specifically to assess depression independently of symptoms of psychosis in schizophrenia. Originally an 11-item scale (Addington, Addington, & Schissel, 1990), the CDSS was developed from, and validated against, the Hamilton Depression Rating Scale (HDRS), Beck Depression Inventory (BDI), and the Brief Psychiatric Rating Scale (BPRS) using factor analysis, internal consistency, and face validity (Addington, Addington, Maticka-Tyndale, & Joyce, 1992; Addington et al., 1990).

The CDSS consists of eight structured questions and a ninth item that depends on observation over the course of the interview (Kim et al., 2006). Items were constructed to measure: 1. Depression; 2. Hopelessness; 3. Self-deprecation; 4. Guilty ideas; 5. Pathological guilt; 6. Morning depression; 7. Early wakening; 8. Suicidal ideation; and 9. Observed depression.

Items are graded on a 4-point Likert-type scale (0, absent; 1, mild; 2, moderate; 3, severe), anchored by descriptors (Addington et al., 1992). Point scores of all nine items are summed to obtain the CDSS depression score. A score higher than 6 has an 82% specificity and 85% sensitivity for predicting the presence of a major depressive episode.
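The summing rule and the cut-off described above can be sketched as follows. This is an illustrative outline only, assuming the rater's nine item grades are supplied as a list; the function names are hypothetical and the CDSS itself is intended for use by an experienced rater.

```python
def cdss_total(item_scores):
    """Sum the nine CDSS item grades (each 0 = absent to 3 = severe)."""
    if len(item_scores) != 9:
        raise ValueError("the CDSS has nine items")
    for score in item_scores:
        if score not in (0, 1, 2, 3):
            raise ValueError("each item is graded 0, 1, 2 or 3")
    return sum(item_scores)


def cdss_above_cutoff(item_scores, cutoff=6):
    """True when the total is higher than the cut-off of 6 quoted above."""
    return cdss_total(item_scores) > cutoff


ratings = [1, 1, 1, 1, 1, 1, 1, 0, 0]   # made-up example ratings
print(cdss_total(ratings), cdss_above_cutoff(ratings))
```

Note that the comparison is strict: a total of exactly 6 does not exceed the cut-off.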

Psychometric properties

  • Reliable, valid and specific measure of depression in patients with schizophrenia. Measures depression separately from negative and extrapyramidal symptoms. Low correlation with positive and negative symptoms and no substantial correlation with extrapyramidal symptoms
  • High internal consistency: α = 0.76 – 0.86
  • Good internal and inter-rater reliability
  • High validity: 1. Ability to predict presence of MDD; 2. Correlation with other depression measures; 3. Confirmatory factor analysis
  • Strong construct validity: Single dimension being measured. Confirmed by correlations with other depression rating scales and prediction of major depressive episode
  • Divergent validity: Absence of correlations with positive, negative and extrapyramidal symptoms


  • Used in clinical populations of patients with depression in schizophrenia (DSM-III-R, DSM-IV)
  • Focused on maximising internal and external validity across inpatients and outpatients
  • Has been translated into 40 languages. Validated in: Arabic, Spanish, German, Chinese, Thai, Brazilian, Greek, French


  • Quick to administer
  • Sensitive to change, so can be used at both the acute and residual stages of schizophrenia
  • Superior to the Hamilton Depression Rating Scale (HDRS) and Montgomery-Asberg Scale for differentiating between depression and negative and positive symptoms. All items significantly discriminate between the presence and absence of a major depressive episode
  • Most specific and valid assessment of depression in schizophrenia


  • Scale is designed for use by an experienced rater. It is not intended for self assessment



Addington, D., Addington, J., & Maticka-Tyndale, E. (1991). Reliability and validity of a depression scale for schizophrenics. Schizophrenia Research, 4(3), 247.

Addington, D., Addington, J., & Maticka-Tyndale, E. (1994). Specificity of the Calgary Depression Scale for schizophrenics. Schizophrenia Research, 11(3), 239-244.

Addington, D., Addington, J., Maticka-Tyndale, E., & Joyce, J. (1992). Reliability and validity of a depression rating scale for schizophrenics. Schizophrenia Research, 6(3), 201-208.

Addington, D., Addington, J., & Schissel, B. (1990). A depression rating scale for schizophrenics. Schizophrenia Research, 3(4), 247-251.

Addington, J., Shah, H., Liu, L., & Addington, D. (2014). Reliability and validity of the Calgary Depression Scale for Schizophrenia (CDSS) in youth at clinical high risk for psychosis. Schizophrenia Research, 153(1), 64-67.

Galletly, C., Castle, D., Dark, F., Humberstone, V., Jablensky, A., Killackey, E., Kulkarni, J., McGorry, P., Nielssen, O., Tran, N. (2016). Royal Australian and New Zealand College of Psychiatrists clinical practice guidelines for the management of schizophrenia and related disorders. Australian & New Zealand Journal of Psychiatry, 50(5), 410-472. doi:10.1177/0004867416641195

Kim, S.-W., Kim, S.-J., Yoon, B.-H., Kim, J.-M., Shin, I.-S., Hwang, M. Y., & Yoon, J.-S. (2006). Diagnostic validity of assessment scales for depression in patients with schizophrenia. Psychiatry Research, 144(1), 57-63.

Lançon, C., Auquier, P., Reine, G., Bernard, D., & Toumi, M. (2000). Study of the concurrent validity of the Calgary Depression Scale for Schizophrenics (CDSS). Journal of Affective Disorders, 58(2), 107-115.

Müller, M. J., Brening, H., Gensch, C., Klinga, J., Kienzle, B., & Müller, K.-M. (2005). The Calgary Depression Rating Scale for schizophrenia in a healthy control group: Psychometric properties and reference values. Journal of Affective Disorders, 88(1), 69-74.

Mood Disorder Questionnaire (MDQ)



The Mood Disorder Questionnaire (MDQ) was created by Hirschfeld and colleagues (2000) to address the need for accurately screening individuals with a bipolar spectrum disorder. Accurate identification of bipolar disorder (BD) is of concern because it is often unrecognised or inaccurately diagnosed, which delays diagnosis and appropriate treatment (Lish et al., 1994). Items on the MDQ are derived from the DSM-IV criteria and from clinical experience (Hirschfeld et al., 2000).

Clinical Use

The MDQ is a self-report questionnaire that takes around five minutes to complete. It is to be used only as a screening tool and not for diagnostic purposes, and a comprehensive evaluation should follow a positive screen outcome.

Administration and Scoring

The MDQ consists of 3 questions. Question 1 contains 13 items that examine manic symptoms. Question 2 enquires whether the symptoms identified have co-occurred, and question 3 enquires about the severity of the symptoms. To screen positive, the individual must have answered ‘yes’ to a minimum of 7 items on question 1, responded ‘yes’ to question 2, and answered ‘moderate problem’ or ‘serious problem’ to question 3.
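The positive-screen rule above combines three conditions, which can be sketched as a single check. This is a minimal illustration, assuming question 1 answers are given as 13 booleans and question 3 as one of the response labels quoted in the text; the function and parameter names are hypothetical.

```python
def mdq_screen_positive(symptom_answers, co_occurred, problem_level):
    """Apply the MDQ positive-screen rule described in the text.

    symptom_answers: 13 booleans, True for each 'yes' on question 1
    co_occurred:     True if question 2 was answered 'yes'
    problem_level:   question 3 answer, e.g. 'moderate problem'
    """
    if len(symptom_answers) != 13:
        raise ValueError("question 1 has 13 items")
    yes_count = sum(1 for answer in symptom_answers if answer)
    severe_enough = problem_level in {"moderate problem", "serious problem"}
    return yes_count >= 7 and bool(co_occurred) and severe_enough


answers = [True] * 7 + [False] * 6      # made-up responses
print(mdq_screen_positive(answers, True, "moderate problem"))
```

All three conditions must hold, so a respondent endorsing many symptoms that never co-occurred would not screen positive under this rule.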

Development and Psychometric Properties

The MDQ has achieved adequate internal consistency, with Cronbach’s alphas of 0.79 and 0.90 (Hirschfeld et al., 2000; Isometsä et al., 2003). The validation study administered the MDQ to patients at five psychiatric clinics in the United States (Hirschfeld et al., 2000). The results were used to determine cut-off points for items, specificity, and sensitivity. Findings demonstrated that the MDQ had a sensitivity of 0.73 and a specificity of 0.90 when contrasted against other screening questionnaires in psychiatric settings. The researchers then conducted testing in a general population, which identified a sensitivity of 0.28 and a specificity of 0.97 (Hirschfeld, 2002). An additional study assessed the effectiveness of the MDQ in unipolar and bipolar depressive patients and found a sensitivity of 0.58 (with higher sensitivity for bipolar I) and a specificity of 0.67 (Miller, Klugman, Berv, Rosenquist, & Ghaemi, 2004). Lastly, testing in a primary care setting revealed a sensitivity of 0.58 and a specificity of 0.93 (Hirschfeld, Cass, Holt, & Carlson, 2005).

In sum, the MDQ is a useful screening tool for BD, demonstrating validity in clinical settings and across cultures. However, consideration should be given to its higher sensitivity for detecting BD type 1 compared to other disorders on the bipolar spectrum, and to its low sensitivity in general populations. Additionally, the use of differing item cutoff points in scoring (e.g., standard or modified cutoff value of 7 for question 1) and of differing inclusion/exclusion criteria (e.g., a more tightly defined BD criterion includes more severe cases and increases sensitivity) has produced variability in sensitivity and specificity, thus limiting its overall effectiveness (Wang et al., 2015).


Hirschfeld, R. M., Williams, J. B., Spitzer, R. L., Calabrese, J. R., Flynn, L., Keck Jr, P. E., … & Russell, J. M. (2000). Development and validation of a screening instrument for bipolar spectrum disorder: the Mood Disorder Questionnaire. American Journal of Psychiatry, 157, 1873-1875.

Hirschfeld, R. M. (2002). The Mood Disorder Questionnaire: a simple, patient-rated screening instrument for bipolar disorder. Primary Care Companion to the Journal of Clinical Psychiatry, 4, 9.

Miller, C. J., Klugman, J., Berv, D. A., Rosenquist, K. J., & Ghaemi, S. N. (2004). Sensitivity and specificity of the Mood Disorder Questionnaire for detecting bipolar disorder. Journal of Affective Disorders, 81, 167-171.

Hirschfeld, R. M., Cass, A. R., Holt, D. C., & Carlson, C. A. (2005). Screening for bipolar disorder in patients treated for depression in a family medicine clinic. The Journal of the American Board of Family Practice, 18, 233-239.

Isometsä, E., Suominen, K., Mantere, O., Valtonen, H., Leppämäki, S., Pippingsköld, M., & Arvilommi, P. (2003). The mood disorder questionnaire improves recognition of bipolar disorder in psychiatric care. BMC psychiatry, 3, 8.

Lish, J. D., Dime-Meenan, S., Whybrow, P. C., Price, R. A., & Hirschfeld, R. M. (1994). The National Depressive and Manic-depressive Association (DMDA) survey of bipolar members. Journal of Affective Disorders, 31, 281-294.

Major Depression Inventory (MDI)

The most commonly utilized measures of depression were created prior to the release of the Diagnostic and Statistical Manual of Mental Disorders III (DSM-III) in 1980. Therefore, items on these tests may not be optimal. Consequently, new tools such as the Major Depression Inventory (MDI) were formulated (Cuijpers et al., 2007). The MDI is a self-rated tool that has a dual function: it can be either a diagnostic instrument that aids in assessing the presence of DSM-IV major depression, or a measure of the degree of depression severity (Bech et al., 2015).
It was developed by Professor Per Bech and associates in collaboration with the Psychiatric Research Unit of the Danish World Health Organization Collaborative Centre for Mental Health (Konstantinidis et al., 2011; Psychiatric Times, 2013). It consists of 12 items when the two sub-items (a and b) of Items 8 and 10 are counted separately. All are scored on a frequency response scale ranging from “none of the time” (zero) to “all of the time” (five), and are answered in the context of the last 2 weeks. Functionally, it contains only 10 items, as only the higher score of a or b is counted for each of Items 8 and 10 (Bech et al., 2001; Bech et al., 2015; Konstantinidis et al., 2011).

Using the MDI as a measure of depression severity: the total score is calculated by adding together the 10 item scores, giving a range of 0-50. A score of 0-20 indicates that depression does not exist or its existence is doubtful, 21-25 indicates mild depression, 26-30 indicates moderate depression, and 31-50 indicates severe depression.
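The severity scoring described above can be sketched as follows. This is a minimal illustration, assuming responses are supplied as a dictionary keyed by item label, with sub-items labelled "8a", "8b", "10a" and "10b"; those key names and the function names are assumptions made for this example.

```python
MDI_KEYS = ("1", "2", "3", "4", "5", "6", "7", "8a", "8b", "9", "10a", "10b")


def mdi_total_score(responses):
    """Total MDI score (0-50) from a dict of 12 responses, each 0-5.

    Only the higher of sub-items a and b counts for Items 8 and 10.
    """
    if set(responses) != set(MDI_KEYS):
        raise ValueError("expected exactly these keys: " + ", ".join(MDI_KEYS))
    if any(value not in range(6) for value in responses.values()):
        raise ValueError("each response must be an integer from 0 to 5")
    single_items = ("1", "2", "3", "4", "5", "6", "7", "9")
    total = sum(responses[key] for key in single_items)
    total += max(responses["8a"], responses["8b"])
    total += max(responses["10a"], responses["10b"])
    return total


def mdi_severity(total):
    """Map a total score to the severity bands quoted in the text."""
    if not 0 <= total <= 50:
        raise ValueError("total must be between 0 and 50")
    if total <= 20:
        return "none or doubtful"
    if total <= 25:
        return "mild"
    if total <= 30:
        return "moderate"
    return "severe"


example = {key: 3 for key in MDI_KEYS}   # made-up responses
print(mdi_total_score(example), mdi_severity(mdi_total_score(example)))
```

Taking the maximum of each sub-item pair is what reduces the 12 responses to 10 scored items.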

Using the MDI as a diagnostic tool: in the algorithm for a DSM-IV diagnosis of major depression, Items 4 and 5 are combined and only the higher answer of the two is considered. The presence of at least 5 of 9 symptoms indicates a diagnosis of major depression, and Item 1 or 2 must be among the 5 or more symptoms. The clinical range incorporates Items 1 to 3 occurring most of the time or all of the time, and all other symptoms occurring either slightly more than half of the time, most of the time or all of the time. If 5 or more symptoms are in this range, a diagnosis of major depression is supported (Bech et al., 2015; Konstantinidis et al., 2011; Bech, 2011).
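The diagnostic algorithm above can be sketched as follows. This is a rough illustration under stated assumptions: responses are a dictionary keyed by item label ("8a", "8b", "10a", "10b" for the sub-items), "most of the time" is coded 4 and "slightly more than half of the time" is coded 3 on the 0-5 scale, and the function name is hypothetical. The text gives only the end points of the scale numerically, so the intermediate codes are an assumption.

```python
def mdi_major_depression_indicated(responses):
    """Apply the DSM-IV style MDI algorithm described in the text.

    responses: dict with keys "1"-"7", "8a", "8b", "9", "10a", "10b",
    each an integer 0-5 (assumed coding: 3 = slightly more than half
    of the time, 4 = most of the time, 5 = all of the time).
    """
    # Collapse to 9 symptoms: Items 4 and 5 are combined, and only the
    # higher sub-item counts for Items 8 and 10.
    symptoms = {
        "1": responses["1"],
        "2": responses["2"],
        "3": responses["3"],
        "4/5": max(responses["4"], responses["5"]),
        "6": responses["6"],
        "7": responses["7"],
        "8": max(responses["8a"], responses["8b"]),
        "9": responses["9"],
        "10": max(responses["10a"], responses["10b"]),
    }
    stricter_items = {"1", "2", "3"}   # need "most of the time" or more
    present = {
        name
        for name, score in symptoms.items()
        if score >= (4 if name in stricter_items else 3)
    }
    has_item_1_or_2 = bool(present & {"1", "2"})
    return len(present) >= 5 and has_item_1_or_2
```

The result only indicates that the questionnaire pattern is consistent with major depression; it does not replace clinical assessment.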

Psychometric Properties
Research findings suggest that the MDI possesses good reliability, validity, sensitivity and specificity (Cuijpers et al., 2007). Cuijpers and associates (2007) found that the test had good reliability, a substantial correlation with another measure of depressive symptoms, and acceptable specificity and sensitivity. Also, Forsell (2005) found that the MDI has high internal consistency. Furthermore, Olsen and colleagues (2003) found that the tool demonstrated adequate internal and external validity as a measure of depression severity.
In regards to differential diagnosis, the levels of sensitivity and specificity that the MDI has demonstrated across multiple studies indicate that it can identify both individuals who have depression and those who do not. Hence, this test may assist in the process of differential diagnosis (Cuijpers et al., 2007).

Strengths of the MDI include: being able to utilize it as a continuous scale indicating level of depression symptoms, and as a method of acquiring an indication of the existence of major depression, the fact that it appears to be a reliable tool for evaluating depression, and that it is brief in nature (Cuijpers et. al., 2007).

Weaknesses of the MDI include the fact that, whilst the sensitivity and specificity of the diagnostic algorithm have been found to be acceptable in clinical populations, they have been found to be low in general populations (Amris et al., 2016). Further research on the MDI is also needed. In addition, the tool was based on the DSM-IV, which has since been superseded by the DSM-5, so the tool may not be representative of the new DSM.

Some evidence exists to suggest the MDI is reliable and valid across many countries and cultures and across genders (Cuijpers et al., 2007; Olsen et al., 2003; Fountoulakis et al., 2003; Konstantinidis et al., 2011).


Amris, K., Omerovic, E., Danneskiold-Samsoe, B., Bliddal, H., & Waehrens, E. E. (2016). The validity of self-rating depression scales in patients with chronic widespread pain: a Rasch analysis of the Major Depression Inventory. Scandinavian Journal of Rheumatology, 45(3), 236-246. doi: 10.3109/03009742.2015.1067712

Bech, P., Timmerby, N., Martiny, K., Lunde, M., & Soendergaard, S. (2015). Psychometric evaluation of the Major Depression Inventory (MDI) as depression severity rating scale using the LEAD (Longitudinal Expert Assessment of All Data) as index of validity. BMC Psychiatry, 15(90), 1-7. doi: 10.1186/s12888-051-0529-3.

Bech, P., Rasmussen, M.A., Raabaek Olsen, L., Noreholm, V., & Abildgard, W. (2001). The sensitivity and specificity of the Major Depression Inventory, using the Present State Examination as the index of diagnostic validity. Journal of Affective Disorders, 66(2001), 159-164.

Cuijpers, P., Dekker, J., Noteboom, A., Smits, N., & Peen, J. (2007). Sensitivity and specificity of the Major Depression Inventory in outpatients. BMC Psychiatry, 7(39), 1-6. doi:10.1186/1471-244X-7-39

Forsell, Y. (2005). The Major Depression Inventory versus Schedules for Clinical Assessment in Neuropsychiatry in a population sample. Social Psychiatry, 2005(40), 209-213. doi:10.1007/z00127-005-0876-3

Fountoulakis, K.N., Iacovides, A., Kleanthous, S., Samolis, S., Gougoulias, Kaprinis, GS, & Bech, P. (2003). Reliability, validity and psychometric properties of the Greek translation of the Major Depression Inventory. BMC Psychiatry, 3(2), 1-8.

Konstantinidis, A., Martiny, K., Bech, P., & Kasper, S. (2011). A comparison of the Major Depression Inventory (MDI) and the Beck Depression Inventory (BDI) in severely depressed patients. International journal of psychiatry in clinical practice, 15(1), 56-61. doi: 10.3109/13651501.2010.507870

Olsen, L.R., Jensen, D.V., Noerholm, V., Martiy, K & Bech, P. (2003). The internal and external validity of the Major Depression Inventory in measuring severity of depressive states. Psychological Medicine, 2003(33), 251-356. doi: 10.1017/SOO33291702006724.

Psychiatric Times. (2013, April). MDI. Retrieved from

Centre for Epidemiological Studies Depression Scale for Children (CES-DC)


Formulated by Weissman, Orvaschel and Padian (1980) in the United States of America (USA), the Centre for Epidemiological Studies Depression Scale for Children (CES-DC) is one of the most widely used self-report inventories for the screening of depressive symptoms and the assessment of symptom improvement in children and adolescents aged 6-17 years (Essau et al., 2013). The 20-item questionnaire was derived from the Centre for Epidemiological Studies Depression Scale for Adults (CES-D; Radloff, 1977), with the items later modified to encourage use with youths (Essau et al., 2013; Faulstich, Carey, Ruggiero, Enyart, & Gresham, 1986; Weissman et al., 1980). For instance, the CES-D item “I felt like everything I did was an effort” was modified in the CES-DC to “I felt like I was too tired to do things” (Essau et al., 2013). Possible scores range from 0-60 and are calculated by summing scores from each question (Radloff, 1977; Weissman et al., 1980).

All items on the CES-DC are rated on a 4-point Likert scale in terms of their frequency of occurrence throughout the past week. A score of 0 indicates “not at all”; a score of 1 indicates “a little”; a score of 2 indicates “some”; and a score of 3 indicates “a lot” (Radloff, 1977; Weissman et al., 1980). To control for response bias, four items (questions 4, 8, 12 and 16) are worded positively and thus are scored in the reverse order when calculating total CES-DC scores: a score of 3 indicates “not at all”; a score of 2 indicates “a little”; a score of 1 indicates “some”; and a score of 0 indicates “a lot” (Radloff, 1977; Essau et al., 2013). Higher CES-DC scores correspond to greater levels of depressive symptoms (Radloff, 1977; Essau et al., 2013). Weissman et al. (1980) specify that a cut-off score of 15 is indicative of depressive symptomatology in children and adolescents. Youths who report scores of 15 or more may therefore be experiencing significant levels of depressive symptoms and should be followed up with further assessment (Weissman et al., 1980). Based on the practitioner’s clinical judgement, further assessment may also be necessary for youths who exhibit symptoms of depression but do not screen positive (Weissman et al., 1980).
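The reverse scoring and cut-off described above can be sketched as follows. This is a minimal illustration, assuming the raw responses are supplied in question order as integers from 0 (“not at all”) to 3 (“a lot”); the function names are hypothetical.

```python
REVERSE_SCORED_ITEMS = {4, 8, 12, 16}   # positively worded questions


def cesdc_total(raw_responses):
    """Total CES-DC score (0-60) from 20 raw responses coded 0-3."""
    if len(raw_responses) != 20:
        raise ValueError("the CES-DC has 20 items")
    total = 0
    for question_number, response in enumerate(raw_responses, start=1):
        if response not in (0, 1, 2, 3):
            raise ValueError("each response must be 0, 1, 2 or 3")
        if question_number in REVERSE_SCORED_ITEMS:
            total += 3 - response       # "not at all" scores 3, "a lot" scores 0
        else:
            total += response
    return total


def cesdc_screen_positive(raw_responses, cutoff=15):
    """True when the total meets the cut-off of 15 quoted in the text."""
    return cesdc_total(raw_responses) >= cutoff


answers = [1] * 20                      # made-up responses: "a little" throughout
print(cesdc_total(answers), cesdc_screen_positive(answers))
```

One consequence of reverse scoring is that answering “not at all” to every question still yields a non-zero total, because the four positively worded items each contribute 3 points.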

Researchers have implemented the CES-DC in both clinical and community settings and across a number of cultures, including the USA (Faulstich et al., 1986; Fendrich, Weissman, & Warner, 1990; Weissman et al., 1980), Iran (Essau et al., 2013), Germany (Barkmann, Erhart, & Schulte-Markwort, 2008; Bettge et al., 2008), Spain (Aguilar & Berganza, 1990), Sweden (Olsson & von Knorring, 1997) and China (Li, Chung & Ho, 2010). As research has identified discrepancies in the severity of depressive symptoms amongst children and adolescents residing in different countries, it is plausible that culture may play an important role (Essau et al., 2013). For instance, Iranian females tend to be viewed as more submissive given their living conditions, the way in which they are socialised and the nation’s male-dominated societal constructs, which may contribute to their higher reports of depressive symptoms (Essau et al., 2013). In line with this perspective, studies that have implemented the CES-DC have indicated that girls tend to report significantly higher levels of depressive symptoms and impairments in daily functioning than boys, particularly somatic complaints (Bettge et al., 2008; Essau et al., 2013; Fendrich, Weissman, & Warner, 1990; Olsson & von Knorring, 1997). These gender discrepancies may be attributed to psychological and social difficulties being more demanding for females, such as issues with body image and self-esteem (Essau et al., 2013). Moreover, Bettge and colleagues (2008) identified significantly greater levels of depressive symptoms in adolescents (12-17 years) than children (6-11 years). It is postulated that adolescents are potentially more sensitive to the description and perception of their depressive symptoms, even though such manifestations may not be intense enough to be classified as clinically significant (Bettge et al., 2008).

One of the major advantages of the CES-DC is that it is publicly accessible at no cost (Essau et al., 2013). In addition, the CES-DC is known for its good reliability (Essau et al., 2013). The internal consistency of the CES-DC has been documented to range from good to excellent, with Cronbach’s alphas spanning from .71 to .91 (Barkmann et al., 2008; Essau et al., 2013; Li et al., 2010). Similarly, the CES-DC has evidenced strong test-retest reliability in adolescent samples (12-18 years; Barkmann et al., 2008; Li et al., 2010), with coefficients ranging from .70 to .85. However, given that the CES-DC appears to measure state more than trait characteristics (Faulstich et al., 1986), test-retest reliability has the propensity to be negatively impacted. In this regard, Faulstich et al. (1986) found that test-retest reliability for the CES-DC was very poor for children. This may be attributed to the wording or format of the tool, which may surpass the comprehension level of young children (Faulstich et al., 1986). Collectively, when drawing conclusions from CES-DC data collected from clinical samples, results should be interpreted with caution.
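For readers unfamiliar with how an internal consistency coefficient is obtained, the following is a minimal sketch of the standard Cronbach’s alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the example data are hypothetical and are not drawn from any of the studies cited above.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in rows):
        raise ValueError("every respondent must answer the same number of items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: four respondents, two items that agree perfectly.
print(cronbach_alpha([[0, 0], [1, 1], [2, 2], [3, 3]]))
```

Items that rise and fall together across respondents push alpha towards 1, whereas unrelated items push it towards 0.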

A number of studies have also assessed the validity of the CES-DC. Researchers have examined the convergent validity of the CES-DC by correlating it with other inventories used to assess depressive symptoms in youths, such as the Children’s Depression Inventory (CDI; Doerfler, Felner, Rowlinson, Raley, & Evans, 1988) and the Beck Depression Inventory (BDI; Faulstich et al., 1986). Results have indicated significant correspondence between related concepts, suggesting that the CES-DC measures the same depressive constructs that the CDI and BDI assess (Doerfler et al., 1988; Faulstich et al., 1986). Additionally, Achenbach (1979) reported that the CES-DC correlated significantly with the Child Behaviour Checklist (CBCL), indicating that greater levels of depressive symptoms are related to greater levels of behavioural and emotional problems. Furthermore, Li et al. (2010) found a significant positive correlation between the Chinese version of the CES-DC and the State Anxiety Scale for Children (SACS; Li & Lopez, 2007), suggesting that children experiencing heightened anxiety symptoms also report greater depressive symptoms. With regard to discriminant validity, studies have shown that the CES-DC is able to distinguish between children and adolescents presenting with or without psychiatric diagnoses (Fendrich et al., 1990; Weissman et al., 1980).

Radloff (1977) identified four factors when designing the CES-D: depressed affect, positive affect, somatic activity and interpersonal functioning. Confirmatory factor analyses demonstrate that these factors have also been replicated in various studies implementing the CES-DC amongst child and adolescent samples in different countries (Bettge et al., 2008; Essau et al., 2013; Fendrich et al., 1990; Li et al., 2010; Olsson & von Knorring, 1997). However, as the CES-DC does not align with the Diagnostic and Statistical Manual of Mental Disorders (DSM) diagnostic criteria, it cannot be used as a diagnostic tool or for differential diagnoses encompassing depressive disorders (Faulstich et al., 1986). Similarly, the instrument is considered to be more sensitive to a subject’s emotional distress than to their depressive symptoms (Faulstich et al., 1986). Collectively, further investigation is warranted if aggregated CES-DC scores are determined to be representative of depressive symptomatology (Faulstich et al., 1986; Fendrich et al., 1990).

Depressive problems continue to be a common occurrence amongst children and adolescents (Bettge et al., 2008). Given that these problems frequently go undiagnosed, sufferers are delayed in accessing appropriate treatment (Essau et al., 2013). As such, early identification of clinically depressed children and adolescents is imperative so that suitable treatment can be administered, and much of this depends on psychometrically rigorous screening instruments (Essau et al., 2013). Accumulating evidence suggests that the CES-DC is a reliable and valid tool for assessing and monitoring depressive symptoms in youths. However, its results should not be interpreted as a clinical diagnosis but should instead signify the need for further evaluation. Given its simplicity of scoring and administration, its implementation within psychiatric domains should be examined in further studies.


Achenbach, T. (1979). The child behavior profile: an empirically based system for assessing children’s behavioral problems and competencies. International Journal of Mental Health, 7, 24-42.

Barkmann, C., Erhart, M., & Schulte-Markwort, M. (2008). The German version of the Centre for Epidemiological Studies Depression Scale for Children: psychometric evaluation in a population-based survey of 7 to 17 years old children and adolescents – results of the BELLA study. European Child and Adolescent Psychiatry, 17, 116-124.

Bettge, S., Wille, N., Barkmann, C., Schulte-Markwort, M., Ravens-Sieberer, U., & BELLA Study Group (2008). Depressive symptoms of children and adolescents in a German representative sample: Results of the BELLA study. European Child & Adolescent Psychiatry, 17, 71-81.

Doerfler, L.A., Felner, R.D., Rowlinson, R.T., Raley, P.A., & Evans, E. (1988). Depression in children and adolescents: a comparative analysis of the utility and construct validity of two assessment measures. Journal of Consulting and Clinical Psychology, 56, 769-772.

Essau, C.A., Olaya, B., Gholamreza, P., Gilvarry, C., & Bray, D. (2013). Depressive symptoms among young children and adolescents in Iran: A confirmatory factor analytic study of the centre for epidemiological studies depression scale for children. Child Psychiatry and Human Development, 44, 123-136. doi:10.1007/s10578-012-0314-1

Faulstich, M.E., Carey, M.P., Ruggiero, L., Enyart, P., & Gresham, F. (1986). Assessment of depression in childhood and adolescence: An evaluation of the center for epidemiological studies depression scale for children (CES-DC). American Journal of Psychiatry, 143(8), 1024-1027.

Fendrich, M., Weissman, M.M., & Warner, V. (1990). Screening for depressive disorder in children and adolescents: Validating the Centre for Epidemiological Studies Depression Scale for Children. American Journal of Epidemiology, 131, 538-551.

Li, H.C.W., Chung, O.K.J., & Ho, K.Y. (2010). Centre for epidemiologic studies depression scale for children: Psychometric testing of the Chinese version. Journal of Advanced Nursing, 66, 2582-2591.

Li, H.C.W., & Lopez, V. (2007). Development and validation of a short form of the Chinese version of the State Anxiety Scale for Children. International Journal of Nursing Studies, 44, 566-573.

Olsson, G., & von Knorring, A.L. (1997). Depression among Swedish adolescents measured by the self-rating scale Centre for Epidemiological Studies Depression Scale for Children (CES-DC). European Child and Adolescent Psychiatry, 6, 81-87.

Radloff, L.S. (1977). The CES-D scale: a self-report depression scale for research in the general population. Applied Psychological Measurement, 1, 385-401.

Weissman, M.M., Orvaschel, H., & Padian, N. (1980). Children’s symptom and social functioning self-report scales: Comparison of mothers’ and children’s reports. Journal of Nervous and Mental Disease, 168(12), 736-740.

Depression Anxiety Stress Scale (DASS)

The DASS was initially introduced as the DASS-42, a self-report questionnaire rated on a 4-point Likert scale that assesses the severity of depression, anxiety and stress. The DASS takes a dimensional rather than a categorical approach (Lovibond & Lovibond, 1995; Psychology Foundation of Australia, 2014).
It was originally developed at the University of New South Wales, Australia, using a sample of first-year psychology students. Shortly after, a revised version, the DASS-21, was developed to reduce administration and test-taker time. The DASS-21 has 7 items assigned to each of the depression, anxiety and stress subscales.

The DASS has been widely used in both clinical and non-clinical samples and has shown excellent reliability and validity in both. Many studies using factor analysis have confirmed that the items load on their intended subscales of depression, anxiety and stress. The DASS should not be used on its own to diagnose a person with depression, as it is important for the clinician to use their clinical judgement and expertise as well.

The tool has been translated for use in different cultures, including a variety of Asian cultures. However, because it was standardised and developed within a westernised framework, it has been suggested that its validity may be compromised when employed in collectivist cultures. This is because collectivist cultures’ perceptions of depression, stress and anxiety are somewhat different.

Lovibond, P.F., & Lovibond, S.H. (1995). The structure of negative emotional states: comparison of the depression anxiety stress scales (DASS) with the Beck depression and anxiety inventories. Behaviour Research and Therapy, 33(3).
Psychology Foundation of Australia. (2014). Depression anxiety stress scales (DASS). Retrieved from

Vanderbilt Assessment Scales (VAS)

The Vanderbilt Assessment Scales (parent/teacher) were created in 2002 by the American Academy of Pediatrics (AAP) and the National Initiative for Children’s Healthcare Quality (NICHQ) at the completion of a project that aimed to create and implement a model of care for children with ADHD. The VAS is a brief scale completed by parents and teachers that assesses the ADHD symptoms of inattention and hyperactivity, along with conduct disorder, oppositional defiant disorder, anxiety, depression and academic performance (Fields & Hale, 2011).
Psychometric Properties
The scale has good internal reliability, with Cronbach’s alpha coefficients of > .90 (parent) and > .89 (teacher) (Wolraich et al., 2002; Wolraich et al., 2013). Test-retest reliability was assessed as adequate (r > .80) (Bard et al., 2013). Inter-rater reliability between the parent and teacher scales is very low (r = .27 to .34) (Wolraich et al., 2002).
The four-factor structure of the scale supports it as a valid measure of inattention, hyperactivity, conduct disorder/oppositional defiant disorder and anxiety/depression. Convergent validity is evidenced by the moderate to high correlations with the Diagnostic Interview Schedule for Children-IV Parent Version (Bard et al., 2013; Collett et al., 2003).
The parent scale produced a sensitivity (true positive rate) of 80% and a specificity (true negative rate) of 75% when predicting a diagnosis of ADHD. However, when the parent and teacher scales were combined, the positive predictive value fell to 19% and the negative predictive value increased to 98%, suggesting that the combined scale is very good at identifying children who do not have ADHD (Bard et al., 2013).
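The relationship between these four indices can be illustrated with a short sketch. The counts below are entirely hypothetical (1000 children, 50 of whom have ADHD) and are not taken from Bard et al. (2013); they are chosen only to show how a screen can have reasonable sensitivity and specificity yet a low positive predictive value when the condition is uncommon in the sample.

```python
def screening_metrics(tp, fp, fn, tn):
    """Compute screening indices from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases the screen detects
        "specificity": tn / (tn + fp),  # share of non-cases the screen clears
        "ppv": tp / (tp + fp),          # share of positive screens that are true cases
        "npv": tn / (tn + fn),          # share of negative screens that are non-cases
    }


# Hypothetical counts: 50 children with ADHD, 950 without.
metrics = screening_metrics(tp=40, fp=190, fn=10, tn=760)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this made-up sample most positive screens are false positives because non-cases greatly outnumber cases, while almost every negative screen is correct. This is the same pattern of low positive and high negative predictive value described above.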
The VAS has been used with clinical and community samples of American, African American, Hispanic, Spanish and German children in rural, urban and suburban areas; with those at high and low risk of ADHD. Only small differences were found for gender, age, school grade or severity of ADHD symptoms (Wolraich et al., 2002).
The scale is easy to complete and score, psychometrically sound, and useful for collecting data from multiple sources and for assessing academic and behavioural performance (Collett et al., 2003; Kratochvil et al., 2009). It can be used to establish baselines to measure treatment effectiveness (Kratochvil et al., 2009) and has utility as a screen for comorbid disorders (Becker et al., 2012; Langberg et al., 2010). The teacher scale correlates highly with a diagnosis of ADHD (Austerman, 2015).
No evidence was found for discriminant validity. Items are more relevant for school-aged children than for younger children. Inter-rater reliability between the scales is very low, and the VAS is to be used as a screening tool only.
Clinical utility:
The VAS provides a psychometrically sound method of data collection from both parents and teachers that can be used in the diagnostic process for children with ADHD. It is useful and acceptable to clinicians, readily available and provides assessment of performance and comorbid disorders (Bard et al., 2013). It is so simple to use that a line can be drawn down the page to delineate meeting or not meeting diagnostic criteria (Molina Healthcare, 2017).
Link to scale:

American Psychiatric Association (2013). Diagnostic and Statistical Manual of Mental Disorders (5th ed.). Washington, DC: Author.
Austerman, J. (2015). ADHD and behavioral disorders: Assessment, management, and an update from DSM-5. Cleveland Clinic Journal of Medicine, 82, S2-S7.
Bard, D.E., Wolraich, M.L., Neas, B., Doffing, M., & Beck, L. (2013). The psychometric properties of the Vanderbilt attention-deficit hyperactivity disorder diagnostic parent rating scale in a community population. Journal of Developmental and Behavioral Pediatrics, 34, 72-82.
Becker, S.P., Langberg, J.M., Vaughn, A.J., & Epstein, J.N. (2012). Clinical utility of the Vanderbilt ADHD diagnostic parent rating scale comorbidity screening scales. Journal of Developmental and Behavioral Pediatrics, 33, 221-228.
Collett, B.R., Ohan, J.L., & Myers, K.M. (2003). Ten-year review of rating scales. V: Scales assessing attention-deficit/hyperactivity disorder. Journal of the American Academy of Child & Adolescent Psychiatry, 42, 1015-1037.
Fields, S.A., & Hale, L.R. (2011). Psychoeducational groups for youth attention-deficit hyperactivity disorder: a family medicine pilot project. Mental Health in Family Medicine, 8, 157-165.
Kratochvil, C.J., Vaughan, B.S., Barker, A.M., Corr, L., Wheeler, A., & Madaan, V. (2009). Review of pediatric attention deficit/hyperactivity disorder for the general psychiatrist. Psychiatric Clinics of North America, 32, 39-56.
Molina Healthcare (2017). Behavioral health provider toolkit. Retrieved from:
National Initiative for Children’s Healthcare Quality (2017). Attention Deficit Hyperactivity Disorder (ADHD) Learning Collaborative. Retrieved from:
National Initiative for Children’s Healthcare Quality (2017). NICHQ Vanderbilt assessment scales. Retrieved from:
Wolraich, M.L., Bard, D.E., Neas, B., Doffing, M., & Beck, L. (2013). The psychometric properties of the Vanderbilt attention-deficit hyperactivity disorder diagnostic teacher rating scale in a community population. Journal of Developmental and Behavioral Pediatrics, 34, 83-93.
Wolraich, M.L., Lambert, W., Doffing, M.A., Bickman, L.B., Simmons, T., & Worley, K. (2003). Psychometric properties of the Vanderbilt ADHD diagnostic parent rating scale in a referred population. Journal of Pediatric Psychology, 28, 559-568.

Revised Child Anxiety and Depression Scale (RCADS)

The Revised Child Anxiety and Depression Scale (RCADS; Chorpita, Yim, Moffitt, Umemoto & Francis, 2000) is a 47-item self-report measure which assesses the frequency of anxiety and depression symptoms in youth aged 8-18 years. The RCADS was developed in Hawaii, United States, and is partly a revision of the Spence Children’s Anxiety Scale (SCAS; Spence, 1998). The measure has a parent-version form as well as a short form (RCADS-25; Ebesutani et al., 2012). The RCADS is composed of 6 scales, 5 of which are related to anxiety (separation anxiety disorder, social phobia, generalized anxiety disorder, panic disorder, obsessive compulsive disorder) and one of which is related to major depressive disorder. The scales are aligned with the anxiety and depression diagnostic criteria in the DSM-IV. Individuals rate their answers on a 4-point Likert scale ranging from “never” to “always”. The results can be scored manually or via the scoring software created by the authors. In terms of results, T-scores of 65 or above fall in the borderline clinical range, whereas those of 70 or above fall in the clinical range. These T-scores indicate that the individual’s responses reflect anxiety and depression-related symptoms very similar to those of individuals who meet diagnostic criteria for that particular disorder or syndrome.

The RCADS has good internal consistency, with Cronbach’s alpha values ranging from .78 for social anxiety disorder to .88 for panic disorder in a clinical population (Chorpita, Moffitt & Gray, 2005), as well as acceptable internal consistency in the general population (Chorpita et al., 2000). Furthermore, the measure has good convergent validity with similar measures such as the Revised Children’s Manifest Anxiety Scale (RCMAS; Reynolds & Richmond, 1978), the Children’s Depression Inventory (CDI) and interview dimensional ratings (Chorpita et al., 2005). The RCADS also has favourable test-retest reliability for most scales, with the social phobia scale being the most reliable (.80) and the obsessive compulsive disorder scale generally being the least reliable (.65) when tested in a community sample of school children and adolescents (Chorpita et al., 2000). In terms of model fit, a study by Chorpita et al. (2005) using confirmatory factor analysis indicated an adequate fit for a six-factor model when compared to a one-factor and a two-factor model. The RCADS has been successfully validated in several countries, including Australia (de Ross, Gullone & Chorpita, 2002), Denmark (Esbjørn, Sømhovd, Turnstedt & Reinholdt-Dunne, 2012), the Netherlands (Kösters, Chinapaw, Zwaanswijk, van der Wal & Koot, 2015) and Spain (Sandín, Valiente & Chorot, 2009), in clinical and school-based samples.

The RCADS is publicly available free of cost. It can be used for both educational and professional purposes. However, permission from the authors is required to use the tool for research purposes. It is a valuable tool for use with youth suspected of having an anxiety disorder or major depressive disorder, as its scales reflect DSM-IV criteria and it is one of the few anxiety measures that also measures depressive symptoms separately. Furthermore, the RCADS has been translated into several languages, including Spanish, Chinese and French, and due to its cross-cultural validations it can be used with youth from different cultures. It should be noted that the RCADS is only standardized for grades 3 and above, as T-score conversions have not been developed for children younger than grade three. Therefore, the authors recommend using clinical judgement when interpreting raw scores for these children.



Chorpita, B. F., Moffitt, C. E., & Gray, J. (2005). Psychometric properties of the Revised Child Anxiety and Depression Scale in a clinical sample. Behaviour Research and Therapy, 43(3), 309-322.

Chorpita, B. F., Yim, L., Moffitt, C., Umemoto, L. A., & Francis, S. E. (2000). Assessment of symptoms of DSM-IV anxiety and depression in children: A revised child anxiety and depression scale. Behaviour Research and Therapy, 38(8), 835-855.

de Ross, R. L., Gullone, E., & Chorpita, B. F. (2002). The revised child anxiety and depression scale: a psychometric investigation with Australian youth. Behaviour Change, 19(2), 90-101.

Ebesutani, C., Reise, S. P., Chorpita, B. F., Ale, C., Regan, J., Young, J., … & Weisz, J. R. (2012). The Revised Child Anxiety and Depression Scale-Short Version: Scale reduction via exploratory bifactor modeling of the broad anxiety factor. Psychological Assessment, 24(4), 833.

Esbjørn, B. H., Sømhovd, M. J., Turnstedt, C., & Reinholdt-Dunne, M. L. (2012). Assessing the Revised Child Anxiety and Depression Scale (RCADS) in a national sample of Danish youth aged 8–16 years. PLoS One, 7(5), e37339.

Kösters, M. P., Chinapaw, M. J., Zwaanswijk, M., van der Wal, M. F., & Koot, H. M. (2015). Structure, reliability, and validity of the revised child anxiety and depression scale (RCADS) in a multi-ethnic urban sample of Dutch children. BMC Psychiatry, 15(1), 132.

Reynolds, C. R., & Richmond, B. O. (1978). What I think and feel: A revised measure of children’s manifest anxiety. Journal of Abnormal Child Psychology, 6(2), 271-280.

Sandín, B., Valiente, R. M., & Chorot, P. (2009). RCADS: evaluación de los síntomas de los trastornos de ansiedad y depresión en niños y adolescentes. Revista de Psicopatología y Psicología Clínica, 14(3), 193-206.

Spence, S. H. (1998). A measure of anxiety symptoms among children. Behaviour Research and Therapy, 36(5), 545-566.

Brief Problem Checklist (BPC)

The Brief Problem Checklist (BPC) is a measure designed by Chorpita et al. (2010) to periodically assess the clinical progress of a child over the course of psychological treatment. The scale measures internalising and externalising problems found in children aged 7-13, and such feedback can be used by a clinician to track outcomes and to adjust treatment. The scale is presented in an interview format containing twelve items, and there is both a child and a caregiver version. The BPC is intended to be conducted via an over-the-phone interview at weekly intervals during treatment. The burden on families partaking in such frequent interviews is believed to be minimal, as Chorpita et al. (2010) found that administration takes less than one minute on average.

The measure was developed in the USA, and the normed sample was composed of American children (aged 7-13 years) who were offered treatment due to a range of problems that could be subsumed under the broad categories of anxiety, depression, or disruptive behaviour (Chorpita et al., 2010). The BPC interviews yield three scales: a Total Problems scale, an Internalising scale, and an Externalising scale. When answering the questions, children and caregivers are required to rate how true the 12 items are in reference to the previous week, using a 3-point Likert scale. Example items include: “I disobey my parents or people at school” (caregiver version: “disobedient at home or school”) and “I threaten to hurt people” (caregiver version: “threatens people”).

To create items for the BPC, Chorpita et al. (2010) applied factor analysis to both the Child Behaviour Checklist (CBCL) and the Youth Self-Report (YSR), two instruments which are widely used and established as evidence-based measures. Items with high factor loadings across both the CBCL and YSR were chosen, leading to the identification of 14 internalising items and 20 externalising items. Items were then selected based on their ability to maximise information pertaining to the clinical change of the client. The resultant 12 items were subjected to exploratory factor analysis using maximum likelihood estimation. A two-factor solution was drawn from the scree plot, and the factors were extracted and subjected to promax rotation. The resultant factors corresponded to the Externalising and Internalising scales of the BPC.

To determine convergent validity, the scales of the BPC were correlated with corresponding scales from the CBCL and YSR (Chorpita et al., 2010). Each BPC scale (Internalising, Externalising, and Total Problems) was significantly correlated with scales on both the YSR (coefficients at .61 or above) and the CBCL (coefficients at .56 or above). A longitudinal examination of BPC interview data across 6 months of treatment demonstrated that the BPC is capable of significantly predicting change in related measures of symptomatology (i.e., the CBCL and YSR), providing strong evidence for its clinical utility. Test-retest coefficients across an average period of 8-9 days ranged from .72 to .79 for each of the BPC subscales. The agreement between the child and caregiver versions of the scale was examined and produced correlations ranging from .19 to .31, findings which are comparable to the literature on parent-child symptom agreement (Chorpita et al., 2010).

The BPC can be readily accessed online and is available for both commercial and research purposes. Due to the relatively recent construction of the measure and its potential to quickly and efficiently monitor clinical change in children, the BPC holds great practical relevance and would benefit from further psychometric testing and cross-cultural validation. No major revisions of the BPC have occurred to the author’s knowledge.


Chorpita, B. F., Reise, S., Weisz, J. R., Grubbs, K., Becker, K. D., Krull, J. L., & The Research Network on Youth Mental Health. (2010). Evaluation of the Brief Problem Checklist: Child and caregiver interviews to measure clinical progress. Journal of Consulting and Clinical Psychology, 78(4), 526-536. doi: 10.1037/a0019602

BPC Links:

Child version: Parent Version: