
Improvement in Functioning for Residents at Spring Lake Ranch

A Program Evaluation Report

Prepared by Kim Van Orden, PhD
Associate Professor, University of Rochester Medical Center
SLR House Advisor 2002-2003


Since its inception in 1932, Spring Lake Ranch has provided a strengths-based approach to helping individuals with mental health and addiction issues grow and thrive. This approach is grounded in a person-centered, recovery-focused alternative to the biomedical model of mental illness. Spring Lake Ranch (SLR) was founded on the belief that life in community is healthy and healing. Residents have an opportunity to recover and to develop the skills needed to maintain recovery through peer support, professional expertise, life in a diverse, accepting community, and the daily activities of work on a 700-acre Vermont farm. As part of its ongoing quality improvement efforts and a desire to better understand the impact of the SLR program, the leadership of Spring Lake Ranch undertook a major effort beginning in 2019 to evaluate the program's effectiveness using a validated assessment tool covering domains of functioning relevant to recovery from mental illness. This report provides an initial review of the results from two years of data collection and offers recommendations for continuing program evaluation and quality improvement. Specifically, the objectives of this report are threefold: 1) to describe the degree of improvement in functioning demonstrated by residents at SLR from admission to discharge; 2) to describe the domains of functioning most responsive to SLR programming; and 3) to explore potential areas for future program evaluation and enhanced quality improvement strategies.


Sample Description

Data from 33 residents admitted to SLR between December 2019 and July 2021 for whom ratings of functioning were available at admission and discharge were examined. Residents had a mean age of 28 years (standard deviation 5.46, range of 19 to 45 years) and were mostly male (75%, with one non-binary resident). Most residents have diagnoses of severe mental illness (SMI) and many have co-occurring mental health and substance use diagnoses.

The Daily Living Activities (DLA-20) for Adult Mental Health

The DLA-20 is a 20-item observer-rated scale assessing 20 domains of functioning relevant to recovery from severe mental illness (SMI). It was designed to be a brief and easy-to-administer rating scale suitable for administration by case managers, including both professional and paraprofessional raters (Scott & Presmanes, 2001). Each domain is rated from 1 to 7, with higher scores indicating higher levels of functioning. The scale developers recommend interpreting DLA-20 total scores as global ratings of ‘Level of Functioning’ (LOF; MTM Services ‘Outcomes’ document), with LOF of 1 indicating extremely severe functional impairment, LOF of 2 indicating serious-severe functional impairment, LOF of 3 indicating moderate impairment, LOF of 4 indicating mild impairment, and LOF of 5 or greater indicating no difficulty. A peer-reviewed study examined the psychometric properties (e.g., reliability) and construct validity (i.e., does the scale measure what it is intended to measure?) of the DLA-20 in two samples of adults with SMI: 85 adults receiving a wide range of treatments in community mental health programs administered by the DeKalb Community Service Board in Georgia and 886 adults with SMI receiving outpatient community mental health services administered by the Georgia Mountains Community Services. Both agencies provide a range of outpatient services across a range of treatment intensity, including assertive community treatment, case management, partial hospitalization, supportive community housing, and psychosocial rehabilitation. Results from the validation study supported the feasibility of administration of the DLA-20 by community mental health staff, with high inter-rater reliability across two independent raters (intraclass correlation [ICC] = .83). This interpretation is based on a guideline for psychiatric rating scales in which an ICC of 0.5 to 0.7 indicates medium reliability and an ICC greater than 0.7 indicates high reliability (Petho et al., 2007).
However, the validation study found variability in ICC across DLA-20 domains, with the lowest reliability for the substance use domain (ICC = .56). Twelve domains demonstrated reliability below 0.7, and the remaining eight demonstrated high reliability. This analysis included only 85 patients, suggesting a need for additional study to confirm these estimates. Few peer-reviewed publications report studies using the DLA-20 in program evaluations and research studies, indicating a need for additional study of the clinical utility and psychometric properties of this scale.
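As an illustration only, the ICC interpretation guideline quoted above (0.5 to 0.7 medium, greater than 0.7 high) can be written as a small lookup. The function name is hypothetical, and the label for values below 0.5 is a placeholder, since the guideline as quoted here does not name that band.

```python
def interpret_icc(icc: float) -> str:
    """Label an intraclass correlation using the guideline quoted in the text."""
    if icc > 0.7:
        return "high"
    if icc >= 0.5:
        return "medium"
    # Assumption: the guideline as quoted does not name this band,
    # so this label is only a placeholder.
    return "below medium"


# The two ICC values reported from the validation study.
for label, icc in [("overall", 0.83), ("substance use domain", 0.56)]:
    print(f"{label}: ICC = {icc:.2f} -> {interpret_icc(icc)} reliability")
```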


For ratings used in this report, staff were trained in administering and scoring the DLA-20 in 2019 and all were certified by MTM Services consulting, which conducts official training in the DLA-20; inter-rater reliability was used to ensure that staff were reliably administering and scoring the scale (using 2 raters per resident). Residents were rated on the DLA-20 at admission and again at discharge. Some residents were also rated throughout their stay, but due to missing data, those data are not included in this report.

Data Analytic Strategy

This report used two strategies to examine the degree of improvement in functioning.

First, global (overall) improvement in functioning was examined using global score calculation and interpretation strategies recommended by the scale developers and MTM Services using the LOF score described above. MTM Services suggest that program evaluations can assess meaningful improvement in functioning from programs using the following metrics:

  • proportion of residents with improvement in LOF of >=0.5 points, with a benchmark of success of at least 35%;
  • proportion of residents with change in severe impairment indicated by examining change among those with LOF less than 4 at admission who increase their functioning above scores of 4, with a benchmark of greater than 35%;
  • proportion of residents with LOF scores indicating greater than moderate impairment at admission (LOF 3 or less) in one or more critical domains (Health Practices, Communication, Safety, Nutrition, Substance Use, Sexual Health, and Hygiene) who demonstrate only mild impairment (LOF 4 or greater) in all critical domains at follow-up, with a benchmark of 20%. This metric has not been calculated for this report but will be calculated in the future.
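The first two metrics above could be computed along the following lines. This is a minimal sketch, not the procedure used for this report: the function names and the example scores are hypothetical, and reading "above scores of 4" as "4 or greater" is an assumption.

```python
from typing import Sequence


def proportion_improved(admission: Sequence[float],
                        discharge: Sequence[float],
                        min_gain: float = 0.5) -> float:
    """Metric 1: share of residents whose LOF rose by at least min_gain."""
    gains = [d - a for a, d in zip(admission, discharge)]
    return sum(g >= min_gain for g in gains) / len(gains)


def proportion_recovered(admission: Sequence[float],
                         discharge: Sequence[float],
                         cutoff: float = 4.0) -> float:
    """Metric 2: among residents below the cutoff at admission,
    the share at or above the cutoff at discharge (assumed reading)."""
    impaired = [(a, d) for a, d in zip(admission, discharge) if a < cutoff]
    if not impaired:
        raise ValueError("no residents below the cutoff at admission")
    return sum(d >= cutoff for _, d in impaired) / len(impaired)


# Hypothetical LOF scores for five residents (not data from this report).
admission = [3.0, 4.5, 2.5, 3.5, 5.0]
discharge = [4.0, 4.5, 3.5, 3.75, 4.5]

metric_1 = proportion_improved(admission, discharge)
metric_2 = proportion_recovered(admission, discharge)
print(f"Metric 1: {metric_1:.0%} (benchmark: at least 35%)")
print(f"Metric 2: {metric_2:.0%} (benchmark: greater than 35%)")
```

Only residents with both an admission and a discharge score should be passed in, so that the two lists line up resident by resident.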

Second, improvement in specific domains of functioning was examined using the standardized response mean (SRM) effect size. The SRM is the ratio of the mean change to the standard deviation of that change. It is a form of Cohen’s effect size index that is useful for indicating responsiveness to an intervention: because it takes into account variability in change over time, it is more conservative than effect sizes that use the standard deviation of the baseline score (Revicki et al., 2006). An SRM of 0.3 is considered the lower limit for clinically meaningful change in health outcomes (Revicki et al., 2006), with an SRM of at least 0.4 associated with change in studies that examined functioning (O’Carroll et al., 2000). Thus, for this analysis, domains with SRMs greater than 0.40 demonstrated clinically meaningful improvement at discharge. We examined the average SRM across all 20 domains and calculated the proportion of residents with SRMs greater than 0.40 for each domain.
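The SRM definition above (mean change divided by the standard deviation of that change) can be sketched as follows. The function name and example ratings are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the report does not state which form was used.

```python
from statistics import mean, stdev
from typing import Sequence


def standardized_response_mean(admission: Sequence[float],
                               discharge: Sequence[float]) -> float:
    """Mean of the paired change scores divided by their standard deviation."""
    changes = [d - a for a, d in zip(admission, discharge)]
    # Assumption: sample standard deviation (n - 1 denominator).
    # stdev() needs at least two residents, and the division fails
    # if every resident changed by exactly the same amount.
    return mean(changes) / stdev(changes)


# Hypothetical ratings on one domain for three residents.
admission = [3.0, 3.0, 3.0]
discharge = [4.0, 5.0, 6.0]

srm = standardized_response_mean(admission, discharge)
print(f"SRM = {srm:.2f}; exceeds 0.40 threshold: {srm > 0.40}")
```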


Global Functioning at Admission

Total scores representing the average level of functional impairment across the 20 domains were calculated for residents with no missing scores on any of the domains at admission (n=31) and discharge (n=30). The average functional impairment score at admission was consistent with mild impairment for patients with severe mental illness (mean 4.04, SD 1.17), with a range from severe impairment to no difficulty (1.7 to 5.95). The proportion of residents with at least moderate impairment at baseline was 42% (n=14, 42.42%).

Change in Global Functioning

For metric 1, we examined the proportion of residents with an improvement in LOF of at least 0.5 points. For residents with complete data (n=29), the average degree of improvement was 1.21 points (SD 1.19), with a range from -0.95 (decline in functioning) to 4.3 (large improvement). The majority of residents demonstrated clinically significant improvement in global functioning (n=22, 75.86%); this proportion of 76% exceeds the benchmark of 35%. For metric 2, the overall proportion of residents with at least moderate impairment declined from 42% at admission to 15% at discharge (n=5, 15.15%). Among those with at least moderate impairment at baseline (n=14), four remained in the moderately impaired range and 10 improved (i.e., 71.43%); this proportion of 71% responders exceeds the benchmark of 35%.
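The percentages reported above can be reproduced from the counts with a short check. The denominators of 29 and 33 are inferred here from the reported percentages and sample sizes rather than stated alongside each count, and the helper name is hypothetical.

```python
def percent(count: int, total: int) -> float:
    """Percentage rounded to two decimals, as reported in the text."""
    return round(100 * count / total, 2)


# (count, assumed denominator) for each proportion reported in the text.
reported = {
    "improved by at least 0.5 LOF points": (22, 29),
    "at least moderate impairment at admission": (14, 33),
    "at least moderate impairment at discharge": (5, 33),
    "impaired at admission who improved": (10, 14),
}

for label, (count, total) in reported.items():
    print(f"{label}: {count}/{total} = {percent(count, total):.2f}%")
```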

Domain-Specific Functioning

Residents demonstrated serious-severe impairment in functioning (i.e., average scores less than 4.10) on most domains of functioning at baseline: self-management of health conditions, housing stability and housekeeping skills, communication skills, time management, money management, problem solving skills, leisure skills, use of community resources, social skills, productivity, and coping skills. The domain with the lowest functioning score at admission was productivity, which measures the degree to which residents were able to work, volunteer, or attend school prior to admission. Domains in which residents were rated as having only mild-moderate levels of functional impairment were safety, nutrition, family relationships, substance use, appropriate sexual behavior, behavior norms, personal hygiene, grooming, and selecting appropriate clothing. Severe impairments in these domains would be consistent with a severe level of psychiatric symptomatology, most likely requiring a higher level of care (e.g., inpatient psychiatric care); thus, these scores are consistent with the levels of functioning required to participate in the SLR program. The finding of only moderate impairment in family relationships may reflect the fact that most residents are connected with SLR through family member involvement; individuals with more impaired family relationships might therefore not seek out care at SLR. Only one domain (selecting appropriate clothing) was rated as non-impaired at admission (i.e., average score of 5 or greater).

Change in Domain-Specific Functioning

Residents demonstrated clinically meaningful improvement in all domains of functioning, as evidenced by effect sizes (standardized response mean) with a magnitude greater than 0.40 (considered the minimum effect size associated with clinically meaningful benefit; see Table 1). Table 1 orders the domains by degree of improvement, with domains demonstrating the most improvement at the top. The domains with the greatest magnitude of improvement were self-management of health conditions, communication skills, time management, productivity, money management, problem solving, and leisure skills. Several domains that demonstrated lower degrees of improvement were domains in which residents were less impaired at admission (i.e., scores greater than 4), leaving less room for growth (appropriate dress, behavior norms, personal hygiene, appropriate sexual behavior, and substance use).


Regarding global improvement in functioning, data indicate that SLR programming far exceeded recommended benchmarks for successful mental health programs for adults with severe mental illness.

Regarding the profile of improvement across specific domains of functioning, the domains with the greatest magnitude of improvement (self-management of health conditions, communication skills, time management, productivity, money management, problem solving, and leisure skills) are domains that are directly targeted by the therapeutic programming at SLR, suggesting that improvement is likely due to participation in the SLR program rather than the passage of time. In particular, the structure of the work program and the scheduled leisure and recreational activities help residents regulate their social rhythms, provide opportunities for behavioral activation, and allow for productive engagement in a community that becomes routine. Examination of potential psychological mechanisms that account for improvement could be useful in future program evaluations to rule out alternative explanations for improvement.

These data also suggest several future directions for continued program evaluation. First, the safety domain of this instrument may be challenging, but essential, to administer in the context of SLR, where residents work on a farm and use potentially dangerous tools. This item assesses the frequency with which residents are able to safely use tools, small appliances, sharps, ovens/stoves, and other potential safety hazards, as well as their judgment in making safe decisions. It would be useful to better understand how staff make these ratings and to ensure that those making ratings are doing so consistently. Staff could be encouraged to provide brief notes regarding their reasons for selecting ratings on this domain, as well as any difficulties in making ratings due to variability within this domain. For example, residents could demonstrate high functioning within the home but face challenges during the work program, and this variability might be masked by the overall rating. Changes in ratings in this domain in particular may be due to gathering additional information about residents as they engage in the work program. Given the importance of this domain for safe functioning outside the community, the fact that the average score for this domain remains in the mildly impaired range suggests that this domain should be studied further. Such study could determine whether enhanced programming is needed to maximize recovery and optimal functioning for residents, or whether improvements in how this domain is assessed are needed (in which case the lack of improvement could be accounted for by limitations of the assessment).

Given residents’ lack of access to substances at SLR, interpreting the substance use domain is also challenging and additional information on how raters selected scores for that domain could be useful. Of note is that in the validation study, this domain demonstrated the lowest level of inter-rater reliability; thus, the low degree of improvement in this domain could also be due to difficulties with reliably assessing this domain. It may be useful to supplement these ratings with validated, brief self-report measures of urges to use substances and actual usage, including re-administration of these measures once residents return to the community and have access to substances.

Behavior norms and social skills also appeared to be less responsive to SLR programming. This could be due to the challenge of modifying these functional domains with behavioral interventions in serious mental illness, but it could also indicate an area for more precise assessment, as the DLA-20 may not be the best assessment tool for these domains. Alternatively, re-administration of this measure two weeks after admission may be useful as a more accurate reflection of baseline functioning (once staff have more information to inform ratings). These data could also indicate that enhanced programming addressing these domains could be useful; for example, structured therapeutic groups in social skills training for psychotic disorders have shown benefit and could help residents optimize their engagement with the SLR community outside of groups.

While functional impairment is a key aspect of recovery and rehabilitation from SMI, it is not the only metric that could be associated with participation in the SLR community. Quality of life, which is best assessed via self-report, is also a key dimension. Future program evaluation and quality improvement efforts could consider administering self-report scales that capture health-related quality of life (e.g., the World Health Organization Quality of Life Brief Scale) and/or life satisfaction, which could provide useful outcome measures to supplement the DLA-20. Norms and administration information for the WHOQOL in the U.S. are available through the University of Washington. Self-report of psychiatric symptoms could also be useful. A self-report scale that assesses several domains of functioning (physical and social function, pain interference) as well as psychiatric symptoms (depression, anxiety) and physical health symptoms (fatigue, pain, sleep) is the PROMIS 29-item Profile Self-report version. PROMIS (Patient-Reported Outcomes Measurement Information System) is a set of brief self-report scales developed by the National Institutes of Health with strong validity data; the scales are sensitive to change and are associated with population norms that allow for ease of interpretation.

Continued collection of DLA-20 scores will be useful, as will re-analysis of these data once additional ratings are available (i.e., at least 50 residents). At that point, it could be worthwhile to consider publishing results in a peer-reviewed journal if program development grants (e.g., SAMHSA grants) were of interest as a way to obtain funding for additional program evaluation or supplemental clinical programming (e.g., social skills training groups), for which funding would be needed to train staff, obtain program materials, and collect outcomes data.

Table 1. Functional impairment and degree of improvement across 20 domains

Domain                        Admission, mean (SD)   Discharge, mean (SD)   Effect size (SRM)
Health practices              3.58 (1.30)            5.03 (1.51)            1.06
Communication                 3.85 (1.42)            4.97 (1.33)            1.06
Time management               3.64 (1.73)            5.33 (1.34)            1.00
Productivity                  2.48 (1.42)            4.27 (1.63)            0.99
Money management              3.28 (1.46)            4.59 (1.54)            0.97
Housing stability             3.36 (1.88)            5.12 (1.38)            0.89
Nutrition                     4.24 (1.68)            5.48 (1.28)            0.89
Leisure                       3.58 (1.62)            4.95 (1.20)            0.88
Problem solving               3.55 (1.25)            4.91 (1.44)            0.84
Family relationships          4.48 (1.35)            5.39 (1.17)            0.75
Coping skills                 3.42 (1.28)            4.61 (1.54)            0.74
Use of community resources    3.88 (1.92)            5.09 (1.33)            0.72
Grooming                      4.94 (1.41)            5.82 (1.10)            0.72
Social network                4.06 (1.84)            5.18 (1.61)            0.61
Personal hygiene              4.94 (1.52)            5.76 (1.06)            0.60
Appropriate sexual behavior   4.70 (1.53)            5.50 (1.41)            0.57
Behavior norms                4.67 (1.78)            5.61 (1.34)            0.52
Safety                        4.70 (1.53)            5.55 (1.64)            0.51
Appropriate dress             5.24 (1.41)            5.84 (1.02)            0.49
Substance use                 4.31 (2.04)            5.36 (1.19)            0.45



O’Carroll, R. E., Smith, K., Couston, M., Cossar, J. A., & Hayes, P. C. (2000, Feb). A comparison of the WHOQOL-100 and the WHOQOL-BREF in detecting change in quality of life following liver transplantation. Qual Life Res, 9(1), 121-124. https://doi.org/10.1023/a:1008901320492

Petho, B., Tusnady, G., Vargha, A., Tolna, J., Farkas, M., Vizkeleti, G., Toth, A., Szilagyi, A., Bitter, I., Kelemen, A., & Czobor, P. (2007, Jul). Validity of reliability: comparison of interrater reliabilities of psychopathological symptoms. J Nerv Ment Dis, 195(7), 606-613. https://doi.org/10.1097/NMD.0b013e318093f45d

Revicki, D. A., Cella, D., Hays, R. D., Sloan, J. A., Lenderking, W. R., & Aaronson, N. K. (2006, Sep 27). Responsiveness and minimal important differences for patient reported outcomes. Health Qual Life Outcomes, 4, 70. https://doi.org/10.1186/1477-7525-4-70

Scott, R. L., & Presmanes, W. S. (2001, May). Reliability and validity of the Daily Living Activities Scale: A functional assessment measure for severe mental disorders. Research on Social Work Practice, 11(3), 373-389. https://doi.org/10.1177/104973150101100306
