Effectiveness of early versus delayed rehabilitation following total shoulder replacement: A systematic review

Objective: To investigate the effectiveness of early versus delayed rehabilitation following total shoulder replacement.
Design: Intervention systematic review with narrative synthesis.
Literature search: MEDLINE, EMBASE, CINAHL, Scopus and the Cochrane Library were searched from inception to the 29th of July 2021.
Study selection criteria: Randomised controlled trials comparing early versus delayed rehabilitation following primary anatomic, primary reverse, or revision total shoulder replacement.
Data synthesis: A revised Cochrane risk of bias assessment tool for randomised controlled trials was used, as well as the Grading of Recommendations Assessment, Development and Evaluation approach to evaluate the quality of evidence. A narrative synthesis was undertaken.
Results: Three eligible randomised controlled trials (n = 230) were included. There was very low-quality evidence of no statistically significant difference (P > 0.05) in pain, shoulder function, health-related quality of life or lesser tuberosity osteotomy healing at 12 months between early or delayed rehabilitation. There was conflicting and very low-quality evidence of a difference between the effect of early and delayed rehabilitation on shoulder range of movement. There was limited, very low-quality evidence of statistically significantly improved pain and function (P < 0.05) in the early post-operative period with early rehabilitation following anatomic total shoulder replacement.
Conclusions: No differences were seen in patient-reported or clinician-reported outcomes at 12 months post-surgery between early and delayed rehabilitation following total shoulder replacement. There is very low-quality evidence that early rehabilitation may improve shoulder pain and function in the early post-operative phase following anatomic total shoulder replacement.


Introduction
Total shoulder replacement is a treatment option for individuals experiencing severe pain and functional restriction of their shoulder. Two of the main indications for total shoulder replacement are osteoarthritis and cuff tear arthropathy. 1 Between 1998 and 2017, over 58,000 elective shoulder replacements were undertaken in the United Kingdom (UK), and this number is rising annually. 2 A similar trend has been reported in the United States (US), 3 resulting in significant annual costs to both the UK National Health Service (NHS) 4 and the US healthcare system. 5 Following total shoulder replacement, a programme of rehabilitation is important to help patients achieve the best clinical and quality of life outcomes. 6,7 However, the optimal approach to post-operative rehabilitation is unknown. 8 Variation is observed across the UK NHS, 9 and few institutions adapt their rehabilitation protocol based on prosthesis type 9 despite the common assertion that different approaches are necessary. 8 One key uncertainty regarding rehabilitation relates to the best time to begin mobilisation of the shoulder after surgery. This has recently been identified as a research priority by the UK National Institute for Health and Care Excellence. 10 A previous systematic review highlighted the paucity of high-quality evidence available relating to rehabilitation following total shoulder replacement. 11 Only one randomised controlled trial was eligible for inclusion, which suggested that early rehabilitation might lead to a more rapid return of function following total shoulder replacement. 12 Further relevant randomised controlled trials have been published since this previous systematic review. 11
The aim of this review was therefore to build on the previous systematic review and examine evidence from randomised controlled trials to investigate the effectiveness of early versus delayed rehabilitation on pain, function, and tissue healing in adults who have undergone primary anatomic, primary reverse, or revision total shoulder replacement.

Methods
This systematic review was prospectively registered on PROSPERO (ID number: CRD42020208472, available from: https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=208472) and reported according to the PRISMA statement. 13 Included randomised controlled trials were selected according to the following eligibility criteria: adults undergoing formal post-surgical rehabilitation following either elective primary anatomic or reverse total shoulder replacement, or elective revision total shoulder replacement; intervention involving early rehabilitation after total shoulder replacement (defined by the randomised controlled trial authors); comparator involving delayed rehabilitation after total shoulder replacement (defined by the randomised controlled trial authors); and outcomes consisting of any patient-reported outcomes related to pain, disability, or function, and/or any objective measurement of range of movement, strength or tissue healing reported in the post-operative period.
A comprehensive literature search of key databases (MEDLINE, EMBASE, CINAHL, Scopus and the Cochrane Library) was undertaken from inception until the 29th of July 2021 to identify relevant randomised controlled trials. Reference lists of potentially eligible studies were hand-searched. The grey literature was searched via OpenGrey and ClinicalTrials.gov. The search was limited to papers published in English. An example of the MeSH terms and keywords used for the searches is shown in Table 1. Full details of the search strategy are included in supplementary file 1.
All retrieved articles were imported into EndNote Online, and duplicates removed. Following this, the studies were uploaded to rayyan.qcri.org to enable independent screening of the titles and abstracts by two reviewers (MM and CL). Full texts of potentially eligible studies were independently reviewed by two reviewers (MM and CL) to determine inclusion. Disagreements were resolved through discussion.
Risk of bias assessment was undertaken by two independent reviewers (MM and GW) using the revised Cochrane risk-of-bias tool for randomised controlled trials (ROB-2). The risk of bias of all relevant outcomes was assessed individually. 14 The ROB-2 includes five domains: (1) the randomisation process, (2) deviations from the intended intervention, (3) missing outcome data, (4) measurement of the outcome, and (5) selection of the reported result. Each domain was classified as 'low risk', 'some concerns' or 'high risk'. 14 Disagreements were resolved through discussion.
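As an illustration of how the five domain-level judgements might be recorded and combined for a single outcome, the following minimal Python sketch stores one judgement per ROB-2 domain and derives an overall rating by taking the most severe domain rating. The worst-domain rule is a simplifying assumption made here and does not describe how the reviewers reached their judgements; the example ratings are hypothetical and are not taken from the included trials.

```python
from enum import IntEnum


class Judgement(IntEnum):
    """Domain-level ratings, ordered from least to most severe."""
    LOW_RISK = 0
    SOME_CONCERNS = 1
    HIGH_RISK = 2


# The five ROB-2 domains as listed in the Methods.
DOMAINS = (
    "randomisation process",
    "deviations from the intended intervention",
    "missing outcome data",
    "measurement of the outcome",
    "selection of the reported result",
)


def overall_judgement(domain_judgements):
    """Return the most severe rating across the five domains.

    Assumption: the overall rating equals the worst domain rating.
    This is a simplification used only for illustration.
    """
    missing = [d for d in DOMAINS if d not in domain_judgements]
    if missing:
        raise ValueError(f"Missing judgement for: {missing}")
    return max(domain_judgements[d] for d in DOMAINS)


# Hypothetical ratings for one outcome in one trial (not real data).
example = {
    "randomisation process": Judgement.LOW_RISK,
    "deviations from the intended intervention": Judgement.SOME_CONCERNS,
    "missing outcome data": Judgement.LOW_RISK,
    "measurement of the outcome": Judgement.HIGH_RISK,
    "selection of the reported result": Judgement.SOME_CONCERNS,
}

example_overall = overall_judgement(example)
print(example_overall.name)
```

Because risk of bias was assessed for each relevant outcome individually, one such record would be kept per outcome per trial.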
The Grading of Recommendations Assessment, Development and Evaluation (GRADE) system was used to rate the quality of evidence for each outcome. The initial GRADE assessment was undertaken by the first author (MM). This was reviewed and verified by a second author (CL).
The quality of the evidence was downgraded when design limitations, indirectness of evidence, unexplained heterogeneity or inconsistency of results, imprecision of results, or publication bias were observed. 15 Publication bias was not formally assessed using a funnel plot as no meta-analyses were possible. 16 Data were extracted by the first author (MM) and entered into a pre-approved form in Microsoft Excel. This was then verified by a second author (PG). If the data provided in the published paper were deemed insufficient to facilitate statistical analysis, corresponding authors were contacted via email to request additional information. If no response was received after two weeks, a reminder email was sent. If no subsequent response was received, no further contact was made.
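The downgrading logic described above can be sketched as a simple calculation. This minimal Python example assumes, following general GRADE convention rather than anything stated in this review, that evidence from randomised controlled trials starts at 'high' quality and that each of the five considerations can lower the rating by zero, one or two levels. The function name and the example downgrades are hypothetical and do not reproduce the reviewers' actual assessments.

```python
# Quality levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]

# The five reasons for downgrading listed in the Methods.
CONSIDERATIONS = (
    "design limitations",
    "indirectness",
    "inconsistency",
    "imprecision",
    "publication bias",
)


def grade_rating(downgrades, start="high"):
    """Apply downgrades to a starting level and return the final level.

    Assumptions: randomised trial evidence starts at 'high', each
    consideration removes 0, 1 or 2 levels, and the rating cannot
    fall below 'very low'.
    """
    for name, levels in downgrades.items():
        if name not in CONSIDERATIONS:
            raise ValueError(f"Unknown consideration: {name}")
        if levels not in (0, 1, 2):
            raise ValueError(f"Downgrade must be 0, 1 or 2, got {levels}")
    total = sum(downgrades.values())
    index = max(LEVELS.index(start) - total, 0)
    return LEVELS[index]


# Hypothetical example: serious design limitations, very serious imprecision.
hypothetical = {"design limitations": 1, "imprecision": 2}
hypothetical_rating = grade_rating(hypothetical)
print(hypothetical_rating)
```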
Where available, descriptive statistics were used to summarise variables extracted in the review. Included RCTs and responses from authors were reviewed to determine whether the data available and the level of clinical and methodological heterogeneity enabled a formal meta-analysis. If a meta-analysis was deemed inappropriate, a narrative synthesis would be undertaken.

Results
The process of study selection is summarised in Figure 1. No disagreements occurred between reviewers regarding eligibility. One disagreement occurred during the risk of bias assessment relating to a single signalling question on the Cochrane risk-of-bias tool for randomised controlled trials, for the outcome of 'healing of the lesser tuberosity osteotomy'. This was resolved through discussion.
None of the three published papers provided sufficient information to facilitate a meta-analysis. Corresponding authors were contacted to request access to the required data. Two authors did not respond despite two separate requests being sent. A narrative synthesis was therefore undertaken. Table 2 provides an overview of the characteristics of the three included randomised controlled trials. A total of 230 participants were recruited across the three trials: 60 (26%) had an anatomic total shoulder replacement with a lesser tuberosity osteotomy 12 and 170 (74%) had a reverse total shoulder replacement without repair of the subscapularis. 17,18 In total, 204 patients were included in the final analyses; 100 were female and 104 were male. The mean age of participants ranged from 68.3 to 75.1 years.
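The participant figures above can be checked with simple arithmetic. The short Python sketch below uses only the counts reported in this paragraph; the variable names are illustrative, and the proportion of recruited participants who reached the final analyses is derived here rather than reported in the review.

```python
# Counts reported in the Results.
recruited = 230
anatomic = 60    # anatomic replacement with lesser tuberosity osteotomy
reverse = 170    # reverse replacement without subscapularis repair
analysed = 204   # participants included in the final analyses
female = 100
male = 104

# The subgroup counts should add up to the reported totals.
assert anatomic + reverse == recruited
assert female + male == analysed

pct_anatomic = round(100 * anatomic / recruited)
pct_reverse = round(100 * reverse / recruited)

# Derived figures (not stated in the review).
not_analysed = recruited - analysed
pct_analysed = round(100 * analysed / recruited, 1)

print(f"Anatomic: {pct_anatomic}%, reverse: {pct_reverse}%")
print(f"Analysed: {analysed}/{recruited} ({pct_analysed}%), "
      f"not analysed: {not_analysed}")
```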

Characteristics of included studies
The follow-up time points varied (Table 2). Early post-operative outcomes (less than three months) were recorded in two randomised controlled trials; 12,17 one randomised controlled trial recorded the first post-operative measurements at three months. 18 Each included trial recorded outcomes at 12 months after the surgery 12,17,18 and one recorded final outcomes at two years. 17

Rehabilitation protocols
The definitions of 'early' and 'delayed' rehabilitation varied across the three randomised controlled trials. Table 3 summarises the rehabilitation protocols employed, demonstrating the differences within and between the included trials.

Outcome measures
The heterogeneity in the choice of outcome measures across the three included trials is demonstrated in Table 2.

Outcomes

Pain
The two randomised controlled trials that measured pain as an outcome used visual analogue scales. Edwards et al. 18 used a zero to ten scale; Denard and Lädermann 12 did not provide such detail. No statistically significant differences between groups were found at three, six or twelve months in either study. Denard and Lädermann however found a statistically significant difference between groups at eight weeks favouring the early rehabilitation group (P = 0.019, point estimates not provided). 12 Such early outcomes were not recorded by Edwards et al. 18 Grading of evidence: There is very low-quality evidence that early rehabilitation improves pain compared with delayed rehabilitation in the early post-operative period.

Composite measures of shoulder pain and function
Each included randomised controlled trial used the American Shoulder and Elbow Surgeon's Score (ASES) as a composite measure of shoulder pain and function. Denard and Lädermann 12 reported statistically significantly superior composite ASES scores in the early rehabilitation group at four weeks and eight weeks following surgery (P = 0.025 and P = 0.010 respectively). Point estimates were not provided; it is therefore not known whether the between-group difference exceeded the minimum clinically important difference. In contrast, Hagen et al. 17 reported no statistically significant difference between groups in change in American Shoulder and Elbow Surgeon's Score from baseline to six weeks (P value and point estimates not reported). At six months post-surgery, Edwards et al. 18 found no statistically significant difference between the groups; Hagen et al. 17 however reported a difference in change in composite scores from baseline favouring the delayed rehabilitation group (mean improvement 40.2 ± SD 20.1 in the delayed therapy group, 30.0 ± SD 18.8 in the early therapy group, P = 0.038). This difference is smaller than the minimum clinically important difference of 21 points. 19 Grading of evidence: There is very low-quality conflicting evidence evaluating the effect of early and delayed rehabilitation on composite outcomes of shoulder pain and function.
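The comparison with the minimum clinically important difference can be made explicit. This minimal Python sketch uses the mean improvements reported by Hagen et al. 17 at six months and the 21-point threshold cited above; the variable names are illustrative only.

```python
# Mean improvement in ASES from baseline to six months (Hagen et al. 17).
delayed_improvement = 40.2
early_improvement = 30.0

# Minimum clinically important difference cited in the text.
mcid = 21

# Between-group difference in mean improvement, favouring delayed therapy.
difference = round(delayed_improvement - early_improvement, 1)
exceeds_mcid = difference >= mcid

print(f"Between-group difference: {difference} points")
print(f"Exceeds MCID of {mcid} points: {exceeds_mcid}")
```

The difference of 10.2 points is statistically significant in the trial report but is less than half of the 21-point threshold.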

Shoulder function
Both Denard and Lädermann 12 and Edwards et al. 18 used the Single Assessment Numeric Evaluation (SANE) to assess shoulder function, but only Edwards et al. 18 provided point estimates for SANE scores. The only statistically significant difference between the groups was found by Denard and Lädermann at eight weeks, in favour of the early rehabilitation group (P = 0.012, point estimates not provided). 12 Denard and Lädermann 12 reported no statistically significant difference in Simple Shoulder Test (SST) scores between the groups at 12 months post-surgery (mean SST score 9.9 ± SD 2.5 early rehabilitation group versus mean SST score 9.8 ± SD 2.4 delayed rehabilitation group, P = 0.376). Similarly, Edwards et al. 18 reported no statistically significant between-group differences in change from baseline for the Shoulder Activity Scale, Constant-Murley, or the Global Shoulder Function scores at any of the time points recorded.
Grading of evidence: There is very low-quality conflicting evidence evaluating the effect of early and delayed rehabilitation on shoulder function.

Health-related quality of life
Only Edwards et al. 18 specifically measured health-related quality of life, using the four-dimension version of the Assessment of Quality of Life (AQOL-4D). No statistically significant differences in change from baseline were seen between the early and delayed groups at three, six or 12 months (mean change from baseline 65.7 ± SD 19.3 early group, 71.6 ± SD 20.4 delayed group, P = 0.942; 69.5 ± SD 24.3 early group, 78.3 ± SD 14.8 delayed group, P = 0.886; and 68.6 ± SD 18.0 early group, 77.7 ± SD 17.7 delayed group, P = 0.653 respectively).
Grading of evidence: There is very low-quality evidence of no difference in improvement in health-related quality of life up to 12 months following total shoulder replacement with early or delayed rehabilitation.

Range of movement
Each of the three randomised controlled trials measured active range of movement in all participants. Denard and Lädermann 12 used a goniometer to measure shoulder range of movement into forward flexion and external rotation with the patient's arm at their side; internal rotation was estimated using the highest spinal level reached. Edwards et al. 18 reported use of a goniometer to measure forward flexion, abduction and external rotation with the patient in a supine position, and assessed internal rotation by the highest spinal level reached. Hagen et al. 17 recorded both active and passive range of movement into forward flexion, abduction, external rotation and cross-body adduction, but the method of measurement was not stated.
Denard and Lädermann 12 found no statistically significant difference between groups at 12 months for forward flexion, external rotation and internal rotation (mean forward flexion 142° ± SD 20° early rehabilitation group, mean 146° ± SD 20° delayed group, P = 0.886; mean external rotation 62° ± SD 16° early rehabilitation group, mean 57° ± SD 12° delayed group, P = 0.209; and mean internal rotation L3 early rehabilitation group, L1 delayed group, P = 0.685 respectively). Hagen et al. 17 did not present descriptive data but reported no statistically significant differences in change in range of movement from baseline between groups at any time point. Edwards et al. 18 reported that within-group time effects were observed for both groups from three to six and 12 months post-surgery, and post-hoc t-tests revealed significantly better forward flexion measurements in the early rehabilitation group at three months post-surgery (additional data
Grading of evidence: There is very low-quality conflicting evidence evaluating the effect of early and delayed rehabilitation on shoulder range of movement up to 12 months post-surgery.

Peak isometric shoulder strength
Edwards et al. 18 measured peak isometric shoulder muscle strength in kilograms using a digital handheld dynamometer. The authors state that no significant between-group differences or interaction effects were observed for forward flexion, abduction, external rotation or internal rotation peak isometric strength at three, six, or 12 months post-surgery (P > 0.05, point estimates not provided in the published paper).
Grading of evidence: There is low-quality evidence suggesting that early and delayed rehabilitation result in no significant difference in peak isometric shoulder strength at three, six, and 12 months post-surgery.

Healing of the lesser tuberosity osteotomy
Denard and Lädermann 12 reported no statistically significant difference (P = 0.101) in healing rates of the lesser tuberosity visible on plain X-Ray in the delayed (96.4%) versus early rehabilitation group (81.5%) at 12 months.
Grading of evidence: There is very low-quality evidence demonstrating no difference in healing of the lesser tuberosity osteotomy at 12 months following early or delayed rehabilitation.
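To illustrate how healing rates of this size can fail to reach statistical significance in a small trial, the following Python sketch runs a two-sided Fisher's exact test on a 2 x 2 table. The counts (27 of 28 healed in the delayed group, 22 of 27 in the early group) are an assumption: they reproduce the reported percentages of 96.4% and 81.5% but are not stated in this review, and the review does not say which test the trial authors used. The function is a plain-Python sketch written for this example, not the trial's analysis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )


# Assumed counts (hypothetical, chosen to match the reported percentages).
delayed_healed, delayed_not_healed = 27, 1
early_healed, early_not_healed = 22, 5

delayed_rate = round(100 * delayed_healed / (delayed_healed + delayed_not_healed), 1)
early_rate = round(100 * early_healed / (early_healed + early_not_healed), 1)
p_value = fisher_exact_two_sided(
    delayed_healed, delayed_not_healed, early_healed, early_not_healed
)

print(f"Delayed: {delayed_rate}%, early: {early_rate}%, P = {p_value:.3f}")
```

Under these assumed counts the result is consistent with the reported P = 0.101, although this agreement does not confirm the underlying numbers. It does show that with so few non-healed osteotomies, a difference of around 15 percentage points is compatible with chance.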

Discussion
This systematic review was undertaken to assess the effectiveness of early versus delayed post-operative rehabilitation following total shoulder replacement. There was substantial heterogeneity in the interventions employed and insufficient data to allow statistical pooling of results via meta-analysis. The available evidence, which is of very low to low methodological quality, suggests no difference in patient-reported or clinician-reported outcomes at 12 months post-surgery. 12,17,18 There is however some limited, very low-quality evidence of improved pain and functional outcome scores 12 in the first eight weeks post-surgery in those undergoing early rehabilitation following anatomic total shoulder replacement.
Across the three included randomised controlled trials, the definitions of early rehabilitation varied considerably, with differences noted in sling use and the timeframes for commencement of active shoulder movement and strengthening. 12,17,18 Furthermore, what was termed 'early' rehabilitation in the included randomised controlled trials (e.g. beginning passive movement on the first post-operative day) would, in some settings, be considered standard practice. 9 Variability was seen in the choice of patient-reported outcome measures selected, with just the American Shoulder and Elbow Surgeon's Score used consistently across the three included randomised controlled trials. Only Edwards et al. 18 opted to measure health-related quality of life in addition to shoulder-specific functional outcomes. Yet sling immobilisation, shoulder pain, and limited shoulder range of movement may all have a wider impact on the patient than joint-specific measures can detect. 20 Future research should consider a more holistic approach and include outcome measures that capture the impact on social function and health-related quality of life that shoulder pain and restriction can create. In this review, randomised controlled trials that included participants undergoing both anatomic and reverse total shoulder replacement were included despite recommendations by some that the rehabilitation programmes for the two procedures should differ. 6,8 A recent review of publicly available rehabilitation protocols published by UK NHS Trusts reported that few institutions differentiate rehabilitation protocols based on prosthesis type. For those that do, there is little difference between the two approaches, 9 hence the decision to include what some may consider to be two clinically distinct patient groups.
Table 1 footnote: In each online database, advanced search options were selected to ensure MeSH terms and keywords were searched for in article title, abstract and keywords.
Table 2 footnote: 3 months, 6 months, 12 months post-surgery. ASES, VAS-P, GSF, SANE, AQOL-4D and SAS presented as mean within-group change (95% CI) for ER and DR groups and result of analyses of between-group difference for improvement from baseline. NB. No statistically significant between-group differences were seen at any time point.
One reason commonly proposed for a more conservative post-operative approach following anatomic total shoulder replacement is to protect the subscapularis repair or lesser tuberosity osteotomy. 8 A functioning subscapularis is thought to be important to maintain the balance of the rotator cuff. 21 A structurally, or functionally, deficient rotator cuff can increase the shear forces and edge-loading of an anatomic shoulder replacement, increasing the likelihood of early glenoid component loosening and eventual revision surgery. 22 For this reason, a more conservative rehabilitation approach, with a period of strict shoulder immobilisation, is often recommended. 8 Denard and Lädermann 12 reported lower rates of lesser tuberosity osteotomy healing in the early rehabilitation group, but this difference did not reach statistical significance. Despite this, healing rates need to be carefully monitored in future adequately powered randomised controlled trials.
Finally, this review found that there is limited, very low-quality evidence from one RCT that early rehabilitation may result in better functional outcome scores at four weeks and eight weeks following anatomic total shoulder replacement compared to delayed rehabilitation. Point estimates were not provided in the published paper; it is therefore not known whether the between-group difference exceeded the minimum clinically important differences for the chosen outcome measures. Nonetheless, recent research has demonstrated that improved post-operative function following total shoulder replacement is associated with higher patient satisfaction. 23,24 Improved functional independence also results in reduced patient frustration as reliance on others for self-care is reduced. 20 An earlier return to function may therefore be an important outcome for patients even if the advantages of early rehabilitation are no longer visible by three months post-surgery. The impact of early rehabilitation during the first eight weeks following surgery therefore warrants further exploration in future high-quality randomised controlled trials. Understanding patients' perspectives on early post-surgical recovery through further qualitative research would also be useful.
The inclusion of only three eligible randomised controlled trials in this review highlights the paucity of evidence available to guide clinical practice and underscores the need for further research into this area. Furthermore, the heterogeneity of the interventions included in this review illustrates the clinical uncertainty surrounding rehabilitation after total shoulder replacement. This review has also demonstrated the variability in the use of shoulder-related outcome measures, which precludes statistical pooling of results. Although this might be problematic, standardisation of outcome measures is not always desirable, and outcome selection should be directed by what matters to patients and their communities.
The strengths of this systematic review include pre-publication of the study protocol; adherence to methodological guidance, including an in-depth assessment of the quality of available evidence; and the inclusion of two independent reviewers at every stage of the review. This systematic review was however limited to randomised controlled trials published in English, which limits the scope. There were also insufficient data available to allow a meta-analysis and only a small number of trials eligible for inclusion. The definitions of early and delayed rehabilitation employed in each study varied, as did the outcome measures chosen.
Future research should focus on the uncertainties surrounding the optimal time point at which to begin mobilisation of the shoulder following surgery and how long the post-operative sling should be worn. Understanding patients' perspectives on early post-surgical recovery through further qualitative research would also be useful in helping to prioritise patient-important outcomes in the early post-operative period.

Clinical messages
• This systematic review found no differences in patient-reported or clinician-reported outcomes at 12 months post-surgery between early rehabilitation and delayed rehabilitation following total shoulder replacement.
• There is some limited, very low-quality evidence of improved pain and functional outcomes in the early post-operative phase with early rehabilitation following anatomic total shoulder replacement.

Author contributions
MM and CL assessed eligibility for inclusion; MM and PG undertook data extraction; MM and GW undertook the risk of bias assessment; MM, CL and BM led the data analysis, but all authors commented. All authors contributed to and approved the final draft of the manuscript.