Comments on Blix et al. ST waveform analysis vs. cardiotocography alone

Abstract – Blix et al. ST waveform analysis vs. cardiotocography alone for intrapartum fetal monitoring: A systematic review and meta-analysis of randomized trials. Acta Obstet Gynecol Scand. 2015 Nov 26. doi: 10.1111/aogs.12828. (Epub ahead of print)

Some topics in obstetrics are hotter than others, and we are in the remarkable situation that there are now more meta-analyses (MAs) than randomized controlled trials (RCTs) on the evidence for using fetal ECG ST analysis for fetal surveillance in labor. Previously, six MAs (Neilson, 2012; Becker et al., 2012; Potti & Berghella, 2012; Salmelin et al., 2013; Schuit et al., 2013; Olofsson et al., 2014) on five RCTs (Westgate et al., 1993; Amer-Wåhlin et al., 2001; Ojala et al., 2006; Vayssière et al., 2007; Westerhuis et al., 2010) have been published, and after publication of the sixth RCT (Belfort et al., 2015), Blix et al. have presented the seventh MA comparing cardiotocography (CTG) plus ST analysis (STAN) with CTG alone (Blix et al., 2015).

However, the MA by Blix et al. (2015) differs from previously published MAs in that it is a sequential MA, where each included trial is regarded as an interim MA and the risks of type I and type II errors are controlled (Copenhagen Trial Unit, 2011). A trial sequential analysis combines the cumulated sample sizes of all included RCTs to estimate the threshold of statistical significance, thereby quantifying the statistical reliability by adjusting significance levels for sparse data and repetitive testing on accumulated data (Copenhagen Trial Unit, 2011). It also provides information regarding the need for additional trials and the required sample size for such trials. In those respects, the MA by Blix and co-workers is welcome.

In their meta-analysis including more than 26,000 deliveries, Blix et al. (2015) found that STAN monitoring results in a significant 41% reduction in the need for fetal scalp blood sampling, a significant 8% reduction in overall operative deliveries, and a significant 36% reduction in the rate of metabolic acidosis. The quality of evidence was high to moderate. Interestingly, these figures are similar to those presented by Olofsson et al. in a recent critical reappraisal of the first five MAs on STAN vs. CTG alone (Olofsson et al., 2014), where the corresponding figures were significant reductions of 36%, 7% and 39%, respectively. A vast majority of obstetricians would regard these figures as favorable to the STAN methodology, but not Blix and co-workers. They argue that the use of relative effect sizes gives a misleading impression of the magnitude of the effect, and hence they dismiss the 8% reduction in operative delivery rate as being of minor importance. We disagree. Avoiding 8% of operative deliveries would mean several hundred fewer operations, even in countries with a tradition of fairly low intervention rates, such as those of Scandinavia.
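
The arithmetic behind this disagreement can be sketched briefly. The example below converts a relative reduction into absolute terms. Only the 8% relative reduction is taken from the text; the annual number of deliveries and the baseline operative delivery rate are hypothetical values chosen for illustration, not figures from Blix et al. (2015) or from any national registry.

```python
def absolute_effect(deliveries, baseline_rate, relative_reduction):
    """Translate a relative reduction into absolute terms.

    Returns (absolute risk reduction, events avoided, number needed to monitor).
    """
    arr = baseline_rate * relative_reduction  # absolute risk reduction
    avoided = deliveries * arr                # expected events avoided
    nnt = 1 / arr                             # labors monitored per event avoided
    return arr, avoided, nnt


# HYPOTHETICAL inputs, used only to illustrate the calculation.
DELIVERIES_PER_YEAR = 50_000     # assumed annual number of deliveries
BASELINE_OPERATIVE_RATE = 0.15   # assumed operative delivery rate with CTG alone
RELATIVE_REDUCTION = 0.08        # the 8% relative reduction quoted in the text

arr, avoided, nnt = absolute_effect(
    DELIVERIES_PER_YEAR, BASELINE_OPERATIVE_RATE, RELATIVE_REDUCTION
)
print(f"Absolute risk reduction: {arr * 100:.1f} percentage points")
print(f"Operative deliveries avoided per year: {avoided:.0f}")
print(f"Labors monitored per operative delivery avoided: {nnt:.0f}")
```

Under these assumed inputs, the 8% relative reduction corresponds to about 1.2 percentage points and roughly 600 fewer operations per year. The relative and absolute figures describe the same effect; the absolute number simply scales with the baseline rate and the number of deliveries.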

The 36% reduction in metabolic acidosis is dismissed by Blix et al. with the argument that the relation between metabolic acidosis and harder endpoints such as perinatal and neonatal death, neonatal encephalopathy, seizures and neurological sequelae like cerebral palsy, is questionable. As in a previous commentary (Blix & Øian, 2014), they cite a Swedish follow-up study where 89.4% of term newborns with metabolic acidosis at birth were alive without any neurological sequelae at age 6.5 years (Hafström et al., 2012). However, they disregard that 10.6% of acidotic neonates were dead or neurologically handicapped, compared to 3.3% in the control group (Hafström et al., 2012). We are well aware that it is the combination of metabolic acidosis and neonatal encephalopathy that carries the high risk (Hafström et al., 2012) and that normal cord blood gases are not incompatible with hypoxic brain injury (Schifrin et al., 2015), but it is metabolic acidosis and not encephalopathy that is positively and promptly diagnosed at birth. Authors Blix and Øian have themselves quite recently used metabolic acidosis as an outcome parameter in clinical research (Bernitz et al., 2011), and when they now write that “there is a known relationship between low cord artery pH values and serious outcome” but then “discourage excessive emphasis on the positive results [of STAN monitoring] for metabolic acidosis” (Blix et al., 2015), they end up in a vicious circle of argumentation. It is an indisputable fact that the majority of childhood neurological handicaps are not caused by events during birth (Rei et al., 2015) and therefore cannot be prevented by any method for intrapartum fetal surveillance, and that disasters occurring before the STAN monitor is connected can be neither diagnosed nor prevented by fetal ECG ST analysis.
It is also an indisputable fact that a fetus suffering a severe oxygen deficit will develop hypoxia, will eventually switch to anaerobic metabolism and develop metabolic acidosis, and will run an increased risk of irreversible injury. It should be beyond misunderstanding that STAN monitoring is for the benefit of those fetuses developing hypoxia in labor, and not for fetuses that are already neurologically injured or whose vitality is impaired for reasons other than hypoxia.
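
To make the comparison of the two follow-up figures explicit, the short sketch below computes the risk ratio and the risk difference from the two percentages quoted above (10.6% vs. 3.3%). It uses only those two numbers; the group sizes are not quoted here, so no confidence interval can be derived from this calculation, and the function name is ours.

```python
def compare_risks(risk_exposed, risk_control):
    """Return (risk ratio, risk difference) for two proportions."""
    ratio = risk_exposed / risk_control
    difference = risk_exposed - risk_control
    return ratio, difference


# Proportions dead or neurologically handicapped at follow-up, as quoted in the text.
ACIDOSIS_RISK = 0.106   # newborns with metabolic acidosis at birth
CONTROL_RISK = 0.033    # control group

ratio, difference = compare_risks(ACIDOSIS_RISK, CONTROL_RISK)
print(f"Risk ratio: {ratio:.1f}")
print(f"Risk difference: {difference * 100:.1f} percentage points")
```

On the quoted figures, the risk of death or neurological handicap was about three times higher in the acidotic group, an absolute difference of roughly 7 percentage points.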

Blix et al. used the GRADE (Grading of Recommendations Assessment, Development and Evaluation) tool to evaluate the quality of evidence in their MA. GRADE is an evaluation system designed to assess the strength of evidence for a method, to create an “evidence profile”, and to recommend how to use the method (Roback & Carlsson, 2009). The approach is structured, but the recommendation on how to use the method rests largely on the researchers’ subjective judgements and beliefs (Roback & Carlsson, 2009). Recommendations based on a GRADE assessment are therefore largely expert statements, and such a statement depends heavily on the composition of the GRADE committee (Roback & Carlsson, 2009). In that respect, authors Blix and Øian are known for their skepticism toward STAN (Blix & Øian, 2014; Øian & Blix, 2014), but we do not know the objectivity of the other three authors of the article.

Since the GRADE recommendation is based on the evidence profile, it does not take into account the variations in norms, culture, economy and other preferences in different societies (Roback & Carlsson, 2009). In contrast to Blix and co-workers’ strong and universal advice against the use of STAN (Blix et al., 2015), the GRADE methodology does not to any great extent facilitate the development of either unequivocal or customized recommendations (Roback & Carlsson, 2009). For that reason, a GRADE evaluation might not be valid in different societies unless it is performed nationally or regionally (Roback & Carlsson, 2009). For example, the Swedish Council on Technology Assessment in Health Care uses the GRADE system only to develop an evidence profile and not for recommendations (SBU, 2014). Moreover, the evidence profile in GRADE should include different types of studies and show how the method performs in the clinical situation, i.e. data from observational studies should be included (Roback & Carlsson, 2009), but Blix et al. included only RCTs and not observational studies in their MA. There are several clinical studies showing beneficial effects of STAN (Welin et al., 2007; Kale et al., 2008; Norén & Carlsson, 2010; Kessler et al., 2013), and Blix et al. admit that “it is possible that the inclusion of well-designed observational studies could lead to more decisive conclusions” (Blix et al., 2015). It is therefore of utmost importance that confidence in an expert statement is not undermined by selective use of evidence, or by experts biased for or against a method. Different stakeholders must be involved in the GRADE process (Roback & Carlsson, 2009), but we find the recommendations by Blix et al. to be neither based on complete data nor unbiased.

REFERENCES

Amer-Wåhlin I, Hellsten C, Norén H, Hagberg H, Herbst A, Kjellmer I, et al. Cardiotocography only versus cardiotocography plus ST analysis of fetal electrocardiogram for intrapartum fetal monitoring: a Swedish randomised controlled trial. Lancet. 2001;358:534–8.

Becker JH, Bax L, Amer-Wåhlin I, Ojala K, Vayssière C, Westerhuis MEMH, et al. ST analysis of the fetal electrocardiogram in intrapartum fetal monitoring. A meta-analysis. Obstet Gynecol. 2012;119:145–54.

Belfort MA, Saade GR, Thom E, Blackwell SC, Reddy UM, Thorp JM Jr, et al. for the Eunice Kennedy Shriver National Institute of Child Health and Human Development Maternal–Fetal Medicine Units Network. A randomized trial of intrapartum fetal ECG ST-segment analysis. N Engl J Med. 2015;373:632–41. doi: 10.1056/NEJMoa1500600

Bernitz S, Rolland R, Blix E, Jacobsen M, Sjøborg K, Øian P. Is the operative delivery rate in low-risk women dependent on the level of birth care? A randomised controlled trial. BJOG. 2011;118:1357–64.

Blix E, Øian P. Deviations from STAN guidelines are frequent but results cannot be excluded when the effectiveness of the method should be evaluated. Acta Obstet Gynecol Scand. 2014;93:589.

Blix E, Brurberg KG, Reierth E, Reinar LM, Øian P. ST waveform analysis vs. cardiotocography alone for intrapartum fetal monitoring: A systematic review and meta-analysis of randomized trials. Acta Obstet Gynecol Scand. 2015, doi: 10.1111/aogs.12828 [Epub ahead of print].

Copenhagen Trial Unit 2011. User Manual for TSA. http://www.ctu.dk/tsa/files/tsa_manual.pdf (Accessed December 6, 2015).

Kale A, Chong Y-S, Biswas A. Effect of availability of fetal ECG monitoring on operative deliveries. Acta Obstet Gynecol Scand. 2008;87:1189–93.

Kessler J, Moster D, Albrechtsen S. Intrapartum monitoring of high-risk deliveries with ST analysis of the fetal electrocardiogram: an observational study of 6010 deliveries. Acta Obstet Gynecol Scand. 2013;92:75–84.

Neilson JP. Fetal electrocardiogram (ECG) for fetal monitoring during labour. Cochrane Database Syst Rev. 2012;(4):CD000116.

Norén H, Carlsson A. Reduced prevalence of metabolic acidosis at birth: an analysis of established STAN usage in the total population of deliveries in a Swedish district hospital. Am J Obstet Gynecol. 2010;202:546.e1–7.

Ojala K, Vääräsmäki M, Mäkikallio K, Valkama M, Tekaya A. A comparison of intrapartum automated fetal electrocardiography and conventional cardiotocography – a randomised controlled study. BJOG. 2006;113:419–23.

Olofsson P, Ayres-de-Campos D, Kessler J, Tendal B, Yli BM, Devoe L. A critical appraisal of the evidence for using cardiotocography plus ECG ST interval analysis for fetal surveillance in labor. Part II: the meta-analyses. Acta Obstet Gynecol Scand. 2014;93:571–86.

Potti S, Berghella V. ST waveform analysis versus cardiotocography alone for intrapartum monitoring: a meta-analysis of randomized trials. Am J Perinatol. 2012;29:657–64.

Rei M, Ayres-de-Campos D, Bernardes J. Neurological damage arising from intrapartum hypoxia/acidosis. Best Pract Res Clin Obstet Gynaecol (2015), http://dx.doi.org/10.1016/j.bpobgyn.2015.04.011 [Epub ahead of print].

Roback K, Carlsson P. Evidensgraderingssystemet GRADE. Ett sätt att granska vetenskaplig kunskap om metoder och arbetssätt i hälso- och sjukvården. CMT Rapport 2009:4. http://liu.diva-portal.org/smash/get/diva2:297894/FULLTEXT01.pdf (Accessed December 6, 2015).

Salmelin A, Wiklund I, Bottinga R, Brorsson B, Ekman-Ordeberg G, Eneroth Grimfors E, et al. Fetal monitoring with computerized ST analysis during labor: a systematic review and meta-analysis. Acta Obstet Gynecol Scand. 2013;92:28–39.

SBU. Utvärdering av metoder i hälso- och sjukvården: En handbok. 2 uppl. Stockholm: Statens beredning för medicinsk utvärdering (SBU); 2014. http://www.sbu.se/upload/ebm/metodbok/sbushandbok.pdf (Accessed December 6, 2015).

Schifrin BS, Soliman M, Koos B. Litigation related to intrapartum fetal surveillance. Best Pract Res Clin Obstet Gynaecol (2015), http://dx.doi.org/10.1016/j.bpobgyn.2015.06.007 [Epub ahead of print].

Schuit E, Amer-Wåhlin I, Ojala K, Vayssière C, Westerhuis MEMH, Marsál K, et al. Effectiveness of electronic fetal monitoring with additional ST analysis in vertex singleton pregnancies beyond 36 weeks of gestation: an individual participant data meta-analysis. Am J Obstet Gynecol. 2013;208:187.e1–13.

Vayssière C, David E, Meyer N, Haberstich R, Sebahoun V, Roth E, et al. A French randomized controlled trial of ST-segment analysis in a population with abnormal cardiotocograms during labor. Am J Obstet Gynecol. 2007;197:299.e1–6.

Welin A-K, Norén H, Odeback A, Andersson M, Andersson G, Rosén KG. STAN, a clinical audit: the outcome of 2 years of regular use in the city of Varberg, Sweden. Acta Obstet Gynecol Scand. 2007;86:827–32.

Westerhuis MEMH, Visser GHA, Moons KGM, van Beek E, Benders MJ, Bijvoet SM, et al. Cardiotocography plus ST analysis of fetal electrocardiogram compared with cardiotocography only for intrapartum monitoring. A randomized controlled trial. Obstet Gynecol. 2010;115:1173–80.

Westgate J, Harris M, Curnow JSH, Greene KR. Plymouth randomized trial of cardiotocogram only versus ST waveform plus cardiotocogram for intrapartum monitoring in 2400 cases. Am J Obstet Gynecol. 1993;169:1151–60.

Øian P, Blix E. Scarce scientific evidence for the use of cardiotocography plus fetal ECG ST interval analysis (STAN). Acta Obstet Gynecol Scand. 2014;93:570.
