I recently read this blog post, which presents and expands on Angus Deaton and Nancy Cartwright’s recent thoughts about RCTs. It is worth a read because, in my view, it is an excellent case study of the misunderstanding and misrepresentation of the claims made in favour of RCTs. Deaton and Cartwright must take the lion’s share of the blame here, not the author of the post, though he does add some startling assertions of his own to the commentary. Here are two of many:
RCTs do not have external validity.
A key argument in favour of randomization is the ability to blind both those receiving the treatment and those administering it.
Both assertions are just plain wrong. Although the author does demonstrate a more nuanced understanding of the value of randomised trials elsewhere:
The results of RCTs must be integrated with other knowledge, including the
practical wisdom of policy makers if they are to be usable outside the context in which they were constructed.
I commented on his post, suggesting that Deaton and Cartwright’s arguments unfairly target randomised trials when the criticisms they make are equally applicable to all kinds of intervention research. The author responded as follows:
I think that this argument fails to acknowledge the single defining feature of a randomised trial and also misrepresents what is claimed for them by people who the author has called ‘randomistas’. Moreover, what Deaton considers to be the position of so called ‘randomistas’, specifically, is irrelevant (or at best thoughtless) when his criticisms are not actually of randomised trials but of all types of research.
My response to the above comment is reproduced below.
You may be correct about how Deaton views ‘randomistas’, but if so, he really needs to give examples of people claiming that the results of RCTs are superior to results obtained using other methods. I am a proud ‘randomista’ and I work with a lot of people who might be classified as such, and the idea that people like me say that the results of RCTs are always superior to alternative methods is just not a familiar one. In fact, when reading reports of RCTs it is common to find loads of caveats about the findings.
People who understand what RCTs are and what they are not know that the only unique feature of the design is that they generate comparison groups by randomly allocating cases to conditions. That’s it.
I don’t think it is controversial for ‘randomistas’ to argue that this is the best way of generating comparison groups that differ only as a result of the play of chance, rather than as a result of some systematic (non-random) characteristic. In any population there will be things that we know and can measure (so for example we could deliberately match cases based on these factors – say age, gender, or test scores). But there are also things that might be relevant that we don’t or can’t know about our participants and therefore can’t take into account when generating comparison groups. If we accept that there are things that we don’t or can’t know about our participants, then the only way around it, if you want to create probabilistically similar groups, is to use random allocation. Random allocation thus acknowledges and accounts for the limitations of our knowledge.
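The point about unknown characteristics can be illustrated with a toy sketch in Python (the population, the ‘hidden trait’, and all numbers here are invented for illustration, not drawn from any real study): random allocation splits cases into groups that, in expectation, differ only by chance, even on characteristics nobody ever measured.

```python
import random
import statistics

def randomly_allocate(cases, seed=None):
    """Shuffle cases and split them into two comparison groups."""
    rng = random.Random(seed)
    shuffled = cases[:]
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

# Simulate 1,000 participants, each with a hidden characteristic
# (say, motivation) that we could never measure or match on.
rng = random.Random(42)
population = [rng.gauss(100, 15) for _ in range(1000)]

treatment, control = randomly_allocate(population, seed=7)

# The groups differ on the hidden trait only as a result of chance.
diff = statistics.mean(treatment) - statistics.mean(control)
print(f"Difference in hidden trait between groups: {diff:.2f}")
```

Deliberate matching could only ever balance the characteristics we thought to measure; the shuffle above balances the rest probabilistically, which is precisely the claim being made for random allocation.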
So, the notion of ‘superiority’ centres on the question ‘how confident am I that the groups being compared were similar in all important known and unknown (and possibly unknowable) characteristics?’
Of course, if your research question is one that does not involve comparisons and causal description then RCTs are not appropriate. You would be hard pressed to find a ‘randomista’ arguing that you need an RCT to help understand the views or opinions of a population of interest, for example. In addition you will be unlikely to find a ‘randomista’ arguing that you need an RCT when observational studies have reported very dramatic effects. Take for example the tired old chestnut about not needing an RCT to find out if parachutes work. 99.9% of people who do not open their parachutes after jumping out of a plane die. This is a highly statistically significant finding and is extremely dramatic. There is no need to go beyond observation here.
Unfortunately for us, the effects of interventions in the social sciences are rarely so dramatic. Therefore, one key element in making causal inferences is ensuring that when we compare alternative interventions or approaches we are, in the best way we know how, comparing like with like. This means that any differences in outcome that we observe between groups can be more confidently attributed to the interventions being compared rather than to an effect of non-random differences between groups.
That’s the strength of an RCT.
I am at the BERA 2016 conference this week. For an unapologetic empiricist, it’s a bit of an odd place to be.
Yesterday I attended a small seminar at the conference, convened by BERA’s Practitioner Research Special Interest Group (SIG). There has been a lot of talk recently about the challenges faced in helping teachers to engage with research. A combination of lack of support from senior management, unhelpful writing styles, limited access to journal articles, and perfunctory or non-existent research methods courses in initial teacher education means that teachers tend not to engage fully with the research that is there to help inform what they do. This is in contrast to other, comparable, professions. For example, as a part of the everyday responsibility of being a nurse or a doctor there is the expectation that one will not only keep up to date with relevant research but be actively involved in new research. Not so in teaching.
It was, therefore, fascinating to hear at the seminar about a postgraduate degree delivered in Wales for newly qualified teachers, a part of which reflected the norms for nurses and doctors by helping these teachers to develop their research literacy and engagement. This Masters in Education Practice degree ran over three years and culminated in a research project designed and implemented by the teacher and guided by research mentors.
We were told about one such project where a teacher used an action research approach to explore a method of giving feedback to her students on their work. The project involved eight students in one class, who had been classified as having Additional Learning Needs (Wales’ equivalent of SEN), and ran for six sessions. It appeared to be a great success. The teacher was very happy with the results of her research enquiry and felt that she had tapped into a way to improve the outcomes for her students. At the end of her presentation she described what she felt were the limitations of her study and implications for further research.
I felt that this was a brilliant introduction to research for this teacher. She was clearly extremely switched on and reflective about her practice. Moreover, she demonstrated keen understanding about how her research had informed her teaching and its potential to continue to help her develop as a professional. I asked, therefore, whether she intended to build on this small scale study to explore whether it could be helpful beyond the eight SEN children with whom she had developed her hypothesis.
This is where it got odd. I suggested that it would be interesting to involve all of her classes, to divide them into two groups, give one the promising approach that she had piloted with SEN children and continue to teach the others with her usual approach, then to compare the results. At this an audible intake of breath filled the room, followed by cries of “Ethics!”. “That would be unethical!” two said in unison. “Why?”, I asked. To which the usual trope was trotted out, asserting that denying an apparently promising teaching approach to one set of children, while delivering it to another group, is ethically indefensible.
Rebuttals to this trope are not new, but it is worth going over them again with reference to this project. First, the ‘ethics criers’ appeared not to see the irony of their position, as they celebrated a research project that delivered a promising intervention to only eight of this teacher’s children while denying it to all the others. Neither did they acknowledge the double standard expressed in the idea that a teacher can deliver whatever untested approach she likes to all of her students without ethical approval, but that if she wants to try it out in only half of them, so that she has a better idea of its effects, she is acting unethically. Moreover, by implication, they express the notion that new teaching approaches are only ever positive, without acknowledging the possibility that some new teaching approaches can have negative effects, or can add nothing to what is already being done.
This raises two ethical issues. First, if we are convinced (preferably by good research) that a new teaching approach is categorically better than existing approaches, then the ‘ethics criers’ are correct: we mustn’t wilfully deny it to children who may benefit from it (for example, all the children in this teacher’s classes who were not in her action research group). Second, if uncertainty exists about the effects of a new teaching approach (such as the uncertainties expressed by this teacher when she described the limitations of her study), the only ethical course of action is to assess these effects properly. The best way to assess the effects of a new teaching approach is to compare it with an alternative.
So, to add my voice to the conversation about why teachers don’t engage in research: one possibility is that when motivated and clever young teachers are told (by the Practitioner Research SIG of the British Educational Research Association, of all things) that they are entitled to develop thoughtful educational theories but that they are not entitled to test those theories properly, we are failing to educate them about educational research and we are, therefore, enforcing an embargo on their professional development in this regard. The promise of the Masters in Education Practice to raise research literacy in teachers was fatally compromised by this peculiar attitude to what is and what is not ethical in educational research.
My guest post for NALDIC’s EAL Journal blog
Interesting developments from North Wales. Earlier this year the council decided it would shut the Denbighshire town of Ruthin’s two primary schools and merge them into one new school. One of the schools earmarked for closure is a ‘Category One’, Welsh language school. The other school is a ‘Category Two’ dual stream Welsh/English school, where parents choose either to have their children educated in English or Welsh. Apparently, 80% of parents with children currently in this school choose the Welsh stream. The proposed merged school is also set to be a Category Two school, thus removing the possibility of Welsh only education in the town. Ruthin locals fought the merger, arguing that the Category One school is necessary to protect Welsh language.
This is interesting for a number of reasons, not least because even in an area of relatively high Welsh language use (42% in Ruthin are Welsh speakers) the perceived threat from English is obviously a worry. For me this underscores the difficulty of maintaining diverse language practices in the presence of a very dominant prestige language. If Wales (which has made enormous progress in reinvigorating Welsh language education in the past 30 years or so) is worried, one can only imagine the challenge for other minority languages.
I’d be interested in seeing evidence to help understand what sort of effect the closure of Category One schools in favour of Category Two schools has on Welsh language use. The fear articulated by opponents of the merger is that “the natural dynamic [of Category Two schools] will mean that Welsh-speaking pupils will turn to English”. Denbighshire council, on the other hand, says that it believes that the Category Two school will “help generate more Welsh speakers.”
We watch with interest.
For more on Welsh language schooling in Ruthin see this post, which I wrote a while back when the head of an independent school in the area drew the ire of locals by suggesting that Welsh language schools harm the prospects of their children. By my calculations, he had his facts wrong.
Oxford Brookes University offers a Postgraduate Certificate in Education. The fully accredited course focuses on multilingual children learning in mainstream, complementary and international schools. It aims to draw on current debates, policies, practice and research around multilingual learners, to enable participants to:
- Evaluate and critically compare policies connected with the teaching and learning of the EAL/multilingual child
- Identify theories of bilingualism, translanguaging and dynamic language
- Appreciate the links between the learning and the use of different languages
- Explore identity and self-esteem: the emotional experiences of the EAL/multilingual child
- Evaluate teacher, teacher assistant, parent, and whole school responses to the EAL/multilingual child, including the use of technology
- Theorise practice and pedagogy: explore the beliefs, theories and attitudes to language and the EAL/multilingual learner which underpin teacher choices
The course is equivalent to one third of a Master’s degree, and the 60 course credits it attracts can be combined with other modules to build a full MA.
It can be attended online or face to face.
Read more about the course in the attached flyer here.
The Multilingual Learners in Context symposium at Oxford Brookes University on Saturday was an excellent bringing together of academics and educators from related but importantly different fields, under the umbrella discipline of teaching multilingual learners. The order of events allowed a narrative to unfold over the course of the day that revealed common themes which were revisited and enriched as we heard about them from the perspectives of mainstream schooling, community schooling and the international sector. Victoria Murphy and Therese Hopfenbeck, both of the University of Oxford, bookended the day with discussions of quantitative data that described achievement of multilingual learners in the UK, the character and extent of controlled intervention studies pertaining to EAL learners’ education, and international comparisons of literacy attainment. Murphy ended her talk by impressing on us four key takeaways: 1) more research on EAL in the UK is needed, 2) more funding is necessary for the type and scale of research necessary for us to really know what works, 3) understanding of appropriate pedagogy for EAL students needs to be better integrated into Initial Teacher Education, and 4) we need to abolish the monolingual hegemony. Whether by design or as a result of happy coincidence, these themes would recur throughout the day from very different starting points.
We heard from Ana Souza and Jane Spiro of Oxford Brookes University about the importance of grassroots advocacy, from London to Hawaii, for building and defending provision for appropriate education of multilingual learners. Both Spiro and Souza described the critical role of promotion of all aspects of a multilingual person’s self – their language, their history, their culture – in developing linguistic and inter-cultural competences, and the benefits of raising the visibility of multilingualism generally. As well as the promotion of inter-cultural competence through community schools, Souza described a kind of organic, student initiated, development of meta-linguistic competence as a result of attending these kinds of schools. This was an important acknowledgement of both the social justice aspect of community schooling and its academic utility. Spiro described the Hawaiian situation as a forty-year work in progress, underscoring the time and commitment needed for change to happen.
Segueing perfectly from this thought, Peeter Mehisto described his work (in more countries than I can remember) on bringing together stakeholders to improve the educational lot of multilingual learners. In a serendipitous call back to Murphy’s talk, Mehisto emphasised the need for good research reviews that summarise what we know about educating multilingual learners and argued that these should be available to all stakeholders. He argued that buy-in from stakeholders is contingent on unbiased understanding of what we know about multilingual education. He also described how crucial the support from the upper echelons of power can be. A sympathetic ear at City Hall from the outset pays dividends when multilingual projects hit the inevitable road bumps caused by forces working against them, whether through wilful blindness or self-interested power, he said. Mehisto touched on ‘monolingual disadvantage’ in another related call back to Murphy’s four key takeaways, emphasising that multilingual education should not be reserved for high achievers or members of the ‘elite’, nor used as a means to suppress L1. Among a number of mechanisms necessary to move this agenda forward he proposed shining a light on incompetence – wherever it lurks – as key. This put me in mind of Louis Dembitz Brandeis’ 1913 contention that “Sunlight is said to be the best of disinfectants; electric light the most efficient policeman.”
Drawing on the theme of stakeholders and the importance of what goes on in the homes of multilingual learners, Raymonde Sneddon described work she was engaged in, developing bi-literacy through dual language books and by helping children to create personal texts, inspired by Cummins’ ‘Identity Texts’. She said this started as an attempt to get languages acknowledged in the classroom, but quickly developed into a deeper exploration of bi-directional transfer of literacy skills and deeper understanding of texts read in two languages. Self-report from parents and children suggested that this work had positive social and academic implications.
Oksana Afitska’s presentation on the use of L1s in promoting learning for multilingual learners used concrete illustrations to emphasise that if a student cannot demonstrate their knowledge using English, this doesn’t mean that they do not possess that knowledge. She showed some helpful examples from SATs that, if considered only in terms of the official marking scheme, would lead one to conclude that the children knew very little about the topics being assessed. Looking through an L1 lens, however, reveals a different picture. Her presentation allowed for some useful discussion about whether there have been any empirical demonstrations of the value of translanguaging as a pedagogic tool. For me, this discussion reinforced the notion that translanguaging has neither been sufficiently well defined, nor has it been sufficiently well assessed as a pedagogic tool for us to say anything particularly conclusive about it.
Cathie Wallace, of the IoE at UCL, framed her talk with the need to provide enriched and expanded text worlds for multilingual learners. She used case studies of pupils in London, describing their engagement with texts on their journeys to becoming literate users of English. She discussed ‘relevance’ and ‘resonance’, citing Arthur Miller’s A View from the Bridge as a text that, while on the face of it lacking relevance, actually resonated very well with some of the young learners with whom she worked. This brought to mind the recent debate about the appropriateness of the texts used in the KS2 SATs. These texts, some have argued, would precipitate success only for ‘white, middle-class girls’, such was their non-relevance to other demographic groups. In the light of Wallace’s differentiation between relevance and resonance, I feel this is a debate worth reviewing.
Therese Hopfenbeck of the University of Oxford described PIRLS (Progress in International Reading Literacy Study), an assessment programme not dissimilar to PISA that is conducted every four years to provide data about how well children in different countries are doing in reading after four years of schooling. Her talk underscored for me the importance of teacher engagement in research for making it relevant, understandable and useful/useable. This is why I feel that we were lucky to have a small number of teachers from local schools present at the symposium, contributing to the discussion and helping to ensure that the end users in all of this were represented.
In sum, the cross-disciplinary nature of the event, the contribution of teachers, advisors, and academics and the broad diet of research approaches made the symposium an excellent educational experience for all in attendance. As a researcher who leans towards intervention studies as a preference, I found hearing first-hand about research which uses other approaches fascinating. Moreover, it helped me to better contextualise the world of research and provision for multilingual learners that I inhabit. This learning was reciprocal, and I was delighted to hear one presenter (more familiar with the collection of qualitative data) say how great it was to learn about the data presented in Murphy’s and Hopfenbeck’s presentations as “we never get a chance to see it normally”.
My one regret about the day was the difficulty we had in making local teachers aware of it and its relevance to them, and encouraging them to attend. This is something we will need to work on for next time. However, the symposium was billed as an opportunity to examine the commonalities and peculiarities of differing multilingual contexts, exchange knowledge, and consider ways forward in meeting the needs of multilingual learners, with the potential to radically shape the way we think about and deliver effective provision for our language learners. From my perspective, I’d say mission accomplished.
The event was videoed and, subject to a bit of administration, should be available soon. I will post links when I have them.
*This post has been updated from the original to clean up some typos and rephrase some clumsy passages.
Jeremy Corbyn used all six of his questions at PMQs on Wednesday 20th April to take David Cameron to task over the planned forced academisation of schools. It was a good debate and Corbyn was dogged in his pursuit of answers to the question of why the government is taking this approach against the wishes of much of the profession and many of Cameron’s own MPs.
It’s worth watching the whole exchange, and you can do so here.
In the course of it, it struck me that Cameron could do with some advice on basic logical fallacies and statistics. So in that spirit I have isolated two mistakes Dave made that he, and we, should be aware of.
First is the base-rate fallacy.
Converter academies initially were those that were already rated ‘outstanding’. These were pre-approved for conversion, a dispensation that was quickly afforded to ‘good’ schools as well. A House of Commons briefing paper on academies puts it like this:
“[Pre-approval] has been extended to all schools that are deemed as ‘performing well’” p3
If Cameron is suggesting that it was the act of conversion (and what followed) that led these schools to become “either good or outstanding” then he is fooling himself, or trying to fool us.
That same HoC briefing paper goes on to note that:
“Analysis of 2013 exam results appears to show more progress amongst converter academies than all non-academy schools, especially among the very first converters, that became academies in 2009/10. These schools were all rated ‘outstanding’ by Ofsted at the time, so greater progress made in 2013 might be better explained by pre-existing differences rather than the impact of academy status.” p7-8
Moreover, if only schools that were good or outstanding were allowed to convert, what are we to make of the 12% of converter academies that are not now considered good or outstanding?
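The selection effect behind this fallacy can be sketched in a few lines of Python (a toy model with invented numbers, not an analysis of real school data): if only the best-rated schools are allowed to convert, converters will outperform non-converters even when conversion itself does nothing at all.

```python
import random
import statistics

rng = random.Random(3)

# 1,000 schools with a latent quality score. In this model,
# conversion has NO effect whatsoever on quality.
quality = [rng.gauss(0, 1) for _ in range(1000)]

# Only the top quartile are "pre-approved" to convert.
cutoff = sorted(quality)[750]
converters = [q for q in quality if q >= cutoff]
others = [q for q in quality if q < cutoff]

# Converters look better purely because of how they were selected.
print(f"converters: {statistics.mean(converters):.2f}, "
      f"others: {statistics.mean(others):.2f}")
```

Attributing the converters’ higher average to conversion, rather than to the entry criterion, is exactly the inference the briefing paper warns against.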
Which brings us to the second of Dave’s misunderstandings: regression to the mean.
Any variable that is extreme on first measurement will tend toward the average (the mean) on its second measurement. Daniel Kahneman describes this statistical phenomenon very well in his book Thinking, Fast and Slow. He does so especially clearly in his report of a flight instructor who noticed that every time he bollocked a trainee fighter pilot for an extremely bad manoeuvre, the trainee subsequently improved. By contrast, every time he praised one for an extremely good manoeuvre the trainee subsequently got worse. This instructor concluded that praise was useless and bollocking was an important pedagogical tool. In fact what he was witnessing was regression to the mean (and nothing to do with the effects of his approach to critical feedback). Extremely good manoeuvres are extremely rare, as are extremely bad ones. The only place to go from an extreme position is back towards average. These pilots were very unlikely to maintain their either awesome or appalling behaviour in the air and would naturally tend back towards more average behaviour the next time they went out in their planes. This made some look worse and others look better.
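The flight-instructor story is easy to reproduce in a toy simulation (the skill-plus-noise model and all numbers are invented for illustration): pick out the worst performers on one occasion and, with no feedback of any kind, their average improves on the next.

```python
import random
import statistics

rng = random.Random(0)

# Each pilot's observed score = stable skill + session-to-session noise.
n_pilots = 10_000
skills = [rng.gauss(0, 1) for _ in range(n_pilots)]
first = [s + rng.gauss(0, 1) for s in skills]
second = [s + rng.gauss(0, 1) for s in skills]

# Pick the worst 5% of first-session performances...
cutoff = sorted(first)[int(0.05 * n_pilots)]
worst = [i for i, score in enumerate(first) if score <= cutoff]

before = statistics.mean(first[i] for i in worst)
after = statistics.mean(second[i] for i in worst)

# ...their second-session average drifts back toward the overall mean,
# with no bollocking or praise involved.
print(f"worst group: first session {before:.2f}, second session {after:.2f}")
```

The ‘improvement’ falls out of the arithmetic alone: an extreme first score is usually part bad skill and part bad luck, and only the bad luck fails to repeat.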
This phenomenon is just as true for schools. Ignore for the moment the woefully inadequate judge of school effectiveness that we call Ofsted, and think about the likelihood that a school will be in ‘special measures’. According to the 2013/14 report from the Chief Inspector of Schools, only 3% of schools across all state maintained sectors were judged to be inadequate. This is an extreme value. Any school that is judged as inadequate has only one direction to move in. Inevitably it will move in that direction, regardless of whether it is in the hands of an academy or not.
Perhaps if Dave looked at the other figures for inadequate schools in that document he might be more inclined to learn about statistical analysis of data. As I have said, the national average for schools judged inadequate over all sectors is 3%. The proportion of inadequate academies (taking converter and sponsor-led together) is 10% of primary school academies, 17% of secondary school academies, 5% of pupil referral unit academies, and a whopping 42% of special school academies.
Should Corbyn point out Cameron’s misunderstandings of basic logic and stats next time this discussion comes up? I think he should.