Yes, I know it’s not Thursday, but in my defence, I have been meaning to finish writing this for a while and did start writing it on a Thursday. What I want to do is go back and take another look at this study by Ryan et al, published in 2010. Not that I have any particular issues with the study. The issues I have are more to do with how the study has been used in social media and how much weight should be given to it in the bigger picture.
This was one of the early studies to put a question mark on the “pronation paradigm” of prescribing running shoes. Coming out in 2010, it was certainly seized upon by those promoting certain agendas at that time, especially this comment in the study abstract: “The findings of this study suggest that our current approach of prescribing in-shoe pronation control systems on the basis of foot type is overly simplistic and potentially injurious.” At the time I certainly did not have any significant issue with the findings in the study and have been including this study in my lectures since it was published. I am just not sure I wanted to buy into that last part of the conclusion: “potentially injurious”, which was certainly the focus of some loons.
What the study did was recruit 81 female runners who were categorised into three different foot posture types (39 neutral, 30 pronated, 12 highly pronated) based on the Foot Posture Index. They were randomly assigned a neutral (Nike Pegasus), stability (Nike Structure Triax) or motion control (Nike Nucleus) running shoe and then followed for 13 weeks:
I won’t get into the details, but as can be seen in the table, the data show that the prescription of running shoes based on the “pronation paradigm” was not supported. The results are not too dissimilar to several studies by Knapik et al, but those were on a military population. More recently, there is the Malisoux et al study that actually provided some support for the “pronation paradigm”. The question then becomes, in the big picture, how much weight do you give to which study? My interest is often in how these different studies are responded to in social media and the lengths some go to in deconstructing the study that does not suit their agenda, while not applying the same appraisal standards to the study that supports their agenda. Probably the worst example of that I have come across is the criticism of the Malisoux et al study because it was funded by a running shoe company. The hypocrisy is that the above Ryan et al study was also funded by a running shoe company (Nike), yet that is somehow acceptable. It is not hard to figure out why! (See this blog post: Should you trust running shoe research done or funded by Nike?), but I digress.
This is something I have been thinking about for a while. The Fragility Index is a test of how robust (or fragile) the results of a clinical trial are. It is a number that indicates how many hypothetical subjects would be needed to convert a trial from being statistically significant to not significant. For example, if a trial was statistically significant in favour of one outcome, how many hypothetical participants, all getting the other outcome, would it take for the result to no longer be statistically significant? So if a study only needs one additional hypothetical participant to change the result from statistically significant to not significant, then it is not a very robust study and the results probably should not be given much weight. However, if a study needed, say, 100 additional hypothetical participants getting the other outcome to change the results from significant to not significant, then that is obviously an incredibly robust study and a lot of weight should be given to the findings. For more on the Fragility Index see here and here (or Google it).
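For those who like to see the mechanics, here is a minimal sketch in Python of how a Fragility Index could be worked out for a simple two-group trial with a yes/no outcome. It assumes the commonly used form of the calculation, in which participants in the group with fewer events are switched one at a time from "no event" to "event" and Fisher's exact test is re-run each time until p reaches 0.05. The function names (fisher_exact_p, fragility_index) and the numbers in the example are made up for illustration; they are not from Ryan et al or any other study.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 with fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Add up every possible table that is no more likely than the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Number of outcome switches needed to lose significance.

    Returns None if the starting result is not significant, which is the
    situation where a Fragility Index cannot be calculated.
    """
    if fisher_exact_p(events_a, n_a - events_a, events_b, n_b - events_b) >= alpha:
        return None
    switch_a = events_a <= events_b  # work on the group with fewer events
    switches = 0
    while True:
        if switch_a:
            if events_a == n_a:
                return None  # nobody left to switch in this group
            events_a += 1
        else:
            if events_b == n_b:
                return None
            events_b += 1
        switches += 1
        p = fisher_exact_p(events_a, n_a - events_a, events_b, n_b - events_b)
        if p >= alpha:
            return switches


# Made-up example: 1 of 40 injured in one group versus 9 of 40 in the other
print(fragility_index(1, 40, 9, 40))
```

A small number coming out of this means only a handful of participants having a different outcome would be enough to tip the trial from significant to not significant.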
A Hypothetical Fragility Analysis of Ryan et al:
So what about a Fragility Index score of the Ryan et al study to see how robust the results are? Unfortunately and disappointingly, it’s not possible for two reasons. Firstly, the results were not statistically significant in showing differences between the groups, so you cannot add hypothetical participants getting a certain outcome to see how many might be needed to make the results non-significant. Secondly, with the current tools to determine a Fragility Index, it can only be done on two groups, whereas the Ryan et al study had three groups.
Having said that, let’s do a thought experiment that might mimic the intentions of why a Fragility Index would or could be done. Look at Table 3 from the study embedded above. Look at the number of injuries in each group. There are not many. So let’s assume that the study recruited, say, 2 more hypothetical people, and let’s hypothetically assume that they had a pronated foot, that they got randomised to the neutral shoe and that they both got an injury. Is that enough to change the lack of statistical significance of the study? I don’t know; it may well be. My point is that it would not take many more hypothetical participants with certain outcomes being added to the study for the results to be different, meaning that hypothetically the results of this study are not that robust. Hopefully, this makes sense.
This does nothing to invalidate the results of the study. It is just a hypothetical thought experiment to guide how much weight should be given to the study in the wider context of studies with differing results.
As always, I go where the evidence takes me until convinced otherwise ….