How can you find out whether a study is any good? This question may sound odd at first because it is often assumed that every study provides new and usable findings. But unfortunately it isn’t that easy: A lot of studies do not provide reliable information.
So it is important to critically evaluate every study. This can be done in a systematic review that analyzes all the available studies on a specific medical issue.
In order to assess whether the results of a study are reliable, you first have to find out why the study was done in the first place and which questions it tried to answer. This may sound trivial, but it is crucial if you want to determine whether the study can actually provide any answers to the original research question. For instance, many studies compare a new medication with a placebo (a dummy medication with no active ingredient). But if there is already an effective treatment for the medical condition in question, the new medication is usually compared with that treatment instead. After all, it is important for patients to know which treatment is most likely to work.
When assessing a study, the next step is to see whether the methods used are suitable for answering the research question, whether the study was carried out properly, and whether there were any systematic errors (bias) that could distort the results of the study.
Important questions to ask when assessing the quality of a study:
How were the participants approached and selected? Who was included in the study and who was excluded from the study? For example, people who have several medical conditions at the same time are often excluded from studies. As a result, it might not be possible to apply the study outcomes to patients in the “real world” who have several medical conditions at the same time.
Was the researchers’ description of how they carried out the study detailed and understandable enough for others to be able to repeat the study and verify the results?
Were there enough participants in the study to be able to answer the research question? When treatments are compared, there are nearly always small differences between their outcomes. Scientists then work out the likelihood that these differences could be due to chance rather than being true differences. Here it is important to know exactly how different the outcomes were and how many people participated in the study: The smaller the difference, the more participants are needed in order to consider the difference to be “real.”
Did the study last long enough? To find out whether, for example, a certain weight loss diet is effective, the participants’ weight should be checked again six months or one year after the end of the study – perhaps even after a longer period of time.
How many participants dropped out of the study, and why? How many participants could no longer be monitored after the end of the main part of the study (“lost to follow-up”), and why not? Good studies should include these figures and say whether they influenced the outcomes. This may be the case, for instance, if a lot of people drop out of a study due to bad side effects.
Apart from receiving the different treatments that were being compared in the study, were the groups otherwise treated the same? Differences are especially likely if it wasn’t possible to properly “blind” the doctors or participants, that is, to keep them from knowing who received which treatment.
Was it really a fair comparison? It could be a problem, for instance, if a new medication was compared with a standard medication used at a lower dose than usual in daily practice.
Was the success of treatment measured in the same way in both groups? For example, if the results of a blood test were used in one group, but both a blood test and an X-ray were used in the other group, that difference alone could change the outcome.
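The point made above about the number of participants (the smaller the difference, the more participants are needed) can be illustrated with a rough calculation. The following is only a sketch, not the method of any particular study: it assumes that two success rates are being compared, and it uses a common textbook approximation based on the normal distribution. The function name `participants_per_group`, the 5% significance level and the 80% power are illustrative assumptions that do not come from this article.

```python
from math import ceil
from statistics import NormalDist


def participants_per_group(rate_a, rate_b, alpha=0.05, power=0.80):
    """Approximate number of participants needed in EACH group to detect
    the difference between two success rates (two-sided test).

    Uses the common normal-approximation formula for two proportions.
    alpha and power are conventional defaults, not values from the article.
    """
    if not (0 < rate_a < 1 and 0 < rate_b < 1):
        raise ValueError("rates must lie strictly between 0 and 1")
    if rate_a == rate_b:
        raise ValueError("rates must differ")

    # Standard normal quantiles for the significance level and the power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)

    # Combined variance of the two groups and the difference to detect.
    variance = rate_a * (1 - rate_a) + rate_b * (1 - rate_b)
    difference = rate_a - rate_b

    return ceil((z_alpha + z_power) ** 2 * variance / difference ** 2)


if __name__ == "__main__":
    # Hypothetical success rates: standard treatment 50%, new treatment lower.
    for rate_b in (0.30, 0.40, 0.45):
        n = participants_per_group(0.50, rate_b)
        print(f"50% vs {rate_b:.0%}: about {n} participants per group")
```

Running the example shows the required group size growing quickly as the two rates move closer together. Real studies typically refine such estimates, for instance to allow for participants who drop out.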
In order to assess the quality of randomized controlled trials (RCTs), the following information is also needed:
How were the groups randomized? Were the participants really randomly assigned to the different groups, or did something influence their selection?
Where possible, did the researchers make sure that neither the study participants nor the doctors knew who was in which treatment group (blinding)?
Did all of the participants stay in the group they were originally assigned to throughout the study? If not, it is no longer possible to make a fair comparison between the groups at the end of the study.
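To make the idea of random assignment more concrete, here is a minimal sketch of one common approach, known as block randomization. It is an illustration only and not a description of how any specific study was carried out. The function name `block_randomize`, the group labels and the block size of four are hypothetical choices. In a real trial the resulting list would also have to be kept hidden from the people enrolling participants, so that nothing can influence who ends up in which group.

```python
import random


def block_randomize(participant_ids, block_size=4, seed=None):
    """Assign participants to "treatment" or "control" in shuffled blocks.

    Within every complete block, half of the participants go to each group,
    so the group sizes stay balanced as enrollment proceeds.
    """
    if block_size < 2 or block_size % 2 != 0:
        raise ValueError("block_size must be a positive even number")

    rng = random.Random(seed)  # a fixed seed makes the list reproducible
    allocation = {}
    block = []

    for participant in participant_ids:
        if not block:
            # Start a new block with equal numbers of each group, then shuffle.
            block = ["treatment", "control"] * (block_size // 2)
            rng.shuffle(block)
        allocation[participant] = block.pop()

    return allocation


if __name__ == "__main__":
    for participant, group in block_randomize(range(1, 9), seed=2024).items():
        print(f"Participant {participant}: {group}")
```

Because each participant's group depends only on the shuffled block and not on anything about the person, neither the researchers nor the participants can steer who receives which treatment.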
IQWiG health information is written with the aim of helping people understand the advantages and disadvantages of the main treatment options and health care services.
Because IQWiG is a German institute, some of the information provided here is specific to the German health care system. The suitability of any of the described options in an individual case can be determined by talking to a doctor. informedhealth.org can provide support for talks with doctors and other medical professionals, but cannot replace them. We do not offer individual consultations.
Our information is based on the results of good-quality studies. It is written by a team of health care professionals, scientists and editors, and reviewed by external experts. You can find a detailed description of how our health information is produced and updated in our methods.