Friday, July 28, 2023

What's wrong with science?

I think I really need a holiday. 

Many of us are researchers because, in some way or another, we want to make science better. Yet, we rarely keep this goal explicitly in mind when planning a specific project. If we did, what would such a research project look like? This seemingly simple question sent me on a downward spiral: What could I do that might really make a difference? Where do I want to make a difference? And why? And what is good science, anyway? Or science, for that matter? What is the purpose of it all? What am I doing with my life?

I spent Thursday afternoon ("Is it Friday yet?") quizzing my new friend, ChatGPT. Although ChatGPT was reluctant to answer the question "What am I doing with my life?", we had some interesting discussions about science and everything that's wrong with it. Setting aside existential angst, the three relevant questions are: (1) What is (good) science? (2) What are some aspects that still need improvement? (3) How do the solutions proposed in the current discussions on improving science relate to the aspects that need improvement? 

To summarise ChatGPT's response to the first question (phrased as "What is the aim of science?"), it boils down to a list of eight goals:

  1. Explanation
  2. Prediction 
  3. Understanding causality
  4. Falsifiability
  5. Reproducibility
  6. Continuous improvement (self-correction)
  7. Application and innovation
  8. Unification of knowledge.

Some of these points may be contentious (is prediction without explanation really science?), but overall, it sounds at least reasonable. 

As a next step, I asked the less nuanced question: "What's wrong with science?" Again, ChatGPT provided a list of eight items:

  1. Reproducibility crisis
  2. Publication bias
  3. p-hacking and cherry picking
  4. Funding and conflicts of interest
  5. Lack of diversity and inclusivity
  6. Ethical concerns
  7. Hypercompetitiveness and pressure to publish
  8. Miscommunication and sensationalism.

So it seems that my social media bubble is representative of a broader population - or, in any case, of ChatGPT's training data. All of these are important challenges that need to be addressed. For an ambitious researcher trying to make the world a better place, the question remains: Which gaps have not been addressed yet? 

Broadly, the aims of science according to ChatGPT can be divided into methodological/technical and theoretical aspects. Reproducibility, self-correction, and application and innovation fall into the former category. There are clearly things that are wrong at this technical level: the reproducibility crisis, publication bias, p-hacking, funding and conflicts of interest, pressure to publish, and miscommunication all relate to it. To put it bluntly: Given these issues, when one reads about a given finding, one is simply not sure whether it can be trusted. Without a doubt, this is the first general problem that needs to be tackled: I'm a firm believer in never trying to explain something unless one is sure that there is something to be explained (see my first ever blogpost).

Having replicable, reproducible, robust, and generalisable effects is still a far cry from achieving the more theoretical aims of science. Sure, knowing that two variables correlate is useful for prediction, but the correlation alone tells us nothing about explanation or causality, nor does it allow for a unification of knowledge. A lack of diversity and inclusivity especially prevents us from achieving the goal of unification of knowledge, because it excludes many differing perspectives from the scientific discourse. Ethical concerns are an issue at a more basic level - these should be considered even before asking questions about the methodological or technical aspects of a study. This still leaves us with a gap, though, between having a robust finding and making sense of it.

Of course, linking results to theory is not a novel question. Just in the last few months, I've come across this preprint by Lucy D'Agostino McGowan et al. and this blogpost by Richard McElreath. Still, seeing how we do science in real life, I see room for improvement on this front. It's relatively easy to provide easy-to-follow rules for showing that your finding is credible (or at least easy-to-follow in principle, if you have unlimited resources). It's much harder for the less tangible question of linking your finding to an explanation.

The good news is: My summer holiday is starting next week. The bad news is: I'll probably spend it pondering and researching all of these questions.

Saturday, July 8, 2023

A rant on bilingual books for toddlers

My toddler is growing up in a linguistic environment that I call multilingual and that some people tell us is "confusing". Growing up with multiple languages is, of course, the norm in many places, and has the positive side effect that children end up being able to speak more than one language.

Given my convictions about the benefits of growing up multilingually, I was excited to find bilingual books for toddlers: picture books with picture names or stories written in two languages. The same books exist for different language pairs that are commonly spoken in Germany, such as German/Turkish or German/Arabic. I ordered some in German/Russian. I would like to note that they were absolutely not cheap, but I love books, I love languages, and I love my toddler, so I figured it was worth it. 

I was disappointed when the books arrived. They were just so obviously written in German and probably google-translated into Russian. There are some blatant grammatical mistakes (it should be "нет", not "не"):

I'm pretty sure this is not even Russian*:

And things that are linguistically awkward:

(Grandmother, grandfather, and ... grandmother and grandfather.)

Can I blame these books for being, basically, really bad teachers of the Russian language? I guess not, if we consider the probable reason these books are published: to teach children with a migrant background better German and integrate them into German society, rather than to support their knowledge of a foreign language. 

And yet, isn't it also very important, for an individual, to have the possibility to retain ties to their culture of origin and to their family members who might not speak German? And for the society, isn't it important to have individuals who speak multiple languages, especially such languages that are spoken by a substantial number of people in Germany? 

And would it really be so difficult to find a Russian speaker in Germany who would be able to write a good translation?

* Edit: So it seems I didn't do enough research before publishing the blog post: "Витать в облаках" ("to have one's head in the clouds") is, indeed, a Russian expression that I didn't know. (Just shows the potential that these books have: to also teach new expressions to mummies and daddies.)


I don't want to name-and-shame (because the general idea behind the books is amazing, and I would like to thank both the publishers and authors for taking the step of publishing such books!), but my academic background dictates that I name the sources from which I took the pictures above:

"Wie schön!/Как здорово!" by Petra Girrbach/Schmidt & Cornelia Ries, publisher Bi:Libri

"Bildwörterbuch für Kinder und Eltern Russisch Deutsch", no author listed, publisher Igor Jourist

Monday, April 17, 2023

On graphomania

As we're moving flats soon, I threw out a pile of papers, about half a metre tall when stacked up. I've accumulated these papers over the last six years that we've been living in our current flat. They include several types:

1) Papers that I started reading but then realised they weren't as relevant or interesting as I thought.

2) Papers that I printed because the title and abstract sounded (and still sound) fascinating - but as I haven't read them while they've been lying around for years, I should give up on my wishful thinking and acknowledge the fact that I will probably never have the time to read them.

3) Papers that I've read but mostly forgotten about.

If it sounds discouraging that, as part of our academic jobs, we don't really have the time to read papers, it gets worse when you consider the implication that our very own papers are probably treated the same way. Indeed, I have found that I myself am starting to forget what I wrote in various papers on which I'm the first author. For example, I spent hours writing a discussion section for a paper I'd started months previously, only to discover that past me had already incorporated most of my arguments and examples into the introduction! 

Of course, this is not a new problem, and I'm not the first one to talk about it. Dorothy Bishop wrote a more detailed blogpost with more than anecdotal observations here: she showed that a researcher studying autism and ADHD would need to read about eight papers a day to keep up with all the new literature in the field (assuming they're already up to date with all previously published papers). 

The reason why I'm writing so much is also obvious: I need publications so that I get a job and so that my department gets money. And yet, as much as I love writing and, more generally, working as a researcher, I wonder if there isn't a better way to spend my time, and thereby the taxpayers' money that pays for it...

In the meantime, I'll try to practice the art of minimalist writing.

Tuesday, January 3, 2023

New Year's Resolutions of an Early-Mid-Career Researcher in Germany

Three years ago (before COVID and the birth of my now-toddler, which together put my academic life on hold in some ways), I wrote a New Year's post summarising my year and my new year's resolutions. Though I see it as a kind of superstition, I still like to take this time of the year to think about my achievements so far, and about what I need to do next to get where I want to get (and, of course, about where I want to get in the first place). In some years, it's easy: it is clear what I need to focus on. In other years, it's hard: either there are too many things to focus on, or I decide that, actually, everything is going well and I don't need to change anything. This year, it's hard in a different sense: it's not really clear what I can do to get any closer to my goals. 

My current position is not untypical for an early-to-mid-career researcher in Germany. In some ways, it is clear where I need to get to. The goal for most researchers here is a professorship. The timing is clear, too: there is a limit on the number of years one can work as a postdoc (a controversial German law with a beautiful compound word for a name: Wissenschaftszeitvertragsgesetz). This means that I need to get a professorship (or another permanent academic position) within ca. two years, or else leave academia. Getting a permanent position would be good in any case when trying to lead a stable family life after taking out a mortgage for a flat. Professorship positions are very competitive, especially if you are not flexible about moving to a different city, and even more so if the city where you would like to stay is Munich. 

With the high competition, the direct route to a professorship (i.e., applying for an advertised position and getting it) is very unlikely to work. The application process is rather intimidating and relies a lot on insider knowledge from other academics (the "hidden curriculum"). The procedure is often not very transparent, so it is difficult to know just how far I am from getting shortlisted, let alone selected. The alternative is to try other things that increase the probability of getting a professorship, such as applying for prestigious grants or publishing high-profile papers. At some stage, my university guaranteed a professorship to any winner of an ERC Starting Grant, but they have since cancelled this policy. Some funding bodies allow one to apply for the financing of a professorship position, but this requires the university to commit to paying the new professor's salary after the end of the funding period. In any case, applying for prestigious grants is itself very competitive, so to increase the chances here, one needs to apply for less competitive grants and publish papers. In short, one just has to repeatedly try various things that cost a lot of resources and have a relatively low chance of success. This does not make for a good new year's resolution, because there is no single action that I could commit to, either as a one-off or as a repeated activity.

Of course, my ambitions are not simply to get a professorship for its own sake: primarily, I would like to continue with my research agenda, and a professorship is one of the not-so-many ways to do this. Having a stable job to build up my research team is a necessary condition for doing good research, but it's not sufficient. There are skills that I still need to improve to keep up to date with the best research practices. Picking a skill to improve would be a good new year's resolution, but it may not get me any closer to a professorship. Such skills could be learning a new language or improving my programming skills, for example by learning more about Natural Language Processing. If I pick one such skill to focus on in 2023, I may find that I have to abandon it, because it will be more advantageous, in the short term, to focus on writing a paper or grant proposal. On top of that, I also somehow keep my head above water with student supervision, family life (which I will not compromise on), and bureaucratic duties (something that, unlike the former two, I don't enjoy at all, but that keeps increasing as I progress in my academic career). Keeping my head above water could be a good new year's resolution, but - well - it sounds a bit depressing.

With what I have written above, some (myself included) might wonder if my ambitions are too high. In the German system, an academic career is almost an all-or-none affair (leaving academia vs. becoming a full professor, who in Germany has a lot more freedom and power than professors in many other countries). There are options in between a professorship in Munich and leaving academia, though. I could apply for professorships at universities outside of Munich (an inconvenience, but not a disaster for my family life), though these are also very competitive. There are non-university tertiary education institutions which hire professors, but I've heard that the teaching load is so high that, in practice, there is just no time for research. There might be research positions outside of universities that could interest me, though I haven't found anything convincing yet. Maybe I should make it my new year's resolution to decide what I really want, and whether my ambitions are realistic. But this kind of decision is likely to change a lot with incoming information, such as future successes or failures, and is unlikely to be completed by the end of the year. 

In the end, I think I'll just stick to eating more vegetables as my new year's resolution for 2023.

Thursday, December 1, 2022

Should I stay or should I go? Some thoughts on switching from Twitter to Mastodon

I have not been active on social media for a while, and I must admit that keeping off social media did not feel like a big loss. Lately, I have started checking in more frequently again, though. Ironically, the reason for this is my consideration of whether I should delete my Twitter account or not. 

I will not pretend that I know exactly what's going on with the whole Elon-Musk-buying-Twitter thing and all the pros and cons of staying on Twitter. However, I like the general idea of moving from relying on large corporations to smaller providers, so I welcomed the chance to try out Mastodon as an alternative and created an account. I'm pretty happy so far, which leads to the question: Should I keep my Twitter account?

My considerations here are mostly pragmatic. I have benefited a lot from being on Twitter. I managed to catch the wave when the Open Science movement started, and through Twitter, I have become an active member of the Open Science community. I have also met and discussed with many colleagues what I refer to as my "day job": my research on reading and dyslexia. My impression is that most Open Science people have moved to Mastodon, so I should not miss out on any new developments if I delete my Twitter account. The same does not seem to be true for the reading research community, however. Perhaps (ironically) they are more interested than the open science community in retaining their outreach to a broader audience, and are skeptical about this being possible on Mastodon. 

This leaves me with the following dilemma: As a pro of keeping my Twitter account, I will be able to keep in touch with the reading research community. As a con, I will have yet another social media account, and it's not like I have so much spare time that I can keep up with yet another timeline. 

There are some other things I could do. For example, I could revive my lurker account on Twitter, which I created when the atmosphere in the Open Science community got a bit too tense for my liking and where I follow exclusively reading researchers, and delete my main account. Or, I could leave social media altogether. 

There is no conclusion to this post, it's just some disorganised thoughts, quickly jotted down between two meetings. Maybe it will encourage some reading researchers to try out Mastodon? And, in case you notice that I disappear from Twitter, I hope to stay in touch with all of the amazing people that I have met throughout the years.

Friday, October 16, 2020

Anecdotes showing the flaws of the current peer review system: Case 2

Earlier this week, I published a Twitter poll with a question relating to peer review. Here is the poll, as well as the results of 91 voters (see here for a link to the poll and to the responses):

The issue is one that I touched on in my last blogpost: when we think that a paper we are reviewing should be rejected, should we make this opinion clear to the editor, or is it simply our role to list the shortcomings and leave it up to the editor to decide whether they are serious enough to warrant a rejection? 

Most respondents would agree to re-review a paper that they think should not be published, but add a very clear statement of this opinion to the editor. This corresponds to the view of the reviewer as a gatekeeper, whose task it is to make sure that bad papers don't get published. About half as many respondents would agree to review again with an open mind, and to accept the paper if, eventually, the authors improve it sufficiently to warrant publication. This response reflects the view of the reviewer as a guide, who provides constructive criticism that helps the authors produce a better manuscript. About equally common was the response of declining to re-review in the first place. This reflects the view that it's ultimately not the reviewers' decision whether the paper should be published, but the editor's: the reviewers list the pros and cons, and if the concerns remain unaddressed and the editor still passes the paper on to the reviewers, clearly the editor doesn't think these concerns are major obstacles to publication. The problem with this approach is that it creates a loophole for a really bad paper: if the editor keeps inviting re-submissions and critical reviewers only provide one round of peer review, it is only a matter of time until the lottery results in only non-critical reviewers who are happy to wave the paper through. 
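The "only a matter of time" intuition can be put into numbers with a back-of-the-envelope sketch (all numbers here are my own assumptions for illustration, not data): suppose half the reviewer pool would be critical of the paper, each round draws three reviewers independently, and critical reviewers serve only one round.

```python
# Back-of-the-envelope sketch (assumed numbers): probability that at least
# one review round happens to draw no critical reviewer at all.
def p_waved_through(p_critical=0.5, reviewers_per_round=3, rounds=5):
    """Chance that some round consists only of non-critical reviewers,
    assuming reviewers are drawn independently each round."""
    p_round = (1 - p_critical) ** reviewers_per_round  # one round, no critic
    return 1 - (1 - p_round) ** rounds                 # at least one such round

print(round(p_waved_through(), 2))           # → 0.49 with the assumed numbers
print(round(p_waved_through(rounds=20), 2))  # → 0.93
```

Even with these generous assumptions, a persistent editor gets a coin-flip chance of an all-uncritical round within five rounds, and near-certainty given enough rounds.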

The view that it's the reviewer's role to provide pros and cons, and the editor's role to decide what to do with them, is the one that I held for a while, and which led me to decline a few invitations to re-review that, in retrospect, I regret. One of these I described in my last blogpost, linked above. Today, I'll describe the second case study. 

I don't want to attack anyone personally, so I made sure to describe the paper from my last blogpost in as little detail as possible. Here, I'd like to describe some more details, because the paper is on a controversial theory which has practical implications, some strong believers, and, in my view, close to no supporting evidence. Publications which make the evidence look stronger than it actually is can cause damage, both to other researchers, who invest their resources in following up on an illusory effect, and to the general public, who may trust a potential treatment that is not backed up by evidence. The topic is - unsurprisingly for anyone who has read my recent publications (e.g., here and here) - statistical learning and dyslexia. 

A while ago, I was asked to review a paper that compared a group of children with dyslexia and a group of children without dyslexia on statistical learning, along with some other cognitive tasks. The authors reported a huge group difference, and I started to think that maybe I was wrong with my whole skepticism thing. Still, I asked for the raw data, as I do routinely; the authors argued against this with privacy concerns, but added scatterplots of their data instead. At this stage, after two rounds of peer review, I noticed something very strange: there was absolutely no overlap in the statistical learning scores between the children with dyslexia and the children without. After checking with a stats-savvy friend, I wrote the following point (an excerpt from the whole review, with only the relevant information): 

"I have noticed something unusual about the data, after inspecting the scatterplots (Figure 2). The scatterplots show the distribution of scores for reading, writing, orthographic awareness and statistical learning, separated by condition (dyslexic versus control). It seems that in the orthographic awareness and statistical learning tasks, there is no overlap between the two groups. I find this highly unlikely: Even if there is a group difference in the population, it would be strange not to find any child without dyslexia who isn’t worse than any child with dyslexia. If we were to randomly pick 23 men and 23 women, we would be very surprised if all women were shorter than all men – and the effects we find in psychology are generally much smaller than the sex difference in heights. Closer to home, White et al. (2006) report a multiple case study, where they tested phonological awareness, among other tasks, in 23 children with dyslexia and 22 controls. Their Figure 1 shows some overlap between the two groups of participants – and, unlike the statistical learning deficit, a phonological deficit has been consistently shown in dozens of studies since the 1980s, suggesting that the population effect size should be far greater for the phonological deficit compared to any statistical learning deficit. In the current study, it even seems that there was some overlap between scores in the reading and writing tasks across groups, which would suggest that a statistical learning task is more closely related to a diagnosis of dyslexia than reading and writing ability. In short, the data unfortunately do not pass a sanity check. I can see two reasons for this: (1) Either, there is a coding error (the most likely explanation I can think of would be some mistake in using the “sort” function in excel), or (2) by chance, the authors obtained an outlier set of data, where indeed all controls performed better than all children with dyslexia on a statistical learning task. 
I strongly suggest that the authors double check that the data is reported correctly. If this is the case, the unusual pattern should be addressed in the discussion section. If the authors obtained an outlier set of data, the implication is that they are very likely to report a Magnitude Error (see Gelman & Carlin, 2014): The obtained effect size is likely to be much larger than the real population effect size, meaning that future studies using the same methods are likely to give much smaller effect sizes. This should be clearly stated as a limitation and direction for future research."
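The zero-overlap argument in the excerpt above can be sketched with a small simulation (my own illustration with assumed effect sizes, not part of the review or the paper's data): even with a large true group difference of a full standard deviation, complete separation of two samples of 23 essentially never happens.

```python
# Simulation sketch (assumed numbers): how often do two samples of 23 show
# ZERO overlap, given normally distributed scores and a large true group
# difference (Cohen's d = 1.0)?
import random

random.seed(1)

def p_zero_overlap(n=23, d=1.0, reps=10_000):
    """Estimate the probability that every control scores higher than
    every child in the dyslexia group."""
    hits = 0
    for _ in range(reps):
        controls = [random.gauss(d, 1) for _ in range(n)]
        dyslexia = [random.gauss(0, 1) for _ in range(n)]
        if min(controls) > max(dyslexia):
            hits += 1
    return hits / reps

print(p_zero_overlap())  # vanishingly small, even for d = 1.0
```

Only for implausibly huge separations (many standard deviations) does complete separation become likely, which is exactly why the scatterplots failed the sanity check.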

Months later, I was invited to re-review the paper. The editor, in the invitation letter, wrote that the authors had collected more data and analysed it together with the already existing dataset. This, of course, is not an appropriate course of action, assuming I was right with my sorting-function hypothesis (which, to me, still seems like the most plausible benign explanation): analysing a probably non-real and definitely strongly biased dataset together with some additional real data points still leads to a very biased final result.

After some hesitation, I declined, with the justification that the editor and other reviewers should decide whether my concerns were justified. Now, again months later, this article has been published, and it frequently shows up in my ResearchGate feed, with recommendations from colleagues who, I feel, would not endorse it if they knew its peer review history. The scatterplots in the published paper show the combined dataset: indeed, among the newly collected data, there is a lot of overlap in statistical learning between the two groups, which adds noise to the unrealistically and suspiciously neat plots from the original dataset. This means that a skeptical person looking at this scatterplot is unlikely to come to the same conclusion as I did. To be fair, I did not read the final version of the paper beyond looking at the plots: perhaps the authors honestly describe the very strange pattern in their original dataset that is probably not real, or provide an amazingly logical and obvious reason for it that I did not think of.
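The problem with pooling can be illustrated with a similar sketch (again my own illustration with assumed numbers, not the paper's data; the size of the newly collected sample is not stated in the post, so I assume 50 per group): mixing an implausibly separated original dataset with realistic new data dilutes, but does not remove, the bias in the estimated effect.

```python
# Sketch with assumed numbers (not the paper's data): pooling an
# implausibly separated original dataset (groups 3 SD apart, n = 23 each)
# with realistic new data (true d = 0.3, n = 50 each) still yields an
# inflated effect-size estimate.
import random
import statistics

random.seed(2)

def cohens_d(a, b):
    """Rough standardised mean difference (overall SD, for illustration)."""
    return (statistics.mean(a) - statistics.mean(b)) / statistics.pstdev(a + b)

def average_estimates(reps=200):
    """Average d for the new data alone vs. the pooled dataset,
    across many simulated studies."""
    new_ds, pooled_ds = [], []
    for _ in range(reps):
        old_c = [random.gauss(3.0, 1) for _ in range(23)]  # biased originals
        old_d = [random.gauss(0.0, 1) for _ in range(23)]
        new_c = [random.gauss(0.3, 1) for _ in range(50)]  # realistic new data
        new_d = [random.gauss(0.0, 1) for _ in range(50)]
        new_ds.append(cohens_d(new_c, new_d))
        pooled_ds.append(cohens_d(old_c + new_c, old_d + new_d))
    return sum(new_ds) / reps, sum(pooled_ds) / reps

d_new, d_pooled = average_estimates()
print(round(d_new, 2), round(d_pooled, 2))  # pooled estimate stays inflated
```

Under these assumptions, the pooled estimate lands well above the realistic effect size of the new data alone: the biased points keep pulling the group means apart no matter how honest the new data collection is.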

This anecdote demonstrates my own failure to act as a gatekeeper who prevents articles that should not be published from making it into the peer-reviewed literature. The moral for myself is that, from now on, I will agree to re-review papers I've reviewed previously (unless timing constraints prevent me from doing so), and I will be clearer when my recommendation is not to publish the paper, ever. (In my reviewing experience so far, this happens extremely rarely, but I have learned that it does happen, and not only in this single case.) 

As in my last blogpost, I will conclude with some broader questions and vague suggestions about the publication system in general. Some open questions: Are reviewers obliged to do their best to keep a bad paper out of the peer-reviewed literature? Should we blame them if they decline to re-review a paper instead of making sure that some serious concern of theirs has been addressed (and, if so, what about those who decline for a legitimate reason, such as health issues or leaving academia)? Or is it the editor's responsibility to ensure that all critical points raised by any of the reviewers are addressed before publication? If so, how should this be implemented in practice? Even as a reviewer, I sometimes find that, in the time that passes between writing a review and seeing the revised version, I have forgotten all about the issues I'd raised. For editors, who probably handle more manuscripts than the average reviewer, remembering all reviewers' points might be too much to ask. 

And as a vague suggestion: to some extent, this issue would be addressed by publishing the reviews along with the paper. This practice wouldn't need to add weight to the manuscript: on the article page, there would simply be an option to download the reviews, next to the option to download any supplementary materials such as the raw data. This is already done, to some extent, by some journals, such as Collabra: Psychology. However, the authors need to agree to this, which for a case such as the one I described above seems very unlikely. To really address the issue, publishing the reviews (with or without the reviewers' identities) would need to be compulsory. This would come with the possibility of collateral damage to authors if a reviewer throws around wild and unjustified accusations. Post-publication peer review, such as is done on PubPeer, would not fully address this particular issue. First, it comes with the same danger of unjustified criticism damaging honest authors' reputations. Second, a skeptical reviewer who doesn't follow the paper until the issues are resolved or the paper is rejected ultimately helps the authors to hide these issues, such that another skeptical reader will not be able to spot them so easily without knowing the peer review history.

Thursday, October 8, 2020

Anecdotes showing the flaws of the current peer review system: Case 1

A friend, who had decided not to pursue a PhD and an academic career after finishing his Masters degree, asked me how it's possible that so many of the papers that are published in peer-reviewed journals are - well - bullshit. As a response, I told him about a recent experience of mine. 

A while ago, I was asked to review a paper for a journal with a pretty high impact factor. I agreed: the paper was right in my area of expertise and sounded very interesting. When I read the manuscript, however, I was less enthusiastic. Let's say: I've seen better papers desk-rejected by lower-impact-factor journals. This was a sloppily designed study with overstated conclusions. I wrote the review following my standard template: first, summarise the paper in a few sentences, then write something nice about it, then list major and minor points, with suggestions to address them whenever possible. I hold on to the belief that any study the authors thought was worth conducting is also worth publishing, at least in some form. In the paper, I detected a potential major confound, and I had the impression that the authors wanted to hide some of the information relating to it, so I asked for clarifications. 

I submitted my review and, as always, received the decision letter a while later. The other reviews were also lukewarm at best, so I was very surprised that the action editor invited a revision! When the authors resubmitted the paper, I agreed to review it again. However, most of my comments remained unaddressed, and my overall impression was that the authors were trying to hide some of the design flaws to inflate the importance of the conclusions. I wrote a slightly less reserved review, stating more clearly that I didn't think the paper should be published unless the authors addressed my comments. When I was invited to participate in the third round of reviews, I declined: I just didn't want to deal with it. 

Several months later, I saw the paper published in the very same high impact factor journal. As the academic world is small, I now knew for sure what I had suspected despite the anonymity of the peer review process: the senior author of that paper was a friend of the action editor's.

This is, of course, an anecdote, coloured by my own perceptions and preconceptions. There is nothing to suggest, other than my own impression, that the paper was published only because of the friendship between the author and editor. Maybe (probably) I'm way too skeptical in my reading of articles. That was also one of the reasons why I had declined to do a third round of review: I wanted to leave it up to the editor and the other reviewers to decide whether my concerns were justified. But let's be honest: Is anyone truly surprised that there are some cases where editors are more lenient when they personally know the author(s)? And, if we are truly honest, isn't this just a very natural thing that we do ourselves whenever we judge our colleagues' papers, be it as reviewers or editors or simply as readers: letting people we know and like get away with things that we would judge strangers harshly for? 

Maybe this anecdote, along with your own personal experiences, is convincing enough to show that at least sometimes, personal interest interferes with objective judgements and allows articles to pass peer review when they wouldn't hold up to scrutiny under other circumstances. This raises two questions, to which I don't have an answer: How often does this happen, and is this really a problem? And, more importantly, what is a better system? 

For years, I've been an advocate for as much transparency as possible in all aspects of the research process, and in line with this principle, I started signing my reviews shortly after I finished my PhD (though I stopped signing them later). Now, I am coming to the conclusion that anonymity has substantial advantages, not only when the reviewers don't know the identity of the authors, but also when the editors don't. Would this help? Well, maybe not. Years ago, a senior researcher told me that it doesn't matter whether peer review is anonymous or not, because it's normally obvious who exactly - or at least which lab - produced the paper. In my experience (I've reviewed ca. 60 papers since then), this is often true, and when I review an anonymous paper I cannot stop myself from guessing who the authors are.

So, to conclude, I don't have the answers to the two questions I asked above. But I do know that experiencing such anecdotes leaves me discouraged and frustrated about a system where one's chances of being employed are determined based on whether one publishes in high impact factor journals or not.