Saturday, March 21, 2020

An introvert's guide to surviving in isolation

On social media, extroverts are jokingly asking introverts for advice on how they survive being by themselves for long periods of time. Being extremely introverted, I admittedly don't hear this question very often. And while I realise that it's often asked as a joke or a rhetorical question, I'm going to do the socially awkward thing (being an introvert) and give a long, serious answer to it.

Generally, I like being alone, and find ways to entertain myself. However, even for me there have been periods in my life when there was too much loneliness. In writing this blogpost, I'm keeping in mind the last of these periods: during my post-doc in Italy, I found out that in the month of August, everyone is on holiday. The university was shut, with heavy chains blocking the main entrance to the building, and when I snuck in through the back door, I found the building completely empty, so I started working (or not working) from home. I didn't have many friends (being an introvert and finding it difficult to meet new people), and those I had were travelling themselves. I also had my own flat, and no housemates to socialise with. So I spent several weeks by myself, hardly talking to anyone. I'd like to list some things that might help others who are in a similar situation: being at home by themselves, perhaps working from home, or living in quarantine for a few weeks.

The situation was, of course, different from the Corona situation now. I could travel, and scheduled regular day trips to nearby cities and towns and one or two weekend trips to cities further away. There was no pandemic, no external reason for fear and anxiety. However, it was tough (even for me), and thinking back, I remember a few things that helped me get through this period, and a couple of things that probably made me feel worse at the time. These are specific to me: an introvert who is impatient and has nerdy hobbies. I don't want to make it sound like the things that helped me will help everyone. But maybe some of my more extroverted friends will find some things that they can try out if they feel down during their time at home.

Things to do
1) Keep a routine. Eat regular meals, wake up at a reasonable time, get dressed, brush your teeth, go for walks, exercise, take showers.
2) Buy a pot plant. Even if you don't have a green thumb, a regular routine and too much time on your hands will allow you to look after it well. If you live by yourself, it's a way to get something that's alive and low-maintenance into your life, and you'll be happy when the plant grows and starts blossoming.
3) Treat yourself to good meals on a regular basis. This was very easy for me, because I was living in Italy: I could taste cheeses and meats from the market, buy lots of fresh fruit and vegetables, and try out new recipes from giallozafferano.it/. Preparing and eating a nice dinner gave me something to look forward to each day.
4) Go outside every day. At this stage, going outside for walks is still allowed in Bavaria. Try to discover some new streets or parks near where you live. Find a nice place to watch the sunset, and make it part of your routine to watch it every day.
5) Vary the activities. To an introvert, there are many fun things that can be done at home. To name a few: reading, learning a language, writing a blog post or a short story or working on a novel (don't worry, you don't have to ever show it to anyone - it's just a way to keep yourself busy and your brain active), playing the piano, listening to music, learning to code, watching movies, series, or youtube videos, cooking. Go through a list of things you've always wanted to do: perhaps you have a Spanish language book on your shelf from when you wanted to learn Spanish but then found out you didn't actually have the time, or your friends kept telling you about this awesome novel that you always forgot to download to your e-reader, or there's a guitar you bought ages ago and haven't touched since.

Things to avoid
1) Depressing things. There are times to read or watch movies about war, death, and destruction. But a period of isolation is not a good time to expose yourself to things that drag you down. At this stage, this also includes my social media feeds and the news. It's important to keep up to date with what's happening, but it's not good to become completely absorbed by it. Minimise the amount of time you spend doing things that you know will make you feel depressed: maybe just check the news three times a day.
2) Drinking too much. In line with Point 3 from "things to do", it's nice to have a glass of good wine occasionally. However, then it becomes tempting to have another glass of the delicious wine, and then another, and then another, and before you know it, you're feeling drunk, lonely, and terrible.
3) Binging on anything. For me, doing anything for too long makes me feel like my head is about to implode, and at the end of the day, it feels like the day has been wasted. This includes spending the whole day binge-watching series, but also finding a book that is so interesting that I can't put it down and end up forgetting my whole routine, including sleep, until I get to the last page.
4) Building a den. It's nice to have a cosy place, for example, your favourite blanket and pillows on the couch. But it's not good to spend too much time in it. This relates back to the previous point: spending a whole day binge-watching or reading something in your den, eating in your den, drinking in your den, is not a day well spent for me.

I focussed on things that an introvert would say, but, of course, there are other things that are advisable to do if you're in isolation and feel lonely. Actively seek contact: call or email an old friend and ask them if they're OK. Video-chat with your relatives. Check if your elderly neighbours need something from the shop. And, most of all: Stay safe, and take care of yourself!

Tuesday, March 10, 2020

Ten reasons to leave academia

I've been in academia for 9 years now (counting from the beginning of my PhD), and I don't actually hate my job most of the time. But sometimes I do, and then I wonder why I'm continuing in academia rather than finding a more stable, better-paid job. Several things make me wonder about this on a regular basis.

Why write a blogpost about those things? As a selfish reason, I hope that venting will make me feel better. As an altruistic reason, I think it's important for any aspiring academic to be aware of the problems associated with this career path. I'm not sure if striving towards an academic career is a rational thing to do. Either way, I'd recommend that anyone starting on this path consider alternatives, and make sure they acquire skills that will help them on the non-academic job market.

The list that I made describes my experiences, and most of them do not universally apply to academia across the board. These problems may not exist in other countries, universities, or fields of study, and conversely, there will be other things that other people hate about working in academia that I don't notice in my everyday life. Also, I've never worked outside of academia (other than a few student jobs), so I don't know whether the things that sometimes drive me insane in academia are better elsewhere. In short, any reader is encouraged to decide for themselves whether these things apply to their own academic system, and whether they would be bothered by these issues.

So, here is my list:

1) Lack of stability: Starting an academic career involves, first, doing a PhD, and then continuing with a post-doc. What happens afterwards depends on the country. In some countries it's relatively easy to get a lecturer's position after one or two post-docs (or even straight out of the PhD). In others, you do one post-doc after the other, only to end up with no professorship or tenure, no more post-doc funding, and an age at which it is difficult to get a good job outside of academia. In some countries, it's perfectly normal to meet post-docs who are in their late 40s. A post-doc position is fun and all when you're fresh out of your PhD. However, at some stage, every extension of a post-doc contract feels like both a blessing and a curse: there is no guarantee that it will be extended again, and you know that you're postponing the problem of finding a stable job for another two years, by which time it might already be more difficult to change your career path.

Often, it's expected that researchers change labs and countries every few years. This is an aspect of academia which I enjoyed very much - until I met my husband. As soon as one wants to settle down, one might find it impossible to get a permanent academic job (or any kind of academic job, for that matter) in the same place as the partner.

2) Bad long-term perspectives: There are more PhD students than post-doc positions, and way more post-doc positions than professorships. Statistically, this means that getting from the beginning of an academic career all the way to a professorship is not very likely. The extent to which this is true depends on the country where you live. In Germany, permanent positions are essentially limited to professorships: unless you get a professorship, there is no stability, and statistically, you probably won't get a professorship.

3) Writing grants: If you want the opportunity to work on your own project, you need to get a grant. If you don't already have a position, you can apply for a grant that pays your salary. Some grants are so prestigious that they come with a guarantee of a permanent position after the project is finished, such as the ERC Starting Grant. If you manage to get this kind of grant, Problems 1 and 2 from above are solved. The catch is: they're very competitive. The probability of success depends on the quality of your proposal, your track record, political issues (whom do you know? whom does your supervisor know?), and luck. The relative weight of these four factors is unknown (to me). However, luck is a big factor. Don't believe me? Write a mock proposal, and ask 10 different colleagues to give you feedback. Count the number of pieces of advice that are in direct conflict with each other. Your reviewers could share any one of these different opinions: which piece of advice do you incorporate, and which one do you ignore? Aside from the huge role that luck plays in getting grants, the high competitiveness means that, in the most likely scenario, you'll spend a lot of time writing a proposal that will bring you absolutely no benefit.

4) Publish or perish: For all three of the above points, your success depends a lot on where you publish. Ideally, you publish lots of papers in high impact-factor (IF) journals that get cited a lot and featured in the media. A high IF does not correspond to higher quality: if anything, the correlation between IF and different markers of quality is negative. Still, publishing in high IF journals is necessary not only for your own career, but also for your department: at our faculty, the income of the department is calculated through a system called "Leistungsorientierte Mittelvergabe" (LOM for short, translating to something like "achievement-based distribution of funds"). I am told that each additional IF point translates to roughly an additional 1,000 Euros that the department will get. This LOM also takes into account the amount of grant money you bring in and, to a lesser extent, the amount of teaching. If you don't publish a lot and don't get any grants, you're pretty much useless to your department.

Whether or not you manage to publish in a high-IF journal is, again, to some extent a matter of luck and connections. Recently, I was asked to review a paper for a relatively high-IF journal. This paper was simply not good: sloppy design, messy results. The other reviewers were also not overly enthusiastic. Yet, though in my experience such reviews normally lead to rejection, even in lower-IF journals, the editor's verdict was "major revision". After a couple of rounds of review, I decided that my comments didn't make much of a difference anyway, and declined to review the next version. The paper was, of course, eventually published, and I realised why the editor had supported it despite its so-so quality: there were a couple of big names among the authors. Here's a recurring theme that applies to all of the points to some extent: success in academia is a lot about having connections with big names, which, of course, leads to systematic discrimination against anyone who doesn't have the means to establish such connections.

5) Doing things for free: An advantage of academia is that a lot of people do it because they love it. And when people work on something that they love and truly believe in, they are willing to invest more than what they are paid for. And when people do things for free, there is the expectation that they will continue to do so. This means that you will be at a disadvantage if you don't do things for free. This leads to the next point: any nice thing that you do is no longer seen as a favour from your side, but is taken for granted, leading to a...

6) Lack of appreciation: Another advantage of academia is that you can work on a project that is very important to you. The problem is: everyone feels this way about their own project. You might be hoping for some occasional ego boost from your collaborators, colleagues, or reviewers. However, everyone else is busy with their own awesome projects and probably doesn't even have the time to read your latest paper.

7) Lack of communication: Ideally, academia should be a place of intellectual exchange. This is one of the parts that I love about my job: going to conferences and talks and learning about something new, discussing ideas with colleagues working on something completely different, and realising that you can use their approach in your own work. This hardly ever happens in real life: everyone is too busy with their own project. Any attempts to organise a reading group or after work drinks are greeted with great enthusiasm, and when the meetings actually start, the number of participants dwindles quickly from a handful of people to zero.

8) Bureaucracy: I've worked in three different countries, and in each of them, bureaucracy consumed too much time and energy. This seems to happen in different ways in different countries. In Australia, many decisions that, in other places, are made by academics are made by the administration, which leads to solutions that are not necessarily helpful for creating a good research environment (even when they are supposed to be). In Italy, bureaucracy is characterised by a lack of transparency, and in Germany, by a lack of flexibility. For example, I do some studies on reading ability in children. The easiest way to get participants would be to go to schools and test those children who have parental consent there: except the bureaucracy is so time-consuming that my colleagues advised me to not even try. Everyone spending any time at our department needs to have a medical certificate and up-to-date vaccinations, as well as a police check, even if they have no contact with patients or participants. Of course, nobody knows what to do if a visiting researcher can't easily get a police check from their country of origin. For external PhD supervisors, a habilitation is required: I tried and failed to explain to an administration officer that the habilitation is not a thing outside of Europe. In fact, it took months to convince the university administration to pay me a post-doc salary, because a formal requirement for getting a post-doc salary is a master's degree, and as I have a Bachelor with Honours degree and a PhD, I was considered under(?)qualified.

Bureaucracy is, of course, not specific to academia, and also makes everyday life difficult (if you're ever keen to hear a long and not-that-interesting story, ask me about getting my Australian driver's licence changed to a German one). However, I imagine that companies which aim to make a profit cut out a lot of the bureaucracy that costs time and doesn't bring benefits (and is often directly damaging).

9) Salary: This is the last point on my list, because it doesn't bother me that much, personally. The salary is not bad, and I didn't go into academia because I wanted to get rich. However, if your goal is to have money, then you will probably get much more of that if you go to industry with your qualifications.

In Germany, the salaries in the public sector (including universities) are determined by a system of salary classes. This system is the same across all of Germany, regardless of how expensive life in your city is. Munich is very expensive: if you would like to buy property on an academic salary, that could be problematic. During your PhD, you should expect to have very little money, despite doing a job that requires a postgraduate qualification (i.e., a master's). In Germany, for jobs that require a master's degree, the standard salary class is E13 - the same salary class that is given to junior post-docs, and that already constitutes a decent salary. However, as PhD students are expected to get a shitty salary, someone came up with the disingenuous idea of paying them only part-time - normally between 50% and 75% - while expecting them to work full-time.

10) Actually, I ran out of things to complain about. I'm sure I'd think of something more if I thought about it a bit longer, but what I have so far should already give some food for thought for aspiring academics (and maybe some big shot, who stumbled across this blog post and actually has the power to change some of the above?).

This blogpost is, of course, very negative, but that doesn't mean that I think there are no reasons to stay in academia (I'm still here, right?). Academia, like all working environments, has pros and cons. However, in my experience the pros of academia are often overhyped, and the cons are brushed aside as sacrifices that you need to make if you want to be a real scientist. Leaving academia is often seen as a failure, and considering alternative careers as a betrayal of your ideals. This mindset is incredibly damaging, as it allows for the exploitation of people who have been brainwashed to simply be grateful that they have the opportunity to strive for an academic career. So, even for readers who are not weighing up the pros and cons of staying in academia, I hope that this blogpost shows that academia is not perfect, and that there might be upsides to considering alternative careers. This realisation will make (especially) early career researchers less vulnerable to being abused and guilted into staying in academia.

To end on a brighter note, perhaps I'll get around to writing about "Ten reasons to stay in academia" for the next blogpost.

Wednesday, February 19, 2020

I would rather write an ERC grant proposal than buy lottery tickets after all

Scientists are supposed to be rational. Yet, at times, it feels like staying in academia is a completely irrational decision. The chances of success - a professorship, or gaining some prestigious grant - are low. The costs - working overtime, dealing with high competition, occupational instability - are high.

The current blog post is inspired by a grant rejection. With a grant proposal, one invests a lot of time with a relatively small chance of success. In this sense, submitting a grant proposal feels like buying a lottery ticket. A very expensive lottery ticket: one that costs months of work, with the realisation of a project idea and a red carpet towards a professorship as the (unlikely) potential gain. We know that buying a lottery ticket is irrational: the long-term expected value, the gain relative to the investment, is negative. If it weren't negative for the player, there would be no gain for the organisers and therefore no motivation to run the lottery.

If we look at the base success rates of grants (e.g., for the ERC), we will find that most grants have a higher success rate than lottery tickets (e.g., the Australian National Monday Lottery Ticket). However, the investment in an ERC proposal is also much higher: a single lottery ticket costs AU$2.42, while an ERC proposal takes months of work. This made me wonder whether, in the long run, it might be better to buy lottery tickets instead of writing proposals. This suggestion did not go down well with my boss.

So I decided to do the numbers to see if this is indeed the case. In the following, I compare the expected value of buying an Australian National Monday Lottery Ticket with that of submitting a proposal for an ERC Starting Grant. How to calculate the expected gain for a lottery is explained here; with a bit of research, I learned how this lottery works and managed to find or estimate the necessary values. As a simplification, we consider only the possibility of winning the jackpot (AU$1,000,000). The odds of getting the jackpot, according to the lottery's website, are 1 in 8,145,060. Thus, we have a success rate (p), a potential gain (V) and the cost of a ticket (C). To calculate the expected gain, we still need to estimate the number of tickets bought (N), because the jackpot is shared if several tickets win it in the same draw. I could not find this number on the official website, but I found the overall number of winners of a recent draw on a different page. Given that 58,695 people won the last division, and that the probability of winning this division is 1 in 144, then, if I understand how the lottery works, we can estimate that around 8,452,224 lottery tickets are bought. Plugging these numbers into the formula

E = p * V / (1 + (N - 1) * p) - C

we get an expected value of roughly -2.4 AU$ per ticket.
This means that, as we predicted, the expected value is negative: if we play the lottery, we expect, in the long term, to lose money.

Now, let's do the same thing for the ERC grant. Here, the success rate already takes into account the number of applicants, so we can use the simpler formula:

E = p * V - (1 - p) * C
Here, we need to somehow estimate the cost of submitting an ERC proposal. Assuming two months of focussing only on writing the proposal, and my current salary, the cost comes to approximately 5,000 Euros. The immediate financial gain of the proposal is 1,500,000 Euros; for now, we don't consider the additional gain of 5 years' occupational security and high chance of a permanent professorship position afterwards. The probability of success, according to the ERC, is 12.7%. This gives us:

E = 0.127 * 1,500,000 - 0.873 * 5,000 = 186,135.

So, the good news is: the expected value of submitting an ERC proposal is not only positive, but, with all the simplifying assumptions that we made, and assuming I made no mistakes in the calculations, it's also quite large! I don't fully trust my calculations. But even if my lottery calculations are wrong, it is reasonable to assume that the expected value there is negative. Furthermore, the expected value for the ERC Starting Grant is conservative, in the sense that it considers only the immediate financial gain: in reality, a success comes with additional gains. The estimate may be too optimistic if my calculations underestimate the time spent writing the grant. However, keeping the other parameters constant, one would need to work for more than 7 years on the proposal for the expected value to turn negative.
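For anyone who wants to check or adapt these numbers, here is a small R sketch of the two calculations above (the variable names are mine; the lottery line uses the jackpot-sharing formula from above, and the 2,500 Euros per month in the last line simply comes from the 5,000 Euros for two months of writing):

p_lot <- 1 / 8145060    # probability of hitting the jackpot with one ticket
V_lot <- 1000000        # jackpot in AU$
C_lot <- 2.42           # price of one ticket in AU$
N     <- 8452224        # estimated number of tickets bought per draw
p_lot * V_lot / (1 + (N - 1) * p_lot) - C_lot   # roughly -2.4: a loss per ticket

p_erc <- 0.127          # ERC Starting Grant success rate
V_erc <- 1500000        # immediate financial gain in Euros
C_erc <- 5000           # roughly two months of salary spent on writing
p_erc * V_erc - (1 - p_erc) * C_erc             # roughly 186,000 Euros

# Break-even: how many months of writing would push the expected value to zero?
(p_erc * V_erc / (1 - p_erc)) / 2500            # roughly 87 months, i.e., more than 7 years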

There are some additional factors which may decrease the expected value. Perhaps the psychological loss associated with getting rejected again and again and again adds to the financial loss of spending time on the grant. It is up to each individual to decide how much rejection affects them, and whether it tips the expected value into the negative. Furthermore, the current calculations only show that it's more rational to submit ERC Starting Grant proposals than to buy lottery tickets for the rest of your life. That's a really low bar to set: the gains associated with finding a stable, well-paying non-academic job could far outweigh those associated with applying for grants.

Of course, there is one important difference between lottery tickets and grants, at least in theory. A lottery is controlled by completely random processes: I'm no more or less likely to win the lottery than any person sitting next to me on the bus on my way to work (provided we both buy the same kind of ticket). Grants, at least in theory, are awarded based on merit, not based on a random number generator. Whether or not this is actually the case is a matter of debate. Still, me applying for a grant this year and me applying for a grant next year are not independent events: my chance of success depends on my grant-writing skills, my connections to the reviewers, how impressive my track record looks, and so on.

Is it rational to submit proposals in the hope of staying in science? Well, maybe. For me, at least, the thought that I have a greater chance of success in academia than if I were to buy lottery tickets and hope for the best is an encouraging one.

Thursday, January 2, 2020

A year in the life of a post-doc

In some ways, the year 2019 has been remarkably unremarkable for me. At the beginning of the year, I started on a DFG-funded project, so the most acute uncertainties associated with employment have been postponed for another few years. The beginning of the project is slow; data collection has just started, and projects from the previous years are at various stages of non-completion and imperfection.

Why would I even write a blog post about my year 2019, then? Well, every fail comes with a win, and vice versa. To demonstrate: The first win of 2020 is that we booked a nice hotel in the Austrian Alps for cross-country skiing. The first fail, associated with that (and the actual reason for writing the blog post): I came down with a cold, and need to stay in the hotel room. And, again, as an attached win, I get to enjoy this view:


So, without any further ado, here's a list of my 2019 win/fail pairs: 


Win: Finalised a manuscript. Fail: Rejected by 5 different journals.

Win: For a grant proposal, got together a team of amazing researchers from 13 different countries who agreed to collaborate with me on a cross-linguistic project. Fail: The proposal has a <10% chance of being funded.

Win: Started learning Natural Language Processing (and data science, and programming in general). Fail: Still need to improve a lot before I can actually apply it in my research, or use it when looking for jobs outside of academia (New Year’s Resolution for 2020).

Win: Started supervising my first PhD student. Fail: Issues with funding beyond the first year of her PhD (mainly due to stupid university bureaucracy).

Win: Started teaching. Fail: Not sure how happy my students are with me hijacking their “research methods in clinical psychology” course and turning it into a course about statistics and open science.

Win: Learned that if I get an ERC Starting Grant, I’ll be guaranteed a professorship. Fail: Downloaded the manual for writing an ERC Starting Grant proposal, realised that the manual is 50 pages of densely written bureaucratese text, and started wondering if I really want a professorship that much...

Win: Data collection for the new project is going well, with very competent research assistants. Fail: One of the testing laptops gave a research assistant a strong electric shock and stopped working.



Tuesday, September 17, 2019

How to make your full texts openly available

TL;DR: Please spend some time to make sure that the full texts of your articles are freely available, and remind your colleagues to do the same.


As it turns out, I'm a huge hypocrite. I regularly do talks about how to make your research workflow open. “Start with little steps”, I say. “The first thing you can do is to make your papers openly available by posting pre-prints”, I say. “It's easy!”, I say. “You do a good thing for the world, and people will read and cite your paper more. It's a win-win!”

I'm not at all an expert on open access publishing. I've been to several talks and workshops which provided an introduction to open access, and my take-home message is generally that it would be good to spend more time to really understand the legal and technical issues. So this blog post does not aim to give professional advice, but rather contains my notes about issues I came across in trying to make my own papers open-access.

There are multiple ways to find legal full-text versions of academic papers. Of course, there is also sci-hub, which – let's just say – when you average its legal and moral aspects, is in a grey zone. In an ideal world, all of our research outputs would be available legally, and the good news is that it's in the hands of the authors to make full texts available to anyone from anywhere. Self-archiving papers that have been published elsewhere is called green open access, and it's a good way to be open even if you are forced (by the incentive system) to publish in closed journals.

Many people, even those who are not in the open science scene, use researchgate to upload full texts. I created a researchgate account around the time I started publishing, and I have conscientiously uploaded every single article's full text right after it was accepted by a journal. Problem solved, I thought.

Then, I learned about the Open Access Button (openaccessbutton.org) and Unpaywall (unpaywall.org/). You can download both as add-ons to your browser, and when you've opened the link to a paper, you can click them to get the full texts. Below is a screen shot that shows what these buttons look like (circled in red); clicking them should get you right to the legal PDF: 

 
That is, if a legal, open-access version is available. In the screenshot above, the lock on the button of the unpaywall add-on is grey and locked, as opposed to green and open. If you click on the openaccessbutton in the top right corner, it takes you to a page saying that the full text is not available. This is despite the full text being available on researchgate.

Then, I decided to have a look at my record on google scholar. When one searches for a paper, a link to any open access version appears next to the journal link. The screenshot below makes me look really bad:

 
Though, in my defence, when we scroll down, it looks better:


Strangely, some of my full texts are linked via researchgate, and others are not, even though all full texts have been uploaded. The Collabra and Frontiers journals are open access by default: I did not need to do anything to make the full text freely accessible to everyone. The paper at the bottom is available through the OSF: I'd uploaded a pre-print at some stage when I'd given up trying to publish it.*

Still, when I go to the journal's link to my OSF pre-print paper, I cannot access the full text:



When you press the Open Access Button (the orange lock in the top right corner), you can request a full text from the authors. Alternatively, if it's your own paper, you can indicate this, and it will take you to a website where you can either link to a full text or upload it. I tried uploading the full texts of a couple of my papers. The Open Access Button uploads the papers to Zenodo:


But, unfortunately, there seem to be some technical issues:


What seems to work, though, is the following:
  1. Uploading the paper as a pre-print on OSF, and
  2. Instead of uploading the pre-print through the Open Access Button, linking to the OSF pre-print.

Conclusion
An academic paper is our blood, sweat and tears. We want people to read it. We don't do our work only with the intention to hide it behind a paywall so that nobody can ever access it. I sometimes try to find full texts through my institution's library, and it often happens that I don't have access to papers. And I'm at a so-called “elite university” in Germany! Imagine how many people are blocked from having access to any publication if there are no open-access full texts available. And then ask yourself: What is the purpose of my work? Your work can certainly not achieve the impact that you hope for unless people can read about it.

So uploading pre-prints is definitely the right thing to do. Since realising my own shortcomings, I am less impatient with authors when I come across a paywall while trying to read their papers. Making your work open access and findable is a bit more tricky than simply uploading the full text on researchgate. As a course of action, I would recommend the following to each individual researcher:

  1. Check whether your publications have freely available full texts which are findable through google scholar, the open access button, and/or the unpaywall button (for one way to check a whole list of papers at once, see the small R sketch below). This is a good task for a Friday afternoon, when you've finished something but don't really have the time to start anything new. Or anytime, really. Making sure that people can read about your research is at least as important as conducting this research in the first place. It's part of our jobs.
  2. When you can't find a full text, email the corresponding author. The Open Access Button makes this easy. All you have to do is give a quick reason, and the author will receive the following email:

I believe you need an account to request an email to be sent to the authors on your behalf. Of course, you're also free (and strongly encouraged by me) to send an email yourself: the journal's link to the paper will have the corresponding author's email address. All you have to do is take the following template that I created: https://osf.io/fh73t/, fill in the blanks, and send it to the corresponding author.
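If you have a whole list of DOIs to check, one option is Unpaywall's REST API. Below is a rough R sketch of how that could look (this is my own helper, not something from the tools above; double-check the current API format at unpaywall.org/products/api, and replace the email address with your own, as Unpaywall asks for one in every request):

library(httr)
library(jsonlite)

# Ask the Unpaywall API whether an open-access copy of a given DOI is known.
check_oa <- function(doi, email = "you@example.org") {
  res  <- GET(paste0("https://api.unpaywall.org/v2/", doi, "?email=", email))
  info <- fromJSON(content(res, as = "text", encoding = "UTF-8"))
  data.frame(doi = doi, is_oa = isTRUE(info$is_oa))
}

my_dois <- c("10.1000/example.doi")  # replace with your own DOIs
do.call(rbind, lapply(my_dois, check_oa))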


---------------------------------------------------------
* I like to use this paper as a success story about posting pre-prints: The manuscript had been rejected by numerous journals, so I thought it would never be published. As I didn't want the work that I'd put into it to go completely to waste, I uploaded the pre-print on the OSF. A few weeks (or even days) later, the pre-print appeared on google scholar; a few days after that, I got emails from two colleagues with suggestions of journals where I could try to submit this paper. I tried one of these journals, and a few months later, the paper was officially published, after only one round of minor revisions.





Thursday, September 12, 2019

Bayes Factors 101: Justifying prior parameters in JASP


TL;DR: Do you need a justification for your prior parameters in JASP? Scroll down to find fill-in-the-blank sentences which you can use, and a table where you can pick a range of effect sizes which you expect and the corresponding prior parameters.

----------------------------
With many psychologists turning Bayes-curious, software packages are appearing that make it easy to calculate Bayes Factors. JASP (Love et al., 2015) has a similar layout to SPSS, and allows the user to perform Bayesian analyses which are equivalent to a series of popular frequentist tests. Here, I will describe the priors which are implemented in JASP for two frequently used tests: the t-test and Pearson's correlation. I will also explain what it means when we change them. The aim is to provide the basis for a better understanding of what priors mean, and how we can justify our choice of prior parameters.

Both frequentist and Bayesian statistics rely on a series of underlying assumptions and calculations, which are important to understand in order to interpret the value that the software spits out (i.e., a p-value or a Bayes Factor). Given that very few psychologists have been schooled in Bayesian statistics, the assumptions underlying the Bayes Factor are often not intuitive.

One important difference between Bayesian and frequentist data analyses is the use of a prior. The prior represents the beliefs or knowledge that we have about our effect of interest, before we consider the data which we aim to analyse. The prior is a distribution which can be specified by the experimenter. Once we have data, this distribution is updated to give a posterior distribution. For calculating a Bayes Factor, we have two priors: one that describes one hypothesis (e.g., a null hypothesis: no difference between groups, or no correlation between two variables), and one that describes a different hypothesis. JASP then computes the probability of the observed data under each of these hypotheses, and divides one by the other to obtain the Bayes Factor: the degree to which the data is compatible with one hypothesis over the other.

To some extent, then, the inference depends on the prior. The degree to which the prior matters depends on how much data one has: when there is a lot of data, it “overrides” the prior, and the Bayes Factor becomes very similar across a wide range of plausible priors. Choosing an appropriate prior becomes more important, though, (1) when we do not have a lot of data, (2) when we need to justify why we use a particular prior (e.g., for a Registered Report or grant proposal), or (3) when we would just like to get a better idea of how the Bayes Factor is calculated. The aim of the current blog post is to provide an introduction to the default parameters of JASP, and to what it means when we change them around, while assuming very little knowledge of probability and statistics from the reader.

T-tests
Let's start with t-tests. JASP has the option to do a Bayesian independent samples t-test. It also provides some toy data: here, I'm using the data set “Kitchen Rolls”. Perhaps we want to see if age differs as a function of sex (which makes no sense, theoretically, but we need one dichotomous and one continuous variable for the t-test). Below the fields where you specify the variables, you can adjust two settings for the prior: (1) the hypothesis (two-tailed or directional), and (2) the Cauchy prior width. Let's start with the Cauchy. The default parameter is set to 0.707. Contrary to what is often believed, this does not represent the size of the effect that we expect. To understand what it represents, we need to take a step back to explain what a Cauchy is.

A Cauchy is a probability distribution. (Wikipedia is a very good source for finding information about the properties of all kinds of distributions.) Probability distributions describe the probability of possible occurrences in an experiment. Each type of distribution takes a set of parameters, with which we can infer the exact shape of the distribution. The shape of the familiar normal distribution, for example, depends both on the mean and on the variance: if you look up the normal distribution on Wikipedia, you will indeed see in the box on the right that the two parameters for this distribution are μ and σ². On the Wikipedia distribution pages, the top figure in the box shows how the shape of the distribution changes if we change around the parameters. Visually, the Cauchy distribution is similar to the normal distribution: it is also symmetrical and kind-of bell-shaped, but it has thicker tails. It also takes two parameters: a location parameter and a scale parameter. The location parameter determines where the mode of the distribution is. The scale parameter determines its width. The latter is what we're after: in the context of Cauchy priors, it is also often called the width parameter.
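If you want to see this difference for yourself, here is a quick R sketch (just base R, nothing JASP-specific) that overlays the two densities:

x <- seq(-5, 5, by = 0.01)
plot(x, dnorm(x, mean = 0, sd = 1), type = "l", ylab = "density")   # normal: solid line
lines(x, dcauchy(x, location = 0, scale = 1), lty = 2)              # Cauchy: dashed, with thicker tails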

Back to JASP: when we change the Cauchy prior width, we don't change the mode of our distribution, but its width (i.e., the scale parameter): we are not saying that we are considering certain values to be more or less likely, but that we consider the range of likely effect sizes to be more or less narrow. The Cauchy, in JASP, is by default centred on zero, which gives us a bidirectional test. Overall, small effect sizes are considered to be more likely than large effect sizes (as shown by the general upside-down-U shape of the distribution). If we have a directional hypothesis, rather than shifting the location parameter, JASP allows us to pick which group we expect to have higher values (Group 1 > Group 2, or Group 1 < Group 2). This simply cuts the distribution in half. We can try this with our Kitchen Rolls data: If, under the section “Plots”, we tick “Prior and posterior”, we will see a figure, in addition to the Bayes Factor, which shows the prior for the alternative hypothesis, as well as the posterior (which we will ignore in the current blog post). The default settings show the following plot (note the symmetrical prior distribution):


When we anticipate that Group 1 will have higher values than Group 2, half of the prior distribution is cut:


And when we anticipate that Group 2 will have higher values than Group 1:


So, what do you do when you plan to use the Bayes Factor t-test for inference and the reviewer of the Registered Report asks you to justify your prior? What the Cauchy can tell us is how confident we are that the effect lies within a certain range. We might write something like:

“The prior is described by a Cauchy distribution centred around zero and with a width parameter of x. This corresponds to a probability of P% that the effect size lies between -y and y. [Some literature to support that this is a reasonable expectation of the effect size.]”

So, how do you determine x, P, and y? For P, that's a matter of preference. For a registered report of mine, I chose 80%, but this is rather arbitrary. You pick y in such a way that it describes what you believe about your effect size. If you think it cannot possibly be bigger than Cohen's d = 0.5, that could be your y. And once you've picked your y, you can calculate x. This is the tricky part, though it can be done relatively easily in R. We want to find the parameter x for which we have an 80% probability of obtaining values between -y and y. To do this, we use the cumulative distribution function, which measures the area under the curve of a probability distribution (i.e., the cumulative probability of a range of values). The R function pcauchy takes the value of y, together with a location parameter and a scale parameter, and gives us the probability that an observation randomly drawn from this distribution is smaller than or equal to y. To get the probability that an observation randomly drawn from this distribution lies between -y and y, we type:

pcauchy(2,0,0.707) - pcauchy(-2,0,0.707)

This is for the default settings of JASP (location parameter = 0, scale parameter = 0.707). This gives us the following probability:

[1] 0.7836833

Thus, if we use the default JASP parameters, we could write (rounding the output of 0.78 up to 80%):
“The prior is described by a Cauchy distribution centred around zero and with a width parameter of 0.707. This corresponds to a probability of 80% that the effect size lies between -2 and 2. [Some literature to support that this is a reasonable expectation of the effect size.]”

An effect size of 2 is rather large for most psychology studies: we might be sure that we're looking for smaller effects than this. To check how the scale parameter would need to change to obtain an 80% probability (or any other value of P) for your expected effect sizes, you can copy-and-paste the code above into R, change the effect size range (2 and -2) to your desired ys, and play around with the scale parameters until you get the output you like (or use the short R helper below the table, which solves for the scale parameter directly). Or, if you would like to stick with the 80% interval, you can pick the scale parameter for a set of effect size ranges from the table below (the percentage and the scale parameter are rounded):


Range of effect sizes (non-directional) | Range of effect sizes (directional) | Scale parameter required for 80% probability
-2 to 2 | 0 to 2 or -2 to 0 | 0.71 (default)
-1.5 to 1.5 | 0 to 1.5 or -1.5 to 0 | 0.47
-1.3 to 1.3 | 0 to 1.3 or -1.3 to 0 | 0.41
-1.1 to 1.1 | 0 to 1.1 or -1.1 to 0 | 0.35
-0.9 to 0.9 | 0 to 0.9 or -0.9 to 0 | 0.30
-0.7 to 0.7 | 0 to 0.7 or -0.7 to 0 | 0.22
-0.5 to 0.5 | 0 to 0.5 or -0.5 to 0 | 0.16
-0.3 to 0.3 | 0 to 0.3 or -0.3 to 0 | 0.10
The middle column shows what happens when we have a directional hypothesis. Basically, the probability that the effect lies between 0 and y under the cut-in-half Cauchy is the same as the probability that it lies between -y and y under the full Cauchy. I explain in footnote 1 why this is the case.
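If you would rather compute the scale parameter for your own effect size range and probability than read it off the table, here is a small R helper (my own sketch; the function name find_scale is just for illustration):

find_scale <- function(y, P) {
  # Find the Cauchy scale parameter s such that P(-y < effect size < y) = P
  # for a Cauchy distribution centred on zero.
  uniroot(function(s) pcauchy(y, 0, s) - pcauchy(-y, 0, s) - P,
          interval = c(0.001, 100))$root
}
find_scale(0.5, 0.80)   # roughly 0.16, matching the table
find_scale(2, 0.80)     # roughly 0.65 (the default of 0.707 gives about 78%)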

How does the choice of prior affect the results? In JASP, after you have collected your data, you can check this by ticking the “Bayes factor robustness check” box under “Plots”. Below is what this plot looks like for our age as a function of sex example. The grey dot marks the Bayes Factor value for the prior which we chose: here, I took the scale parameter of 0.1, corresponding to an 80% chance of effect sizes between -0.3 and 0.3.



After I had played around with the different parameters in R and done the calculations above, E.J. Wagenmakers drew my attention to the fact that, when we choose a probability of 50% rather than 80%, the width parameter is equal to the bound of the range of effect sizes that we expect. So, if we are less confident about how big we expect the effect to be (and less keen to mess around with the different parameter values in R), we can simply write (below, I assume the default prior; if you have different expectations about the effect size, replace all mentions of the value “0.707” with your preferred effect size):

“The prior is described by a Cauchy distribution centred around zero and with a width parameter of 0.707. This corresponds to a probability of 50% that the effect size lies between -0.707 and 0.707. [Some literature to support that this is a reasonable expectation of the effect size.]”

After having written most of the above, I also realised that I had not updated JASP for a while, and the newer version allows us to change the location parameter of the Cauchy, as well as its width. Thus, it is possible to change the mode of the distribution to the effect size that you consider the most likely. Then, you can calculate the new effect size range by taking the values from the table above, and adding the location parameter to the upper and lower bound, for example:
“The prior is described by a Cauchy distribution centred around 0.707 and with a width parameter of 0.707. This corresponds to a probability of 50% that the effect size lies between 0 and 1.414. [Some literature to support that this is a reasonable expectation of the effect size.]”

You can find more information about location parameter shifting in Gronau, Q. F., Ly, A., & Wagenmakers, E.-J. (in press). Informed Bayesian t-tests. The American Statistician. https://arxiv.org/abs/1704.02479. For step-by-step instructions, or in order to get hands-on experience with constructing your own prior parameters, I also recommend going through this blogpost by Jeff Rouder: http://jeffrouder.blogspot.com/2016/01/what-priors-should-i-use-part-i.html.


Correlations
Now, let's move on to correlations. Again, our goal is to make the statement:

“The prior is described by a beta-distribution centred around zero and with a width parameter of x. This corresponds to a probability of P% that the correlation coefficient lies between -y and y. [Some literature to support that this is a reasonable expectation of the effect size.]”

When you generate a Bayesian correlation matrix in JASP, it gives you two things: the Pearson's correlation coefficient (r) that we all know and love, and the Bayes Factor, which quantifies the degree to which the observed r is compatible with the presence of a correlation over the absence of a correlation. The prior for the alternative hypothesis is now described by a beta-distribution, not by a Cauchy. More details about the beta-distribution can be found in footnote 2. For the less maths-inclined people, suffice it to say that the statistical parameters of the distribution do not directly translate into the parameters that you input in JASP, but never fear: the table and text below explain how you can easily jump from one to the other, if you want to play around with the different parameters yourself.

The default parameter for the correlation alternative prior is 1. This corresponds to a flat line, and is identical to a so-called uniform distribution. Beware that describing this distribution as “All possible values of r are equally likely” will trigger anything from a long lecture to a condescending snort from maths nerds: as we're dealing with a continuous distribution, a single value does not have a probability associated with it. The mathematically correct way to put it is: “If we take any two intervals (A and B) of the same length from the continuous uniform distribution, the probability of the observation falling into interval A will equal the probability of the observation falling into interval B.” Basically, if you have no idea what the correlation coefficient will be like, you can keep the prior as it is. As with the t-test, you can test directional hypotheses (r > 0 or r < 0).

Changing the parameter will either make the prior convex (U-shaped) or concave (upside-down-U-shaped). In the former case, you consider values closer to -1 and 1 (i.e., very strong correlations) to be more likely. Perhaps this could be useful, for example, if you want to show that a test has a very high test-retest correlation. In the latter case, you consider smaller correlation coefficients to be more likely. This is probably closer to the type of data that, as a psychologist, you'll be dealing with.

So, without further ado, here is the table from which you can pick the prior (first column), based on the effect size range (possible correlation coefficients) that you expect with 80% certainty:

JASP parameter (A) | Range of effect sizes (r) | Statistical parameters (a, b) | Statistical inputs (R)
1 | -0.75 to 0.75 | 1, 1 | 0.125 to 0.925
1/3 | -0.5 to 0.5 | 3, 3 | 0.25 to 0.75
1/7 | -0.25 to 0.25 | 7, 7 | 0.375 to 0.625
10 | < -0.75 or > 0.75 | 0.1, 0.1 | 1 - (0.125 to 0.925)

The R code for the first row (if you want to play around with the parameters and ranges) is:
pbeta(0.925,1,1) - pbeta(0.125,1,1)
Note that the parameters in the R code come from the third and fourth columns of the table above: to convert the statistical parameters in the code (here, a = b = 1) to the JASP parameter (A), you use the formula A = 1/a. To get from the correlation coefficient r to the value that goes into the R code (which I call R in the table), you use the formula R = 0.5 + 0.5*r.
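To play around with these conversions without doing them by hand, here is a small R helper that wraps the two formulas above (my own sketch; prior_prob_r is not a JASP function):

prior_prob_r <- function(r_lo, r_hi, A) {
  # Probability that the correlation lies between r_lo and r_hi under the
  # stretched beta prior with JASP parameter A, using a = 1/A and R = 0.5 + 0.5*r.
  a <- 1 / A
  pbeta(0.5 + 0.5 * r_hi, a, a) - pbeta(0.5 + 0.5 * r_lo, a, a)
}
prior_prob_r(-0.5, 0.5, 1/3)   # roughly 0.79, close to the 80% in the table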

In the table, I have also included a parameter setting where you can specify that you expect, with 80% probability, that |r| > 0.75 (final row). Theoretically, you could use this, for example, if you want to compare the hypothesis that a test has good test-retest reliability, though bear in mind that you would be comparing this hypothesis to one where the test-retest reliability is zero. In JASP, I realised after adding this row, this will not be of any use to you, because the largest parameter that you can specify is 2.

As with the t-test, when you chop the beta-distribution in half (i.e., when you test a directional hypothesis), the upper bound (y) should be identical to the upper bound in the non-directional prior.


Conclusion
The blogpost aims to provide a psychologist reader with a sufficiently deep understanding to justify the choice of prior, e.g., for a Registered Report. If you've worked your way through the text above (in which case: thank you for bearing with me!), you should now be able to choose a prior parameter in JASP in such a way that it translates directly to the expectations you have about possible effect size ranges.

To end the blogpost on a more general note, here are some random thoughts. The layout of JASP is based on SPSS, but unlike SPSS, JASP is open source and based on the programming language R. JASP aims to provide an easy way for researchers to switch from frequentist testing in SPSS to doing Bayesian analyses. Moving away from SPSS is always a good idea. However, due to the similar, easy-to-use layout, JASP inherits one of the problems of SPSS: it's possible to do analyses without actually understanding what the output means. Erik-Jan Wagenmakers (?) once wrote on Twitter (?) that JASP aims to provide “training wheels” for researchers moving away from frequentist statistics and SPSS, who will eventually move to more sophisticated analysis tools such as R. I hope that this blogpost will contribute a modest step to this goal, by giving a more thorough understanding of possible prior parameters in the context of Bayes Factor hypothesis testing.

-----------------------------
I thank E.J. Wagenmakers for his comments on an earlier version of this blog post. Any remaining errors are my own.
-----------------------------
Edit (17.9.2019): I changed the title of the blogpost, to mirror one that I wrote a few months ago: "P-values 101: An attempt at an intuitive but mathematically correct explanation".
-----------------------------
1 If we simply cut the Cauchy distribution in half, we no longer have a probability distribution: a probability distribution, by definition, needs to integrate to 1 across the range of all possible values. If we think about a discrete distribution (e.g., the outcome of a die toss), it's intuitive that the sum of the probabilities of all possible outcomes should be 1: that's how we can infer that the probability of throwing a 6 is 1/6, given a cubical die (because we also know that we have 6 possible, equiprobable outcomes). For continuous distributions, we have an infinite range of values, so we can't really sum them. Integrating is therefore the continuous-distribution equivalent of summing. Anyhow: if we remove half of our Cauchy distribution, we end up with a distribution which integrates to 0.5 (across the range from 0 to infinity). To change this back to a probability distribution, we need to multiply the function by a constant, in this case, by 2. If you look at the plots for the full Cauchy prior versus the directional priors, you will notice that, at x = 0, y ≈ 0.5 for the full Cauchy, and y ≈ 1 for the two truncated Cauchys. For calculating the probability of a certain range, this means that we need to multiply it by two. Which is easy for our case: we start off with a given range (-y to y) of our full Cauchy and cut off half (so we get the range 0 to y), so we lose half of the area, cutting the probability of getting values in this range in half. Then we multiply our function by 2, because we need to turn it back into a probability distribution: we also multiply the area between 0 and y by two, which gives us the same proportion that we started with in the first place.
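For anyone who prefers numbers to words, this argument can be checked with a couple of lines of R (any values of y and the width parameter s will do):

y <- 0.7; s <- 0.707
pcauchy(y, 0, s) - pcauchy(-y, 0, s)        # area from -y to y under the full Cauchy
2 * (pcauchy(y, 0, s) - pcauchy(0, 0, s))   # area from 0 to y under the truncated Cauchy
# Both lines give the same number, as argued above.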

2 A beta-distribution looks very different to a Cauchy or normal distribution, and takes different parameters (on wikipedia, denoted α and β). When both of the parameters are equal (α = β), the distribution is symmetrical. In JASP, you can only adjust one number: the same value is then used for both parameters, so the prior distribution is always symmetrical. The number which you can adjust in the JASP box (let's call it A) does not equal the true parameter that defines the beta-function (let's call it a), as it has an inverse relationship to it (A = 1/a). The other difference between the actual beta-distribution and the JASP prior is that the beta-distribution is defined for values between 0 and 1, while the JASP prior is stretched between values of -1 and 1. Thus, when using the function to calculate the probabilities of different ranges of r under different parameters, we need to transform r to a value between 0 and 1 before we can make a statement about the size of the correlations. I hope the table with parameters and effect size ranges above makes this clearer.

Thursday, August 8, 2019

On grant proposal writing


The year 2018 was very successful for me in terms of grants: my success rate skyrocketed from close to 0% to 100%. It's a never-ending story, though, so now I'm finding myself writing even more grant proposals, which led me to procrastinate and write a blog post about grant proposal writing. Given my recent successes, I could frame this blog post as advice for other aspiring grant writers. However, frankly, I have no idea why my success rate changed so abruptly. Also, I don't really want to sound like this guy.

Nevertheless, I got a lot of advice from different people about grant writing over the years. Maybe it can be useful to other people. It will also allow me to organise my own thoughts about what I should consider while writing my proposals. So, here goes:

Advice #1: Be lucky. Even if your proposal is amazing, the success rates tend to be low, and many factors aside from the grant quality will affect whether it is successful or not. You may want to repress this thought while writing the proposal. Otherwise the motivation to invest weeks and months into planning and getting excited about a project will plummet. However, as soon as I submit the proposal, I will try to assume an unsuccessful outcome. First, it will motivate me to think about back-up plans, and second, it will weaken the bitterness of the disappointment if the funding is not granted.

One aspect where luck plays a large role is that a lot depends on the reviewers. In most schemes that I have applied for, the reviewer may be the biggest expert in the field, but they may also be a researcher working on a completely different topic in a vaguely related area. So a good grant proposal needs to be specific, to convince the biggest expert that you have excellent knowledge of the literature, that you have not missed any issues that could compromise the quality of your project, and that every single detail of your project is well thought through. At the same time, the proposal needs to be general, so that a non-expert reviewer will be able to understand what exactly you are trying to do, and the importance of the project to your topic. Oh, and, on top of that, the proposal has to stay within the page limit.

Throughout the last years, I have received a lot of very useful advice about grant writing, and now that I’m trying to summarise it all, I realise how conflicting the advice sometimes is. I have asked many different people for advice, but most of them are regularly involved in evaluating grant proposals. This is one demonstration of how important luck is: Maybe you will get a grant reviewer who expects a short and sexy introduction which explains how your project will contribute to the bigger picture of some important, global social problem (e.g., cancer, global warming). Maybe you will get a reviewer who will get extremely annoyed at an introduction which overblows the significance of the project.

Advice #2: Think about your audience. When I search for possible reasons for my abrupt change in success rate, this is a possible candidate. The advice to think about one’s audience applies to everything, and it is widely known. However, for a beginning grant writer it is sometimes difficult to visualise the grant reviewer. Also, as I noted above, a reviewer may be the biggest expert in the field, or it could be someone who doesn’t know very much about it. Thus, in terms of the amount of detailed explanations that you put into the proposal, it is important to find the right balance: not to bore the reviewer with details, but provide enough details to be convincing. The prior probability of the reviewer being the biggest expert is rather low, if we consider that non-experts are much more common than people who have very specialised knowledge about your specific topic. Thus, when in doubt, try to explain things, and avoid acronyms, even if you think that it’s assumed knowledge for people in the field.

Reviewers are, in most cases, academics. This means that they are likely to be busy: make the proposal as easy-to-read as possible. Put in lots of colourful pictures: explaining as many things as possible in figures can also help to cut the word count.

This also means that they are likely to be elderly men. This realisation has brought up a very vivid image in my mind: if the proposal is ‘good’, the reviewer should come home to his wife, and, while she passes him his daily glass of evening brandy, he will tell her (read this in a posh British accent, or translate in your head to untainted Hochdeutsch): “My dear, I learned the most interesting thing about dyslexia today…!”

Advice #3: Get as much feedback as possible. Feedback is always good: I try to incorporate everything anyone tells me, even if in some cases I don’t agree with it. Thoughts such as “Clearly, the person giving the feedback didn’t read the proposal thoroughly enough, otherwise they wouldn’t be confused about X!” are not very helpful: if someone giving you feedback stumbles over something, chances are that the reviewer will, too. Sometimes, the advice you get from two different people will conflict with each other. If at all possible, try to find a way to incorporate both points of view. Otherwise, use your best judgement.

Most universities have an office which helps with proposal writing: they are very helpful in giving advice from an administrative perspective. Different funding agencies have different requirements about the structure and the like (which is also why I'm trying to keep the advice I summarise here as general as possible). Grant offices are likely to give you good advice about the specific scheme you are applying for. They may also allow you to read through previous successful applications: this can be helpful for getting a better idea of how to structure the proposal, how to lay out the administrative section, and other issues that you may have missed.

Colleagues can give feedback about the content: they will point out if something is more controversial than you thought or if there are problems with some approaches that you have not thought about, and they can provide any important references that you may have missed. Ask colleagues with different backgrounds and theoretical ‘convictions’. Friends and relatives can help to make sure that the proposal is readable for a non-expert reviewer, and that the story, as a whole, makes sense.

Conclusion
In some ways, submitting a grant proposal is a lot like buying a lottery ticket that costs a lot of time and that your career probably depends on. However, it is also the daily bread of anyone striving for an academic career, so it is important to try to make the best of it. In an attempt to end this on a positive note (so I feel motivated to get back to my proposal): applying for ‘your own’ project may give you the flexibility to work on something that you really care about. It takes a lot of time, but this time is also spent on thinking through a project, which will make its execution run more smoothly afterwards.

The advice above is not comprehensive, and from my own biased view. I would be very happy to read any corrections or any other advice from the readers in the comments section.