How do you become good at "doing" science? This is less of a weird question than you might think at first glance. If you are a scientist, most of your work and education has focused on producing good science, on making sure that the result stands up to the expectations of your field and maps well to reality. But how do you get there? In the social sciences, most attention goes to teaching the appropriate techniques for statistical analysis, proper randomization in experiments, how to ask questions the right way in interviews or (more recently) improving reproducibility.
What we lack, however, is guidance on the praxis of doing all these things, of sitting down in front of a computer day after day and doing the things that hopefully result in good science. This is in contrast to many other fields where improving the production process is a major field of activity.
If you have talked to me in person for anything longer than ten minutes, it will be no surprise to you that I am fascinated by order and efficiently running processes. Among my friends it has become somewhat of a trope to tease me about my inclination to measure stuff, automate things and do everything the "data" way, which I can't begrudge at all. I am weird that way. Since starting my PhD, this has kicked into overdrive and I have spent a good deal of time thinking about the process of doing science itself. I wrote about that in my post about One citekey to bind them and more recently in Science As Pull Requests. In the last half year or so I've discussed a couple of other aspects with people, and this post is a first stab at ordering my thoughts in writing.
Every product has a process, a sequence of steps, that leads to its creation. Academic work is no different. And if you want to improve a product, one (the?) major lever you can use is improving the process by which you make the product. One of the most famous examples is the Toyota Production System, an integrated socio-technical system, which is credited with Toyota's success in creating high volumes of cars with consistently great quality over the years. Or, in a useful quote by Alfred North Whitehead:
Civilization advances by extending the number of important operations which we can perform without thinking about them.
The better your process is, and the more automatically it flows without endless jumping around and re-adjusting, the better your outcomes. But, at least in the social sciences, there is little discussion about this process, in particular for the individual researcher sitting in front of the computer. That's not to say there is no discussion about how to be better as a scientist or how to improve science. Most advice, however, is specific to single topics, such as writing or statistical analysis, to name just two. I haven't seen many examples where the production of scientific output is treated systematically from start to finish as a process that individuals take part in every day in front of their computer (with some notable exceptions like Raul Pacheco-Vega's blog, Kieran Healy's The Plain Person's Guide to Plain Text Social Science or Inger Mewburn's Thesis Whisperer).
So, what does the academic process look like on a personal level, for the person who is doing the work? My thoughts about this are certainly biased towards the perspective of a quantitative social scientist (let me know how the process differs in your field!), but I do think they cover most writing-based knowledge work. When I think about doing science, I tend to come up with the following steps:
- Reading and understanding what I read
- Remembering what I have read
- Referencing and citing what I have read
- Data Gathering
- Data Processing
- Data Analysis
- Writing
- Editing
- Gathering Feedback
- Publishing
and, as a meta-step that ties all these steps together:
- managing myself during a day and from day to day.
Let me explain my thinking behind some of these steps in a couple of sentences. You'll note that I have more thoughts about some than others, and I'm very happy to hear your thoughts on any of them (email me!).
1. Reading and understanding what I read
All creation of new knowledge rests on previous knowledge, so you need a way to acquire that knowledge. But how do you do that? How do you read...well? How do you make sure you understand what you are reading?
I think it's obvious that improving this step of the process pays very large dividends down the line: the more you know and the better you understand what you know, the better your foundation for creating new knowledge.
This applies equally to spoken conversation. The techniques might differ, but the concepts are the same: taking in new information and understanding it.
2. Remembering what I have read
So I have read something and understand it right now, but how do I make sure I don't have to read the same information over and over again? How do I make sure I don't have to look things up in my own notes too often and that, when I do, I find what I am looking for as fast as possible?
After all, the less time I spend looking things up or searching for them, the less frustrated I am and the more I can actually write.
3. Referencing and citing what I have read
Now I have read, understood and remembered things, or can at least look them up reasonably fast. Yet at least in academia we not only have to build on previous knowledge, we also have to attribute it. This is often a major hiccup in the research process: I know that I read that in this one paper...but where did I save that? Do I have the right citation information? On top of that, I have to format my citations according to standards that differ depending on where I submit my work for publication. How do I do that most efficiently?
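One way to picture the formatting part of this step is a tiny sketch in Python: store each reference once under a citekey and render it in whatever style a venue asks for. Everything here is an assumption made for illustration. The library entry is invented, the two styles are made-up house styles and not real citation standards, and a real reference manager does far more than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """One bibliography entry, reduced to a few fields for the sketch."""
    authors: tuple  # family names only, to keep things short
    year: int
    title: str
    journal: str


# Hypothetical library: this entry is invented for illustration.
LIBRARY = {
    "doe2020example": Reference(
        authors=("Doe", "Roe"),
        year=2020,
        title="An example title",
        journal="Journal of Examples",
    ),
}


def join_authors(authors):
    """Join family names as 'A', 'A and B', or 'A, B and C'."""
    if len(authors) == 1:
        return authors[0]
    return ", ".join(authors[:-1]) + " and " + authors[-1]


def author_year(ref):
    """Render a made-up author-year style."""
    return f"{join_authors(ref.authors)} ({ref.year}). {ref.title}. {ref.journal}."


def title_first(ref):
    """Render a second made-up style that leads with the title."""
    return f'"{ref.title}", {join_authors(ref.authors)}, {ref.journal}, {ref.year}.'


# The styles are just functions, so adding a new venue means adding one entry.
STYLES = {"author_year": author_year, "title_first": title_first}


def cite(citekey, style):
    """Look up a citekey and render it in the requested style."""
    return STYLES[style](LIBRARY[citekey])


print(cite("doe2020example", "author_year"))
print(cite("doe2020example", "title_first"))
```

The point of the sketch is the separation: the reference data is entered once, and the diverging standards live in small, swappable formatting functions.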
4. Data Gathering
Whether you collect your own data, use other people's data or mix the two: this is a process with very high importance for the final work. Documenting your choices, what you collect, and how you store your data are all sub-processes you can and should optimize for the particular setting you are doing them in.
5. Data Processing
How do I make sure I get my data into the form I need for the analysis? How do I make my processing easily replicable and transparent? How do I store and manage my data? I have book-length thoughts on this and will write much more about it. This is an area we as social scientists do not get taught enough about in our standard curricula.
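To make "replicable and transparent" a bit more concrete, here is a minimal sketch in Python, under the assumption that processing is written as a fixed list of small, named steps applied to untouched raw data. The data, the column names and the steps are all invented for illustration, and a real project would read from files and probably use a data-frame library.

```python
import csv
import io

# Invented raw data standing in for a file on disk. It is never edited by hand.
RAW_CSV = """respondent,age,income
1,34,52000
2,,48000
3,29,not reported
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_missing_age(rows):
    """Keep only rows where age was recorded."""
    return [row for row in rows if row["age"].strip()]


def parse_numbers(rows):
    """Convert age to int and income to int, or None if it is not numeric."""
    cleaned = []
    for row in rows:
        income = row["income"].strip()
        cleaned.append({
            "respondent": row["respondent"],
            "age": int(row["age"]),
            "income": int(income) if income.isdigit() else None,
        })
    return cleaned


# The order of the steps is written down in exactly one place.
STEPS = [drop_missing_age, parse_numbers]


def run_pipeline(rows, steps):
    """Apply each step in order and record how many rows survive it."""
    log = [("load", len(rows))]
    for step in steps:
        rows = step(rows)
        log.append((step.__name__, len(rows)))
    return rows, log


rows, log = run_pipeline(load_rows(RAW_CSV), STEPS)
for name, count in log:
    print(f"{name}: {count} rows")
```

Because every transformation is a named function and the log records what each one did to the row count, rerunning the script from the raw data reproduces the analysis dataset, and a reader can see where observations were dropped.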
6. Data Analysis
Possibly the shortest step, but you can still optimize it. How do I make sure I know about the right tools and frameworks to use? Are there things I can do in processing that make analysis easier?
7. Writing
Now that I have found a result I think is important, how do I get it onto paper? How do I order my thoughts, how do I structure my writing, how do I make sure what I write is convincing? How do I make this process as seamless as possible so that I can spend all my energy on actually writing and not on fighting the tools I'm using?
This is a topic I have discussed a lot with colleagues and friends because I struggle a lot with putting my thoughts on paper. More on my solutions to this soon.
8. Editing
How do I re-work something I've already written? How do I make sure I keep the argument together when ripping things out and moving stuff around? How do I keep track of changes so I can be sure I lose nothing I write and can change my text to my heart's content?
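As a small illustration of keeping track of changes, the following Python sketch compares two versions of a plain-text draft with the standard library's difflib module and lists what was removed and what was added. The two drafts are invented, and this is only a toy: in practice a version control system such as Git does this job, along with keeping the full history.

```python
import difflib

# Two invented versions of the same short draft.
draft_v1 = """Process matters.
Good science needs good tools.
Feedback is rarely immediate.
"""

draft_v2 = """Process matters.
Good science needs a good process.
Feedback is rarely immediate.
"""


def changed_lines(old, new):
    """Return (removed, added): lines only in `old` and lines only in `new`."""
    removed, added = [], []
    for line in difflib.ndiff(old.splitlines(), new.splitlines()):
        if line.startswith("- "):
            removed.append(line[2:])
        elif line.startswith("+ "):
            added.append(line[2:])
        # Lines starting with "  " are unchanged; "? " lines are hints. Skip both.
    return removed, added


removed, added = changed_lines(draft_v1, draft_v2)
print("removed:", removed)
print("added:", added)
```

Nothing is lost as long as the old version is kept: the removed lines can always be recovered from it, which is what makes ripping things out feel safe.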
9. Gathering Feedback
How can I make gathering feedback maximally productive? How do I present my work properly so that others understand it and can give helpful thoughts? How do I record their feedback? How do I incorporate the feedback into my own writing?
10. Publishing
How does my edited writing become the final product? How can I make this as easy as possible on myself and others so I don't spend time manually correcting things over and over again? How do I market my product and ensure that people read it?
11. Managing myself during a day and from day to day
Last but most definitely not least, the meta-step that ties all the previous steps together: how do I steer myself through this process? How do I make sure I keep going when I don't immediately see progress in my work? How do I connect my long-term vision to my short-term work of writing one paragraph, doing one interview, responding to one email?
Academia in particular is a field where feedback is rarely immediate, and the delayed gratification of publishing something, often after years of work, can make it very hard not to feel that you're always behind and treading water.
Update: You can find my thoughts on managing yourself while you are working here.
Every one of these steps can be split into smaller steps and sub-processes, as you can see. Academia is stressful in part because of that, and I think it can get better if we think, for ourselves and as a community, through the individual steps and how we can make them better for us. This will produce better science and a better work-life balance for us as individual researchers or knowledge workers.