Critical Code Studies from Afar

As “some” people have commented (thanks LK), I have missed my once-a-month mark for updating my journal by quite some time. The reason is that I was occupied by a conference that ran for the whole of last week. I had applied with a one-day workshop on computational methods for studying source code. I collected a few important takeaways from the workshop, and one personal one.

The conference in question was DHd 2025, the largest conference of the German-speaking Digital Humanities, which brings together scholars from Switzerland, Germany and Austria every year. I was happy to be present with my workshop because, apart from it, there were no topics of interest to me on the programme. This way, I was at least able to meet and talk with people who share an interest in working with source code. That is not to imply that the conference was boring or irrelevant. It just means that I’m digging in some niche corners. (Reflection)

I started organising the workshop “Quellcodekritik aus der Ferne” (page 70) some months ago. I ended up documenting most of it in a dedicated repository (which is in German, of course). Roughly summarized, the workshop had four blocks: two with theoretical inputs, and two for practical exercises. I was lucky enough to persuade Vera Piontkowitz, Stefan Höltgen, and Daniel Gammenthaler to collaborate and bring in their expertise and perspectives. Without them the workshop wouldn’t have been half the fun. (Takeaways)

Reflection

Despite having a good time at the conference, it cost me a massive amount of energy. Disrupted routines, sleeping in another place, being social all day long, not eating properly, and sitting too much left me with more migraines than usual. I tried to balance that out by engaging with the conference only as much as necessary. After my workshop, I visited only one or two panels per day (out of up to four). I walked a lot and tried to enjoy the sun as well.

Nonetheless, I wouldn’t have missed it. I got a lot of good feedback after the workshop, and it opened the door to some seriously interesting discussions. At one point I felt ashamed to admit that I probably profited the most from the day I had organized. People assured me that this counts as a very good outcome, so I take that as something positive. I have a bunch of Takeaways that I need to weave into my research now.

The thing that cost me the most energy was myself. As usual, I became “stressed” weeks in advance. I have experienced this as something negative again and again, and it actually led me to get a mental health check after finishing my Master’s. Only this time I figured out that I’m not really stressed. At least not in the sense of being late for the train, searching for lost keys, knowing that friends are about to become parents, or the moments shortly before a presentation. Reflecting on it, the weeks before the workshop felt like a negative hyperfocus. The workshop and its organization occupied my attention from the moment I got up until I went to bed. I talked with a neurospicy friend and the description of a negative hyperfocus resonated a lot with them. So I guess that’s that for now.

I’m quite happy with the outcome of the workshop. The scheduling and scope of the inputs worked very well.

I gave a basic outline of what critical code studies are and what a distant reading of source code could mean. Since we don’t have any tools or approaches ready yet, this aspect was more of a hypothesis.
Vera Piontkowitz (vp) gave insights into how multimodal distant reading and viewing is applied at the University of Leipzig. Her perspective was essential for showing what a distant methodology can mean beyond working with text.
Stefan Höltgen (sh) talked about what kind of text source code actually is, and what some aspects of importance are. He also presented his ongoing work with printed listings.
Daniel Gammenthaler (dg) brought in a critical view on source code studies, outlining how most digital-born artefacts are actually in a non-source-code state, implying the difficulties of studying such artefacts.

The first practice block worked quite well as a way of engaging with source code on a qualitative level. We worked in groups on

the Morris worm
the source code of VICE
BASIC listings in old computer magazines
walkthroughs

Engaging the participants in a qualitative reading of source code took off the rough edges and showed them that code can actually be a source for humanistic or heuristic inquiry. That is a fundamental point in critical code studies.

The second practice block worked out as well, but I had a hard time preparing adequate material and exercises, since I didn’t know who would actually attend. It turned out that most participants were in the camp of simply being interested. A key point of discussion that I wanted to launch is the problem of gaining the technical expertise needed to study source code, which isn’t the easiest feat to accomplish. The exercises I prepared were

Investigating source code through AI/LLMs (which I was super hesitant to propose, since there are ethical issues in working with AI models)
Getting some basic proficiency with regex, the key approach for dealing with patterns in source code
Working with my Python scripts from Distant Reading The Vice Source Code
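To give an idea of what the regex exercise is about, here is a minimal sketch in Python. It is not the workshop material itself: the C-like snippet, the function names and the list of marker words are all invented for illustration. The patterns are also deliberately naive, so comment-like text inside string literals would be matched as well.

```python
import re

# Hypothetical C-like snippet, invented for illustration only.
SAMPLE = """\
/* init the sound chip */
int sid_init(void) {
    // TODO: check sample rate
    return 0;
}

static void sid_reset(void) {
    /* FIXME: registers are not cleared */
}
"""

# /* ... */ block comments (non-greedy, may span lines) or // line comments.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)

# Marker words that often flag unfinished or contested code (assumed list).
MARKER_RE = re.compile(r"\b(TODO|FIXME|HACK|XXX)\b")


def find_comments(source):
    """Return all comments found in the source text, in order of appearance."""
    return COMMENT_RE.findall(source)


def count_markers(source):
    """Count marker words, looking only inside comments."""
    counts = {}
    for comment in find_comments(source):
        for marker in MARKER_RE.findall(comment):
            counts[marker] = counts.get(marker, 0) + 1
    return counts


if __name__ == "__main__":
    for comment in find_comments(SAMPLE):
        print(comment)
    print(count_markers(SAMPLE))
```

Run over a whole code base instead of one snippet, counts like these could be a first, rough entry point for deciding where a closer qualitative reading might be worthwhile.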

Most of the participants went for working with AI/LLMs, which didn’t give me cause for celebration, but at least it gave them some positive experiences. Some went for the regex, which made me very happy. Working with regex is like doing taxes: always a moment of joy.

Last but not least, I need to work on my transitions. Being low on energy, I wasn’t able to lead from block to block, or into the discussions before lunch and the final wrap-up. That is something I can improve next time. It also left me unsure whether the participants gained much from the workshop. But after somebody told me they want to include the workshop material in their teaching, and with somebody else already organising a follow-up event, I can be sure that it worked for at least some of them.

Takeaways

Source code never exists by itself, but is one of the many modi or aggregates that software can take form in. Source code is the plain text description of processes that run on an electronic machine with aesthetic and kinetic aspects.
When working with source code, we need to be transparent with our premises. What is it that interests us, what do we assume?
Relevant aspects and parameterization of stylistic properties (sh)
- Programming languages are languages (syntax, semantics, pragmatics).
- Source code…
  - …is a type of text.
  - …exists on different substrates.
  - …is a directive act of speech, as it intends to achieve change in the world.
Source code has aesthetic aspects. It needs to be understood in two-dimensional space. See, for example, whitespace in Python, or special syntax like : in BASIC, which allows several commands per line to maximize screen usage. This point is relevant insofar as it adds another dimension to a “distant” approach to studying source code. Not only structure and pattern matter, but also the visuality of code as presented on its substrate.
After the workshop I was wondering if “distant” is a fitting term for this approach. It might be too loaded already, given what we know from distant reading/viewing. It’s clear that studying source code needs an approach that reflects on its own technicity and depends on the use of programmable tools.
Research questions are (as always) more important than corpus/datasets and methods. That point came up several times during the workshop. It’s especially relevant in regard to the scope of source code, which can grow quickly beyond what we can inquire qualitatively.
A growing collection of methodological considerations: scalable viewing, genetic analysis (phylogenetics), network analysis, parameterization, visualizations, versioning.
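The point about visuality and parameterization can be made a bit more concrete with a small Python sketch. It is only an illustration under my own assumptions, not something presented at the workshop: the function names and both samples are hypothetical, and the BASIC splitting is simplified (it ignores that colons after REM belong to the comment).

```python
def split_basic_statements(line):
    """Split one BASIC line into statements at colons outside string literals."""
    statements, current, in_string = [], [], False
    for ch in line:
        if ch == '"':
            in_string = not in_string  # entering or leaving a string literal
        if ch == ":" and not in_string:
            statements.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    statements.append("".join(current).strip())
    return [s for s in statements if s]


def indentation_depths(source):
    """Return the number of leading spaces for every non-empty line."""
    return [
        len(line) - len(line.lstrip(" "))
        for line in source.splitlines()
        if line.strip()
    ]


# Hypothetical samples, invented for illustration only.
BASIC_LISTING = [
    '10 PRINT "HELLO: WORLD":GOTO 10',
    "20 FOR I=1 TO 10:PRINT I:NEXT I",
    "30 END",
]
PYTHON_SAMPLE = "def f(x):\n    if x:\n        return 1\n    return 0\n"

if __name__ == "__main__":
    # How densely is each BASIC line packed with statements?
    print([len(split_basic_statements(line)) for line in BASIC_LISTING])
    # What does the left-hand "silhouette" of the Python code look like?
    print(indentation_depths(PYTHON_SAMPLE))
```

Statements per line and indentation depth are just two possible parameters. They describe how code is laid out on its substrate rather than what it does, which is the dimension this takeaway is pointing at.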

Quellcodekritik aus der Ferne?

What follows is a condensed list of questions that I deem relevant for my dissertation.

What is a meaningful unit of code?
What am I looking at/interested in: processes, results and outputs, practices?
What does distant mean in my context, and is there a better term?
Where is the threshold between creating access to source code, and removing the ability to critically reflect on tools and technicity?
How can a research question be translated into programmable tools?