Untitled, by stournsaeh (2019)

This piece was first published on Deep Dives as a part of the series "Bodies of Evidence". 

Kate Sim is a DPhil student at the Oxford Internet Institute researching the datafication and automation of sexual harassment reporting systems in US higher education. Kate has nearly a decade of experience in community organising, survivor advocacy and social policy in the US, UK, and South Korea. As an undergraduate student at Harvard University in 2014, she was part of a national campaign that informed the White House Task Force created to address campus sexual violence. In this interview, she takes a deep dive into some of her research.

Welcome to Deep Dives, Kate! Could you start by giving us a broad sense of how digital technologies are trying to deal with sexual harassment and assault?

Over the last few years, I’ve been seeing what are broadly termed ‘anti-rape technologies’ popping up in popular media — both on my campus and at the national level. From wearable panic buttons to reporting software, a ‘cottage industry’ of campus safety technologies has been in development. As I saw more and more of these, I began to get interested in software systems that cater to reporting issues and what they promise to do — the ways in which they promise objectivity and trustworthiness by building systems that are encrypted and third-party driven, rather than police or campus administration driven.

In my research, I’ve identified two models of data-driven reporting tools. The first model essentially looks a lot like a case management system. Usually created by third-party vendors, these tools allow the user to report any number of safety issues to the appropriate campus safety administrator. These issues can range from bike theft and noise complaints to hazing and sexual harassment. Whatever the issue may be, the goal is to get the right authorities informed as soon as possible. This model sees sexual harassment as just one of many potential data points. This is what I call a universalising approach to campus safety.

The second model takes a partial and exceptionalising approach towards sexual harassment. Systems designed in this mode exceptionalise sexual harassment, especially sexual assault, as uniquely difficult. It is uniquely difficult for users to come forward and talk about their experience and uniquely difficult for campus authorities to deal with it in a confidential manner.

‘El amigo desconocido’, by Xavi Garcia (2017)


So how do these data-driven reporting systems actually work?

Some products in this second model are based on the ‘information escrow theory’ by Ian Ayres, a Yale University professor. This theory suggests that individuals with socially valuable but controversial information (say, whistleblowers and in our case, survivors) face a unique barrier to coming forward because the first person to do so faces significant social stigma. This is the ‘first mover disadvantage’. There’s tremendous disadvantage to being the first person to articulate socially undesirable information, like whistleblowing or making allegations. What this theory proposes is that if we can lessen that disadvantage by ensuring that the burden is shared and spread across, then it might incentivise people to come forward earlier, and enable more people to come forward. Earlier disclosure and more reports would raise the credibility of the claim being made.

Data-driven reporting systems try to do this by using encryption and a matching algorithm. Think of it like a dropbox. They essentially create a third-party dropbox to collect all allegations of sexual harassment. A matching algorithm runs through these allegations to identify any repeat offenders.

For example, say User 1 submits a record to the system saying that they had a really uncomfortable exchange with Professor A. User 2 then separately submits a record alleging sexual harassment by Professor A. The system will store both these records independently. But once it identifies Professor A as a repeat offender, it’ll ping both users individually to say: ‘Hey, our system has told us that there’s been another incident involving your offender. Would you now consider reporting this formally?’ That’s the broad gist of these systems.
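The matching step described above can be illustrated with a minimal sketch in Python. This is a toy model of the information-escrow idea, not any vendor's actual implementation: reports are held independently, and submitters are only prompted once the same accused party appears in more than one record. A real system would additionally encrypt submissions and match on protected identifiers rather than plain names.

```python
from collections import defaultdict

class EscrowReportStore:
    """Toy information-escrow store: records are held independently,
    and reporters are notified only when the accused party appears
    in more than one record (a 'repeat offender' match)."""

    def __init__(self):
        self._reports = defaultdict(list)  # accused_id -> list of reporter_ids
        self.notifications = []            # (reporter_id, accused_id) prompts

    def submit(self, reporter_id, accused_id):
        """Store a record; if this creates a match, prompt every
        reporter who has named this accused party."""
        self._reports[accused_id].append(reporter_id)
        matches = self._reports[accused_id]
        if len(matches) >= 2:
            for r in matches:
                self.notifications.append((r, accused_id))

store = EscrowReportStore()
store.submit("user1", "profA")  # held in escrow, no notification yet
store.submit("user2", "profA")  # match found: both users are prompted
```

After the second submission, both users receive a prompt, mirroring the ‘Hey, our system has told us that there’s been another incident involving your offender’ message described above.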

What is the logic on which these reporting tools are built?

There are two threads to the underlying logic of these systems. One is an assumption that reporting is an actionable solution. There’s this idea that if we can get people to report more, then it’ll give us more data. More data can become better data, and better data can become good data. According to this assumption, incentivising more reporting is a solution campus authorities can act upon because it gives them more data.

The second thread of logic is that through collecting more data, we can make sexual harassment knowable. And if we know enough, then we can predict it, prevent it, and address it.

With these guiding assumptions in place, the propelling logic sees sexual harassment as a credibility problem. The problem is that survivors are not reporting, and they are not reporting because they are not believed. So if survivors can be believed, then they’ll report more; and if they report more, then there will be more data so that campus authorities can prevent, predict, and respond. To frame the problem as a credibility problem (which is not untrue) is to see it as a data problem. If we just have more information, then we can act upon it. If we can act upon it, then it’s going to solve the credibility deficit that victims often face. We see how circular this logic can be.

But this is a fundamental misdiagnosis of why survivors don’t report. Even if survivors were to be believed, there are so many reasons why people would not report. Reporting is expensive, time-consuming, laborious, and can be re-traumatising. If the incident involves witnesses and friends, they might not want to get so many people involved. If family members, partners or children are involved, they might not want to compromise their relationship by reporting. There are just so many reasons why people would not consider reporting as an appropriate or even ideal remedy. Which is not to say that they wouldn’t go about seeking help in other ways. It just means that reporting to the authorities may not be the ideal solution, even if we were to address the credibility problem.

Untitled, by stournsaeh (2019)


You’ve really given us something to chew on. Digging a layer deeper, what values and assumptions, either around sexual assault and harassment, or around data, design and technology, do these tools express?

System vendors will say that their products are a great all-purpose reporting tool. But when you actually look at the kinds of questions that are asked and the kinds of reporting choices that are offered, we see that by ‘sexual harassment’ what they really mean is heterosexual, penetrative sexual assault. This leaves out intimate partner violence, cyber harassment, stalking and other forms of non-sexual violence that are actually integral parts of how people experience gender-based power imbalances. These forms of violence are not actively considered by such systems, even though they promise to be amenable to all of these experiences.

Another assumption that these systems make is that gender is isolated from other identity categories. While some of these systems ask for other identity-based information — sometimes about the perpetrator, sometimes about the user — they largely see sexual harassment as a discrete incident that is isolated from other forms of social identities.

For example, you may be a woman of colour who experiences a peer in your classroom making a series of comments about your hair or your dress in a very racialised way — and this precedes the student making advances towards you at a party one day. That’s an accumulation of long-term incidents that add up to harassment that is simultaneously racialised and sexualised.

But these systems, because they are data-driven and data-oriented, cannot quite comprehend that people’s lives are multiplicitous, that identities are complicated, and that harassments are often experienced as an accumulation of incidents rather than a single one.

That’s a super-important point. Does this mean that only some cases of sexual harassment get space and credibility in these systems?

I have two layers of response to that. First is that these tools provide differential opportunities for credibility, so certain survivors’ narratives are privileged over others. If you are a user whose experience of violence easily maps onto the system’s assumptions about violence, then great. You have a system that provides you with the right questions, a credible form complete with a time-stamp, and various design choices to make your complaint more credible than one delivered as a journal entry or a typed-up Word document. The system rewards users whose experiences conform to the system’s internal logic.

But which kinds of survivors and what kind of violence are rewarded by the system? This brings us back to the question of credibility — how we allocate credibility differentially to certain bodies and stories, but not to others. To explain this, feminist philosopher Miranda Fricker’s theory of epistemic injustice can be really helpful. Fricker shows that social stereotypes help us decide who and what we choose to believe. For example, a stereotype I often heard growing up is the trope of the Asian student who is good at math. Some would argue that this is a positive stereotype because it is in favour of Asian people. Others would argue that this is a negative stereotype because it makes a troubling association between Asian people and math. Positive or negative, this stereotype influences our social and material world. So if I were to post a video of me giving a calculus lecture on YouTube, viewers might take my lecture seriously even though I know nothing about calculus! Note how the stereotype facilitated viewers’ decisions: they saw me as a credible source and so they took me seriously and believed me. Because our social world is organised by identities like race, gender and class, we see how these identities influence whom and what we choose to believe. In other words, identity categories shape credibility.

‘Wanderer’, by Tobias Kroeger (2016)


So how do such social stereotypes tie in with these reporting tools?

When it comes to sexual violence, the kinds of qualities associated with women — that women are hysterical and duplicitous — give less credibility to women. ‘We can’t trust her because she’s lying’ or ‘she’s exaggerating’. We’re familiar with this kind of victim blaming mentality.

Reporting systems acknowledge this problem of credibility. Their founding assumption is that if victims can be believed, they’ll report more. I think this assumption is partly made because these reporting tools are often created by survivors themselves. So they understand firsthand the role credibility plays in shaping victims’ experience with reporting and seeking help.

But at the same time, to only look at credibility to understand sexual violence glosses over other really important factors. Consider the raced sexual politics of lynching in the US. Allegations made by white women of sexual aggression from black men justified the lynchings of countless black men. This history teaches us how race and gender work together to position certain bodies as innocent and others as dangerous.

It also cautions against thinking about sexual violence only through the lens of gender and credibility. When we say ‘believe women’ or ‘believe survivors’, what kind of women, what kind of survivors, and what kind of believing are we talking about?

In other words, there are tensions between the stories that survivors tell and the data that is recorded.

Exactly. When do stories become data, and when does data become stories? How technology companies use qualitative and quantitative data is interesting here. Reporting system vendors use qualitative data such as testimonials, and quantitative data such as infographics or prevalence statistics about campus sexual assault. At first these seem very different, but when we examine how they are used and for what ends, we see that they’re not that different after all.

For example, a common prevalence statistic used to talk about campus sexual assault in the US is the figure that ‘1 in 5’ college women experience sexual assault before they graduate. For some, that stat seems really small; for others, it seems really big. The accuracy of this figure aside, it’s interesting to see both advocates and critics of campus sexual violence use this stat for different ends. App vendors, too, use this in different ways. Some support it, others critique it.

Consider also how companies use qualitative data. Vendors often share one exceptional story of how a user saved herself from a potential assault by using the app. Well, how often does this happen? How many people actually use this app? Are there other reporting options that could have been better or worse? When we ask these questions, we can see that the testimonial, much like the infographics and stats, serves to legitimise the app.

In both cases, we see data becoming stories and stories becoming data. Rather than disputing the accuracy of each, I am interested in asking what explanatory work is being done when app companies use different kinds of data.

‘#MeToo’, by Lauren Mitchell (2016)


You mentioned that sexual assault survivors have been involved in building some of these reporting tools. Has this influenced the design, logic or thinking behind these tools in any way that differentiates them?

Absolutely. A lot of these systems are actually designed by survivors themselves, whether it’s the CEO or the engineers. In my fieldwork, I’ve certainly seen people from all the different aspects of product development who are either themselves survivors, or have had their dear ones affected by sexual violence. So these technology developers do have direct and real experience of sexual violence.

But I want to draw attention to the fact that even our experience of violence, as horrible as it may be, is always a partial experience. Someone who has experienced sexual assault at a party by an unknown person may not actually be able to fully understand the experience of being battered by an intimate partner. Those are two very different cases. We group them together as gender-based violence because gender is a really important part of understanding how and why those things happen the way they do. But would a survivor in one situation necessarily be able to fully understand and relate to the experience of another? A shared experience of violence may give us shared reference points, but it doesn’t guarantee an understanding or solidarity.

In technology design, there’s been a movement towards co-designing, which aims to bring in people from affected communities to be a part of the design process. And it has been really powerful in many ways. Even just comparing the incumbent or universalising model and the newer or exceptionalising model, you can already see the attention that’s paid to the design and how much they try to really do right by their communities. We see the ways in which educated university students might be part of the user research phase of product development, whether through online forums or through in-person group observations by company user researchers. Those are some of the points at which survivors can have some influence on the design and how designers think about their product.

But designing the interface is just one part of the life cycle of building technology — and designing for a particular form of sexual violence is just one part of understanding the depth of how gender-based power relations can manifest as sexual harassment.

‘Women supporting women’, by Monica Garwood (2018)


Let’s talk about features. What are some of the most promising features that you’ve seen in reporting software?

I would bring attention to two features in particular. One is a digital version of a forensic interviewing technique called progressive disclosure. Rather than saying to a survivor, ‘Tell me everything that happened in this one time,’ progressive disclosure asks you different questions, dividing up who, what, when, where and why, trying to separate sensory details and facts from experience.

Using progressive disclosure means that the survivor has greater autonomy to reflect on their experience and put together an account of what happened over a longer period of time, rather than having one single intimidating appointment with a police officer to remember everything. Most people can’t remember what they ate a week ago; add to that the difficulty of remembering traumatic experiences, and it’s only reasonable that survivors can’t remember details. Using online forms can really help with practising progressive disclosure.
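As a rough sketch of how an online form might implement progressive disclosure, the questions can be split into small stages (the who, what, when, where and sensory details mentioned above) that a user completes across sessions. The stage names and question wordings here are hypothetical, not taken from any actual product:

```python
# Hypothetical staged-question structure for progressive disclosure:
# each stage is small, and answers accumulate across sessions rather
# than being demanded in one sitting.
STAGES = [
    ("who",     ["Who was involved?"]),
    ("what",    ["What happened, in your own words?"]),
    ("when",    ["When did it take place?"]),
    ("where",   ["Where did it take place?"]),
    ("sensory", ["Any sensory details you remember (sounds, objects, surroundings)?"]),
]

def next_stage(saved_answers):
    """Return the first stage the user hasn't completed yet,
    or None once every stage has an answer."""
    for name, questions in STAGES:
        if name not in saved_answers:
            return name, questions
    return None

# Answers accumulate over multiple sessions; the user is never
# asked to recall everything at once.
record = {}
record["who"] = "Professor A"
stage = next_stage(record)  # the 'what' stage comes next
```

The design point is that the form, not the survivor, keeps track of what has already been answered, so each return visit starts with a single small question rather than a blank page.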

The second feature that I want to highlight is the ability to explore remedies. I wouldn’t say that existing reporting tools do this particularly well. What I do hear from survivors is that they like knowing the full life cycle of reporting or seeking help. ‘What happens if I do this, what happens if I do that?’ Having these online systems with their sleek infographics and simplified information can be really helpful for survivors who just want to have a better idea of what happens when they seek different options.

Conversely, what would you like to see in these reporting tools that is currently missing?

Currently, many of these companies have a ‘B to B to C’ model — they’re a business selling to a business to serve customers. System developers sell their products to universities to serve students. There is a conflicted loyalty of sorts. To design a system that actually caters to the needs of students might compromise some of their allegiance to their contracted clients, the universities. And vice versa: if you really cater to what university administrators are looking for, you might not necessarily be serving student interests.

Because of their operating model, companies have this double loyalty that results in the kind of trade-offs we see in practice. For example, one system may practice progressive disclosure really well, but at the end, it nudges students to report to authorities — even though what got the student to use the system in the beginning might be that they didn’t necessarily have to report. Whose interests are being served here?

If I could wave a magic wand and change anything, it would be to really focus less on reporting as a solution for survivors. What I hear again and again and again from survivors is that they just want the ability to consider their options, and to do so in an autonomous and flexible way, so that they can explore what the best option is for them, whether that’s reporting it to the authorities, or seeking mental health resources.

There’s actually some exciting research coming out that suggests speaking to a non-human agent can contribute to survivors disclosing earlier on, and in more comprehensive detail. That might be a very helpful step forward in making sure survivors seek help at an earlier point. Many survivors take up to a year after the incident to disclose, and they tend to do so to their families, friends and trusted confidantes first, in progressive bits, before they finally decide to seek what they perceive as formal help. So this might be one promising area in thinking about disclosure.

What I haven’t really seen in any system, or I would also say in any policy, is a focus on a more robust understanding of what help-seeking, or justice, means for survivors beyond reporting. That’s what I’d really love to see.

This work was carried out as part of the Big Data for Development (BD4D) network supported by the International Development Research Centre, Ottawa, Canada. Bodies of Evidence is a joint venture between Point of View and the Centre for Internet and Society (CIS).