Open research ethics – the puppy killer scenario


Yesterday I ran a workshop called "The Art of Guerrilla Research" for ELESIG, along with Tony Hirst. I'll blog it later, but basically it was about what sort of research you can do without permission and funding, e.g. asking questions of open data (hence Tony describing the things he does).

One issue that was raised a few times was the ethics of it. The assumption has long been that anything openly available is fair game. So, for instance, there is a lot of research that uses travel blogs as its data source, and the researchers don't require the bloggers' permission to analyse or interpret them. In general, this is my stance too, but thinking through the types of things Tony does with data led me to come up with a scenario which would raise ethical issues. I offer it just as an example of how openly available data isn't quite as clear-cut as you may think.

Let us imagine that there is a heinous crime we can all agree is very bad – puppy murders (I'm using a silly example so people won't get distracted by a specific crime, but you can replace puppy murders with a small or large crime/immoral act of your choice). Tony makes an FOI request to find all the people convicted of puppy murders over the past decade. He then finds which of these have Facebook pages that are openly available. He creates an interest graph of their listed interests, and shows that puppy murderers tend to have a number of interests in common. He blogs this, just out of interest.

Someone else then comes along and finds all the people on Facebook who also have these interests, and publishes a list of 300 people who have 'puppy murderer' type interests. One of these, although entirely innocent of any puppy mistreatment, is attacked by a mob who accuse him of being a puppy murderer.

Now, this has used all openly available data, publicly and knowingly shared by the individuals. But by taking it and creating a new interpretation of that data, new knowledge has been generated which the original posters could not have foreseen. The new form of this knowledge then carries an ethical dimension. This is obviously an extreme example, but it illustrates the potential complexity of assuming all open data is fair game.


  • Kevin Ashley

    I don’t think this is an issue raised by open data at all. The ethical dilemma you describe arises whether or not Tony did his research using open data. All that open data does is make Tony’s research task quicker and cheaper. At the end, he – and the person who reuses his findings – still have questions to answer which are independent of their data sources.
    That’s not to say that there aren’t issues to consider when you have purpose-blind access to underlying data. It does mean that good and bad consequences can arise and there are those who feel strongly that we need to change things to allow the good and prevent the bad. My personal view favours openness, but perhaps because I have a greater tolerance of the varied outcomes that will arise.
    What needs to be guarded against is a false sort of openness that is skewed towards those with lots of resources. That tends to mean that large vested interests use open data to exploit us, but we as citizens don’t get the benefit.

  • Dominik Lukeš

    I have to agree with Kevin. This analogy does not apply to open data as a phenomenon distinct from others. In fact, it seems to me that knowledge from guerrilla research is subject to exactly the same scenarios as knowledge from “normal” research. Or journalism, or a casual conversation about some bit of knowledge.
    It’s the judgement and situated action that matter. The knowledge itself is only a small part of it. I wrote about this in this very long post on “Epistemology as Ethics”.

  • mweller

    Hi Kevin & Dominik – I can’t have made my case very clearly. The point I wanted to make was that there is often an assumption that ethics don’t apply when you’re using data that people have made available openly about themselves because “anyone could access it anyway”. But adding a layer of interpretation in a way that the original party could not have predicted can (though not always) bring an ethical question to bear. In that sense it becomes more like conventional research, where you would consider the ethics of the data you were gathering. Like you, I’m in favour of open approaches, and heaven knows I wouldn’t want to make open/guerrilla research as constrained as conventional research, but I was trying to demonstrate that just because you’re using pre-existing information, that may not make it an ethics-free zone.

  • Philosopher1978

    Hi Martin – you are right that there is something significant here. I think you can phrase it in something like the following way: we take openly available data and process it in a way that makes new insights possible. Arguably these are a kind of Frankenstein data set – created from other bits of data but not reducible to them. At present the tendency is to think only about the ethics of data release at the time of original dissemination, but I can foresee a future when you will also have to think about the ethical implications of mashups, etc. in the same way, and possibly even have to get panel approval for them. It’s the aggregation of lots of small pieces of (relatively innocuous) data that’s the issue. But it cuts both ways – it can be argued that the potential benefits of big data mean we have an obligation to share widely, but at the same time the privacy of the individual/group can be compromised by the patterns that are revealed. The NHS Care Data issue is a good example of this.
    Luciano Floridi spoke about this issue at the BERA ethics workshop earlier this month. You can see my notes from the session (see the section entitled ‘Big Data, Small Patterns, and Huge Ethical Issues’).
