Some more thoughts on metrics

Following on from my last post on the possible use of metrics to measure online digital reputation, here are some more thoughts.

Andy Powell took me to task in the comments, arguing eloquently that the metric is so obviously flawed that it is positively harmful. I've been pondering this, and here are some reflections, both for and against. As the methodology is the property of the consultants, I don't know exactly how the algorithm works, so I am guessing from the results.

I can see at least three possible problems with the methodology:

1) A deadly attractor – given the search term used ('distance learning'), the OU unsurprisingly dominates the space. I am guessing that close association with the OU then boosts other sites. So, for example, if I had this blog but was a Professor at, say, the University of Glamorgan, then my ranking wouldn't be as high.
2) An echo chamber effect – a lot of the sites tend to reference each other (me, Tony, Brian, Grainne, for example). You then get a positive reinforcement effect. They claim to have adjusted for this, but I'm not convinced.
3) Bias in initial setup – the blurb seems to suggest that they find the influential sites by searching and analysis, but there must be some priming. The list is heavily UK-centric for example. This may be a result of the term used (see below), but until we know how each search is initiated I think we have to suspect some initial influence depending on the starting parameters.
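
To make the echo chamber worry in point 2 a bit more concrete, here is a minimal sketch of one way such an adjustment could work. I don't know how the consultants actually do it, so everything here is an assumption: the site names are hypothetical, and the idea of counting a reciprocated link at half weight (the `reciprocal_weight` parameter) is purely illustrative.

```python
# Hypothetical link data: each pair is (linking_site, linked_site).
# "a" and "b" link to each other; "c" and "d" link to "a" one-way.
LINKS = [("a", "b"), ("b", "a"), ("c", "a"), ("d", "a")]


def inbound_scores(links, reciprocal_weight=0.5):
    """Sum inbound links per site, down-weighting links that are reciprocated.

    A link from X to Y counts as 1.0, unless Y also links back to X,
    in which case it counts as `reciprocal_weight` (an assumed value).
    """
    link_set = set(links)  # ignore duplicate links between the same pair
    scores = {}
    for source, target in link_set:
        is_reciprocal = (target, source) in link_set
        weight = reciprocal_weight if is_reciprocal else 1.0
        scores[target] = scores.get(target, 0.0) + weight
    return scores


raw = inbound_scores(LINKS, reciprocal_weight=1.0)  # no adjustment
adjusted = inbound_scores(LINKS)                    # mutual links discounted
```

With this toy data, site "a" drops from 3.0 to 2.5 and site "b" from 1.0 to 0.5 once mutual links are discounted. Whether a discount like this is enough to dampen a real echo chamber is exactly the sort of thing I'd want to see evidence for.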

Search term – as I mentioned in the original post, the search term is significant. 'Distance learning' is very niche – it's not 'online learning' or 'elearning'. Had it been, the list would have been very different. This (plus the things above) may account for some notable absences from the list – why no Stephen Downes, for instance, who we would surely think of as the hub par excellence?

Results may be revealing – Andy argues that the presence of Brian Kelly demonstrates that the list is nonsense, as Brian doesn't really blog about distance learning. But I think it may be telling us something interesting. It can't just be random, and why is Brian higher than Grainne, for example? It could be telling us that people who write about distance learning tend to reference Brian, even if he doesn't write about it directly. In this sense Brian does have 'influence'. It could also show us that people who are writing about distance learning online are writing more about IT than pedagogy. This in itself is revealing, isn't it?

It's not just popularity – popularity is a factor, and in some respects is a proxy for influence. But we all know that popularity can be gained for all sorts of unacademic things. So popularity counts only within a given context – it's not about the overall number of subscribers, say, but the number of links relating to a given term and its semantic cousins. So, if I were the expert in modern interpretation of Macbeth, say, then I would expect to have the largest number of links relating to this topic, even though the number of links would be small overall, because it is a niche subject. I wouldn't be a popular blogger relative to every other subject, but within this very specialised subject I would be 'popular' relative to other sites on that subject.
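
The difference between overall popularity and popularity within a context can be sketched in a few lines. This is only a toy illustration under my own assumptions: the sites, the link text and the term list are all made up, and matching terms against link text is a crude stand-in for whatever 'semantic cousins' analysis a real system would do.

```python
from collections import Counter

# Hypothetical data: each link is (linking_site, linked_site, link_text).
LINKS = [
    ("site_a", "macbeth_blog", "a modern interpretation of Macbeth"),
    ("site_b", "macbeth_blog", "Macbeth staging notes"),
    ("site_c", "celebrity_blog", "celebrity gossip"),
    ("site_d", "celebrity_blog", "celebrity photos"),
    ("site_e", "celebrity_blog", "more gossip"),
    ("site_f", "celebrity_blog", "Macbeth film review"),
]


def topical_link_counts(links, terms):
    """Count inbound links per site whose link text mentions any of the terms."""
    lowered = [term.lower() for term in terms]
    counts = Counter()
    for _source, target, text in links:
        if any(term in text.lower() for term in lowered):
            counts[target] += 1
    return counts


# Overall popularity: every inbound link counts.
overall = Counter(target for _source, target, _text in LINKS)

# Popularity within a context: only links about the topic count.
topical = topical_link_counts(LINKS, ["macbeth", "shakespeare"])
```

Here the celebrity blog has more links overall (four against two), but within the Macbeth context the niche blog leads with two relevant links against one.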

Gaming – any algorithm is subject to gaming, as is any system. Exams and the REF are subject to gaming, but the key is to put in enough checks to make gaming difficult, detectable, and ultimately not worth the effort. Any such algorithm would need to be sophisticated enough to avoid obvious gaming.

A metric would only be a partial solution – I think we'd probably always want an element of peer-review in any analysis, and wouldn't rely solely on an automatic measure. Rather, we could view the output of an algorithm as part of a portfolio of evidence an individual might present.

We've got to start somewhere – my take on this is that the output may have problems, but it's a start. We could potentially develop a system focused on higher education, which is more nuanced and sophisticated than this. By analysing existing methodologies and determining problems with them (such as the three I've listed above) we could develop a better approach. I hold out hope that we can get interesting results from data analysis that reveals something about online scholarly activity.

3 Comments

  1. In reading this I kept thinking about two key methods in which digital space is ranked/evaluated … SEO/SEM and how one can design their digital space to maximise SEO/SEM. However at the end of the day it is how much utility the user/audience sees in the digital space that will drive reach/traffic etc.
    Interesting problems, but let’s compare your list of problems in evaluating ‘academic digital connectedness’ to the current RAE/REF system.
    1) A deadly attractor: Isn’t this a good thing? Being a Professor of distance ed and at the leading institution for distance education … so you should get a higher ranking, especially in a contextual evaluation of relevance.
    2) An echo chamber effect: How does this differ from academics who all research the same field citing each other because their work is relevant and contributory, and who are then involved in the same peer-review networks for that area of research? So I don’t see a problem with cross-linking to relevant sites! I think relevance is key. If you are a leading professor of distance education and fellow scholars in your area are not linking to you, I’d ask why? Competitive rivalry perhaps, lack of perceived utility or perhaps a lack of awareness…??
    3) Bias in initial setup: I totally agree with you here that initial framing will influence the outputs. So the parameters for judgement sampling need to be made clear. Which parameters and why? However, this doesn’t much differ from how we evaluate journals for inclusion in the journal rankings list (e.g., distribution, geographic scope etc) or academics in the RAE. The sampling design just needs to be made very clear, as with any research.
    4) Search term: This activity of being narrow in scope doesn’t differ from how sub-panels in the RAE are established or journal lists are devised. Yes it is very siloed, which is why if you work in media and publish in IT the RAE sub-panel struggles to evaluate your contribution. So as you suggest, perhaps a list of search terms devised, ranked and agreed by a community (and not just the academic community) central to the field would be more robust.
    5) Popularity (or perhaps reach is a better term): Perhaps here we can discuss the evidence of strong-tie and weak-tie sources in the network. In the peer-review system citations denote the popularity of a published work (or sales of books). So perhaps number of ‘relevant’ outbound/inbound links (behavioural measures) and key word/name citation (usefulness measure) would both be important here to assess ‘reach’ of impact.
    I’ll have a think about other measures that we use to evaluate the marketing effectiveness of the digital space, however agree … that a metric is only a partial solution … but at the least, it should be considered for inclusion for the REF – for peer-review.
    Impact is not just about being an editor of a journal, giving a public talk or being recognised by the academic community as a leading scholar … increasingly academic institutions are being called on to show the social contribution of their research outputs and how they engage with their many public spaces … the digital space is a very important space!
    Ignoring academic activities in the digital space discourages connectedness with a wider social network within which academia is embedded! Be it with students, colleagues or society at large.
    Smiles
    Kelly
