Progressive summarization of audio & video to retain more of what you hear in podcasts & watch in online lectures.

I read a lot, and in a lot of different places. Sometimes I’m just reading for fun, but when I’m reading something that I want to remember and be able to share with others or apply in my own life, I have found annotation and progressive summarization to be effective approaches. These approaches generally require text, but with the addition of a few services that mostly play nicely together, you can extend this approach to audio and video.


What You’ll Need

Accounts at Otter, Readwise, Hypothesis, and Roam.
The Hypothesis toolbar in your browser of choice (I like Firefox).

The Process

Let’s say you’re watching a lecture. Instead of trying to scribble notes in a notebook that you’ll later have to transcribe into Roam, you open Otter and let it start creating a text transcript. Take pictures or screenshots as you go, because Otter will be able to place those in the transcript according to timestamp. When it comes time to review, you use Hypothesis to annotate the Otter transcript, then you write some notes in Roam summarizing the insights from the lecture. If you’ve connected your Hypothesis account to Readwise, your highlights will be occasionally re-surfaced for you to review, which is a key step in making them actionable. There’s also a way to get spaced repetition in Roam.

The Setup

At Readwise, you have a bunch of options for connecting highlights. Enable Hypothesis and it will pull in all your highlights from Hypothesis, including the ones you’ve made on the Otter transcripts. You’ll find them under the Your Articles section at Readwise. You can review there and write up summaries in Roam, linking to other concepts and notes.

Why It Works

It works for me because I use Readwise as a sort of catch-all bucket for all the stuff I already read in so many places – Kindle, Twitter, and all the stuff I find via Twitter and shove into Pocket – and now I can also use Otter to convert things I listen to or watch into a form that Readwise can catch & periodically re-surface for me.


Caveats

  1. When you select ‘view in article’ at Readwise, it will take you to Hypothesis, not Otter. Otter can generate sharing links that you can annotate, or you can export the transcript and annotate it somewhere else that’s publicly accessible. Exporting is probably the best course, because then you have your own backup.
  2. Making extensive use of all these services costs a little money. Readwise is a couple bucks a month, and Otter costs a little bit if you go over their free minutes. Roam likewise has a subscription plan. I personally believe that if you are going to invest a lot of time and effort into building a personal knowledge management system, you’re going to want that system to stick around and get better, so you should hope these services charge enough to do so. Still, I know even a couple bucks a month can be hard to come up with on a grad student budget, so here are some options. Most YouTube videos have a transcript generated by Google, which may be of higher quality and won’t use up your Otter minutes. Also, Docdrop is a service from the founder of Hypothesis that facilitates annotation of all sorts of document types and can accept YouTube links.
  3. These services are all relatively new. There is a possibility that they go under or get bought by a company with a different privacy policy. Carefully inspect the privacy policies of all the services you use, consider not using services that don’t let you delete or get your content out easily (Evernote, for example), and keep your own backups. I will note that a service getting acquired is not necessarily a bad thing. My company, Mendeley, was acquired by Elsevier 7-ish years ago and it’s still going strong. Also, services that charge money tend not to be as intrusive to your privacy.
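
If you want to act on the "keep your own backups" advice for your highlights, here is a rough sketch in Python that reads annotations from the Hypothesis search API. It assumes the public endpoint at https://api.hypothes.is/api/search, the acct:username@hypothes.is account format, and the TextQuoteSelector field that holds the highlighted text. The function names are my own, and the username and token you pass in are placeholders, so check the current Hypothesis API documentation before relying on it.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the public Hypothesis search API.
API_SEARCH = "https://api.hypothes.is/api/search"


def build_search_url(username, uri=None, limit=200):
    """Build a search URL for one user's annotations, optionally for one page."""
    params = {"user": f"acct:{username}@hypothes.is", "limit": limit}
    if uri:
        params["uri"] = uri
    return API_SEARCH + "?" + urllib.parse.urlencode(params)


def extract_highlights(payload):
    """Pull the page address, highlighted quote, and note out of a parsed response."""
    highlights = []
    for row in payload.get("rows", []):
        quote = ""
        for target in row.get("target", []):
            for selector in target.get("selector", []):
                # The highlighted passage is stored in the text-quote selector.
                if selector.get("type") == "TextQuoteSelector":
                    quote = selector.get("exact", "")
        highlights.append(
            {"uri": row.get("uri", ""), "quote": quote, "note": row.get("text", "")}
        )
    return highlights


def fetch_highlights(username, token=None, uri=None):
    """Download and parse annotations. A token is only needed for private ones."""
    request = urllib.request.Request(build_search_url(username, uri))
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return extract_highlights(json.load(response))
```

Calling fetch_highlights("your-username") and writing the result out with json.dump would give you a local copy that doesn't depend on any of the services staying around.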

11. September 2020 by Mr. Gunn
Categories: knowledge management | 2 comments

How to create a Twitter list using the command line Twitter client, t, on Windows.

Hashtags are nice, but I wanted to be able to dynamically create and delete Twitter lists as well & I found the t Ruby client which allowed me to do this. Here’s how to get it going from a blank slate.

First, install Ruby with RubyInstaller. You’ll also need the DevKit.
Install the Ruby gem with gem install t. Then, register an application with Twitter. Next, authenticate the client with t authorize, which should take you to Twitter and ask whether you want to let the app you created access your account. You might have some issues with silly stuff getting the authentication to work, like outdated keys or whatever.

Once you get to where you can tweet from your account, you’re ready to go. Create a new list with t list create [name of list]. The neat thing about having a command line client is that all the pipes and redirects and stuff work, so you can do cat listofhandles.txt | xargs t list add [nameoflist]. I think you have to have Cygwin installed for xargs and cat to work, but I guess you could just do a batch file with a loop if you didn’t care about looking cool.
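
If you don't have Cygwin, the loop version doesn't have to be a batch file. Here's a minimal sketch in Python that reads one handle per line and runs t list add once per handle. The function names and the dry_run switch are my own inventions, and I'm assuming t is on your PATH. On Windows you may need to pass shell=True or the full path to t.bat for subprocess to find it.

```python
import subprocess


def read_handles(lines):
    """Return the non-empty handles from an iterable of lines."""
    handles = []
    for line in lines:
        handle = line.strip()
        if handle:
            handles.append(handle)
    return handles


def build_commands(list_name, handles):
    """Build one `t list add` command per handle, like a loop in a batch file."""
    return [["t", "list", "add", list_name, handle] for handle in handles]


def add_handles(list_name, path, dry_run=True):
    """Read handles from a text file and add each one to the list."""
    with open(path, encoding="utf-8") as handle_file:
        handles = read_handles(handle_file)
    for command in build_commands(list_name, handles):
        if dry_run:
            # Print the command instead of running it, so you can check it first.
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
```

Run add_handles("nameoflist", "listofhandles.txt") first to see what would be executed, then again with dry_run=False to actually do it.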

03. September 2014 by Mr. Gunn
Categories: Uncategorized

Cities make people unhappy

Glaeser, Gottlieb, and Ziv have an economics working paper at NBER (not Open Access, unfortunately) in which they report their assessment of the happiness of various cities across the US. Vox has a nice map which makes the point pretty well, but I had to grab the data and take a look for myself.

I got population data from the US Census website and, happily, the happiness study used the same metropolitan statistical area names as the census does, so it was a simple merge to visualize population vs. happiness.
[Figure: adjusted happiness vs. population]
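
For anyone who wants to try the same kind of merge, here's a small sketch in Python of joining two tables on the metropolitan statistical area name. The area names and numbers are made-up placeholders, not values from the paper or the Census, and merge_by_msa is just a name I picked for illustration.

```python
def merge_by_msa(happiness, population):
    """Join two {area name: value} mappings on the area name."""
    merged = []
    unmatched = []
    for msa, score in happiness.items():
        if msa in population:
            merged.append((msa, population[msa], score))
        else:
            # Keep track of names that don't line up between the two sources.
            unmatched.append(msa)
    return merged, unmatched


# Placeholder values only, NOT figures from the paper or the Census.
happiness = {"Metro A": 0.10, "Metro B": -0.05, "Metro C": 0.02}
population = {"Metro A": 250_000, "Metro B": 4_000_000}

merged, unmatched = merge_by_msa(happiness, population)
for msa, people, score in sorted(merged, key=lambda row: row[1]):
    print(f"{msa}\t{people}\t{score:+.2f}")
print("no population match:", unmatched)
```

The unmatched list is the part worth checking: the merge is only simple when both sources really do use the same area names.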

05. August 2014 by Mr. Gunn
Categories: Uncategorized | 1 comment

I’ve come a long way.

When I started graduate school, the only thing I knew about publishing was how to write a blog post, and the only thing I knew about my library was that I hated their website. I didn’t know what open access was, and if it wasn’t in PubMed, it pretty much didn’t exist for me. All I wanted to do was do my research and work in the lab.

Back in 2004, I started work on my first paper and was exposed to the academic publishing process for the first time. As someone who was already familiar with blogging, I found that the whole process made no sense. If I wanted to cite a fellow blogger, I could just link to their post with a short little <a href="">. I could anchor my link to a bit of text in my post, and they’d even get notified that I had linked to them. Likewise, I could subscribe to the RSS feed of their blog and get updates whenever they published. It was easy to see who was reading your stuff because Google Analytics was free (and even before that, there were plenty of log parsers). Why, then, would a group of people, among the smartest in the world, communicating potentially life-saving or economy-altering information, use a system that was so inferior to that which people used to post pictures of what they had for dinner?

Well, I was the only person in my lab, possibly the whole med school, who blogged, so no one understood what I was complaining about. I eventually found some colleagues online who felt similarly and we’d talk about why academic paper search sucked so bad, why reference management sucked so bad, and occasionally someone would build a new tool, which no one would use. The failure was usually chalked up to the developers not having access to enough data and, if the tool required a critical mass of users, it was considered dead because academics had no incentive to take time away from research or writing papers to use it.

So to me, the reason I had to use wonky, clunky, ugly tools and endure a long, tedious process to get published came down to these two things: lack of open data and impact factor chasing. As I dug into this, which helped me to procrastinate on writing my qualifying exam, I learned that the lack of open data was primarily because academic publishing was mostly a for-profit endeavor and the entrenched interests had no desire to loosen their grip on their data. This was in the thick of the music industry’s “sue ’em all, let God sort ’em out” business strategy and newspapers were just starting to get worried, so the idea that you could do better providing a service instead of selling a product wasn’t really on people’s minds that much. Likewise, publishers didn’t do much to discourage impact factor chasing by scientists and there was literally nothing being done on measures of impact beyond citations. It seemed clear to me, then, that if I wanted to be freed from my drudgery, what I needed to do was to get more open metadata and try to establish something that could free research from the tyranny of the impact factor. I plodded along for a few more years, publishing a few more papers and supporting every new tool that arose which I thought had a reasonable chance of success, provided it could result in more open metadata, open access, and impact metrics. By the time I was done with my PhD, I knew that an academic career wasn’t in the cards for me.

I joined a biotech startup in San Diego in early 2008, but later that year the company fell on hard times, along with the rest of the country. By early 2009 my time with the company was nearing an end, but I had still been following what was happening online and had begun to advocate for a new startup that seemed like it had a better idea than the boring old “social-network-for-scientists” clones that were popping up everywhere. As my involvement with the biotech tapered off, I was able to increase my involvement with Mendeley, eventually becoming part of the full-time US staff.

When I began to work for Mendeley, I was quite definitely aware of the possibility that they would get bought at some point. Nonetheless, I was excited to be able to play a role in helping them become a success. At the time my thought process was pretty simple: they were a non-ugly version of Endnote which also happened to be building a collection of research metadata that they could make available under an open license, and they could provide a measure of impact that was distinct from citations. Freedom at last!

Now, four years later, there’s a lot I have to be pleased about when I look back. I presented one of the few for-profit business use cases for open access (PDF) to the US Office of Science and Technology Policy. We have ~90M documents available via an API with a permissive CC-BY license. We’re one of the leading contributors of data to the growing #altmetrics movement.

Now my career is entering another phase. I’m going to leave all the “we’re so excited” stuff for the official announcement, but I think Mendeley has gotten to a size where it’s no longer a startup, and smart people are predicting open access will be a reality soon. As Victor notes, we could have carried on, but it would have taken longer for us to get to where we needed to be and there’s no guarantee we would have made it. Springer + Papers or Nature + Readcube could put more marketing muscle behind their apps and neither of them has as open a philosophy as we do. What about Zotero? I think if Zotero were going to change things, they would already have done so, but maybe they could team up with the Digital Public Library of America or Center for Open Science.

I do think there’s a possibility that we could do some good as part of Elsevier. Having talked with tons of people, from the CEO of Elsevier on down, I am now convinced that they want to be a part of the changes, instead of trying to fight them off like the recording companies did. There are and will be a couple competing narratives: They bought us to bury us, we got paid tons of money so we said, “Fuck Open Access”, etc. This is going to be put in the context of Google Reader shutting down, Delicious “sunsetting”, etc. However, I’m not personally getting a pile of money from all this, and I never would have stayed unless I was convinced that they legitimately want to be part of the change to an open access publishing system.

So I’ll be staying with Mendeley. I have been told that my day to day job will remain the same, and that my voice is valued. I trust my friends to keep me honest and to call out bullshit when they see it. I’m grateful to have had this opportunity, over the past 9 years, to not only be a voice for a better way of doing and communicating research, but to be a pair of hands. I’ll learn everything I can about working within Elsevier and, after a couple years, if we don’t finally end up with freely available academic paper metadata and more Google Analytics-like research impact information, it won’t be because I didn’t try my best. That’s my promise and I expect – need – anyone who’s reading this to hold me to it.


09. April 2013 by Mr. Gunn
Categories: science | 2 comments

How to root & install a custom ROM on the AT&T Nexus S running Android OS 2.3.4 (Windows or Ubuntu 11.04)

I grabbed one of the free Nexus S phones that Best Buy was giving away. With my last phone, I was too far into using it before I thought about rooting, so I wanted to be sure to start this one off right, and I literally rooted it before I even put the SIM card in. This was my first time rooting, so I still had to synthesize the method from a bunch of sources, filter out the sketchy sounding “download my super cool ROM from my .ru server and you’ll get 8 times the battery life” posts, and fill in the gaps with educated guesses. The actual process itself is really simple, looking back at it. This basic process will probably work for most Android phones if you get the right recovery image and ROM for your phone.

11. August 2011 by Mr. Gunn
Categories: Uncategorized | 1 comment

Criticize tag clouds if you must, but this does give you a good summary of my research at a glance

[Image: Wordle: William Gunn's Research]

This is kinda cool, too: I added the biggest 10 words from the wordle as page tags, and the Mendeley “Related Research” plugin in the left sidebar pulled in the major papers which influenced my work. This isn’t an entirely unexpected result, but kinda cool when things work like they’re supposed to.

17. February 2011 by Mr. Gunn
Categories: Uncategorized

Real innovation in scientific publishing

Many attempts have been made to re-imagine a scientific article, but just adding semantic markup or visualizing the document in a different way has never quite felt right. Previous efforts have felt like they’re just trying to prop up a print idiom whose usefulness is limited in the new medium of the web. Cameron Neylon has come up with a re-imagining that’s truly useful and truly innovative. The idea is to break down a publication into its component parts, so that the smallest unit of publication is no longer a document. This allows publication to move beyond the limitations of the print era and enables the info overload management practices that work best online to be applied to research output.

For me, a paper is an aggregation of objects. It contains, text, divided up into sections, often with references to other pieces of work. Some of these references are internal, to figures and tables, which are representations of data in some form or another. The paper world of journals has led us to think about these as images but a much better mental model for figures on the web is of an embedded object, perhaps a visualisation from a service like Many Eyes, Swivel, and Tableau Public. Why is this better? It is better because it maps more effectively onto what we want to do with the figure. We want to use it to absorb the data it represents, and to do this we might want to zoom, pan, re-colour, or re-draw the data. But we want to know if we do this that we are using the same underlying data, so the data needs a home, an address somewhere on the web, perhaps with the journal, or perhaps somewhere else entirely, that we can refer to with,

Source: Science in the Open » Blog Archive » The future of research communication is aggregation, May 2010

10. May 2010 by Mr. Gunn
Categories: Uncategorized | 2 comments
