Let’s talk about Instapaper and their GDPR notice that has now been up for two months.
Yeah. It’s silly.
I do find value in a “read later” type of service, though admittedly in my case, “later” these days means months, and even several years is not uncommon. I’ve been using Instapaper for many years, starting when Marco Arment was running it as a paid service. But it’s had too many ownership changes lately, and I’m not sure what their business model is.
So when this GDPR notice hit, I finally decided to look for an alternative. There are many available, but I decided to give Pinboard a try.
It’s a bit of an eccentric service, sure. There’s no buzz or hype around it. Its Twitter feed is full of politics that its main guy Maciej is pursuing lately. I’ve never gotten a response to a single support email. So from some perspective, it’s Internet abandonware. At the same time, it has a clear business model, it’s up and running and you could say feature-complete, and seemingly has been stable for years. Stability for years and lack of hype/hysteria is kind of what you want from a bookmark-hosting service.
So, I signed up for an account. I contacted Instapaper support, who kindly gave me a dump of my bookmarks as CSV and HTML. Could I just import that into Pinboard?
Pinboard does have an import feature, but it didn’t accept Instapaper’s CSV or HTML. There’s no documentation about what the import feature expects.
It does have an API though, so I decided to throw together a quick script in Python to just consume the HTML and pipe it to Pinboard through the API. Python is such a great compact language for doing this kind of thing. I also considered Swift and Node for a second, but I haven’t written Python in a while and just felt like doing it.
Here’s the script.
A couple of things to note about it.
- I did run into this issue and had to edit the Pinboard library a bit, as described in the bug.
- When you run into an error like “required properties of bookmarks (title) missing”, pinboard.py just throws an error and aborts. What I do at that point is manually edit the CSV, remove all the rows that were already imported successfully, fix the row in question, and re-run.
- Pinboard apparently doesn’t like multi-word folder names and imports them as multiple tags. I had few enough of those that I could just fix them manually.
- Date and time from Instapaper is not supported, since it wasn’t included in the CSV.
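
To make the notes above a bit more concrete, here is a minimal sketch of the conversion step. It is not the script linked above: it skips pinboard.py and only builds request URLs for Pinboard's `posts/add` endpoint using the standard library. The CSV column names (`URL`, `Title`, `Folder`) are my assumption about the Instapaper export, and the helper names are made up for illustration. It joins multi-word folder names with hyphens so they stay a single tag, and it sets rows without a URL or title aside instead of aborting.

```python
import csv
import io
import urllib.parse

# Pinboard's "add a bookmark" endpoint.
PINBOARD_ADD = "https://api.pinboard.in/v1/posts/add"


def folder_to_tag(folder):
    """Join a multi-word folder name with hyphens so it stays one tag."""
    return "-".join(folder.split())


def row_to_params(row):
    """Map one export row to posts/add parameters, or None if unusable.

    The column names URL, Title and Folder are assumptions about the export.
    """
    url = (row.get("URL") or "").strip()
    title = (row.get("Title") or "").strip()
    if not url or not title:
        return None  # Pinboard requires both a URL and a title
    params = {"url": url, "description": title, "toread": "yes"}
    tag = folder_to_tag(row.get("Folder") or "")
    if tag:
        params["tags"] = tag
    return params


def convert(csv_text):
    """Split rows into importable parameter dicts and rows to fix by hand."""
    ready, skipped = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        params = row_to_params(row)
        if params is None:
            skipped.append(row)
        else:
            ready.append(params)
    return ready, skipped


def build_request_url(params, auth_token):
    """Build the GET URL for one bookmark; auth_token looks like 'user:HEX'."""
    query = dict(params, auth_token=auth_token, format="json")
    return PINBOARD_ADD + "?" + urllib.parse.urlencode(query)
```

Actually sending the bookmarks would mean fetching each URL from `build_request_url` (for example with `urllib.request.urlopen`) and pausing between calls, since Pinboard asks API clients to space out their requests. Collecting the bad rows in `skipped` means one missing title no longer forces a fix-and-re-run cycle over the whole file.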