Part of my job at Skype is running the Skype Forums. I recently completed an interesting project: switching the forums to a new platform and completely remaking the sign-in system. It's one of the most complex things I've ever done in this area, and also one of my longest projects. I thought I'd document the experience, both for my own future use and for anyone else who's interested.
From Skype's launch in 2003, we ran a phpBB-based forum for interacting with early users and gathering more general feedback. The forums have been a very useful channel for Skype and continue to be so, so we needed to keep investing in this area to keep them useful.
phpBB worked fine for a few years, but already in early 2005 it became obvious that we weren't too happy with it. Here's why...
- security. We looked at phpBB's code and there's a highly technical term for it: "spaghetti". Over the past few years, we had a lot of annoying incidents and some really bad ones with phpBB, forcing us to waste a lot of time patching, upgrading and locking down, time which could otherwise have been spent doing something more useful.
- features. phpBB 2 lacked many features that became more and more needed, such as subforums and remembering unread topics across sessions. This is the main reason for the forums currently having a pretty "flat" and confusing structure. All this has arguably been fixed in phpBB 3, but you could say that for us it came as "too little, too late".
- spam. This is somewhat related to security. As the Skype Forums gained popularity, they were increasingly targeted by spambots, and our mod team had to spend a lot of time killing and blocking spammers. Again, there are tools to fight this in phpBB, but why bother when you can have a platform that does it all for you? (In phpBB's defence, most of the spam has stopped by now, not necessarily because we switched platforms but because we made the sign-ins Skype Name-based, which could also have been done with phpBB. But we had already made up our minds about the new platform, so there.)
Jean recently asked what's up with the forum users. Without digging into details, the peaks on the indicated days were simply “spambot attacks”, and some of those bots may or may not have posted. It could also be that they created a large number of users for possible future use.
Objectives and new platform selection
We had the following objectives for the “forum remake”.
- more security. Stop the security and spam problems for good.
- features. More features that would help us run the forum and keep the forum users happy, such as subforums and maintaining "unread" state across sessions.
- Skype Name sign-ins. To reduce maintenance overhead and make the picture clearer to users, base the forum sign-ins on Skype Name, as we do with everything else we put out on skype.com.
- structure remake. I actually postponed this because the rule of thumb of grand remakes is "only do one thing at a time". We were already doing two, switching to a new platform AND switching to Skype Name, and the structure remake would have added even more complexity. Since the structure remake (organizing things in subforums in a more sensible way) wasn't a dependency of, or connected to, the other objectives, it was quite easy to postpone.
So we set out scouting for a new platform somewhere in late 2005. We had the following criteria.
- good security track record. No software is perfectly secure, so rather than zero information about security, which would have suggested the vendor was hiding things, we wanted to see a "track record": the product could have had some security bugs, as long as they were properly disclosed in public and followed up with a patch.
- nice and hackable code. We didn't like phpBB's code and were looking for something that was actually nice. Cleanly done, yet hackable enough for us to be able to plug Skype Name sign-ins into it.
- commercial support. Sometimes you just want a support commitment from the vendor. I'm not sure what the support options are with phpBB. I'm sure there are companies you can pay to take care of things for you, but we wanted the actual vendor to take responsibility for this, and I'm not sure who the "vendor" for phpBB even is.
- good converter from phpBB. With thousands of threads and users, there was simply no way we could have done the migration manually. We needed a converter that we could test and know that it actually works.
- a recognized product. We wanted something that's already established on the market and that our team is comfortable with, preferably something that they've used before.
We ended up picking Invision Power Board. The main reason was quite simply that our mod team had worked with it. I also found that other companies in our space such as Six Apart are using it in their forums, which was good to know.
Identity switch and re-claim
The most interesting question for me was what we should do with the old identities and content once we switched over to Skype Names. Previously, we used the standard phpBB account system, so how would we switch over to Skype Names?
One option I considered for a second was to start from scratch in the new forums, and either maintain the old forums in a read-only state or nuke them completely. But this wasn't really smart, because we would still have had to maintain two pieces of software, and working across two forums is really awkward. Content nuking would have meant effectively destroying three years of knowledge and passion of thousands of people.
So I quickly abandoned this idea with horror and was left with the realization that I needed to come up with some sort of mechanism to migrate over old content and users. Soon I figured out that even though it's a mechanism, it doesn't need to be automatic. There simply was no automatic way to make the mapping from forum names to Skype Names: many people have different forum and Skype Names, and even though some people have specified a Skype Name in their forum profile, it's not validated in any way, so I could just as well put anyone else's Skype Name in my profile.
I came up with what you see now on the Skype Forums and documented in the announcement under the name of "identity re-claim". I migrated over all the old content but made the old accounts simply defunct and non-working. And all Skype Name sign-ins effectively create a new forum account. So if you were active also in the old forums, you end up with two accounts, one of them the defunct old forum account which you could see but not access, and the other your fresh new Skype Name-based one.
And here's the key part. If you care enough to have your old identity back, you simply ask for it. It works like magic and, in complex error-prone situations, a lot better than any automated system. And this is the only real "cost" of the switch for the forum user.
There were some quirks which we resolved during the training and testing phase. IPB has a requirement that all sign-in names (account names), display names and e-mail addresses have to be unique. Which is good. But it meant that we could end up with clashes between the old and new accounts. I invalidated all the old logins and emails by adding a prefix to them — a long arbitrary one, since the sign-in names and emails are not displayed publicly. Display names were a bit harder, since they are actually displayed, and a lot of people might want to use the same display name as before. I ended up simply adding an underscore (_) to them. So when you see names ending with underscore on Skype Forums, they are most likely the old migrated forum accounts that their owner has not returned to re-claim.
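The invalidation step described above can be sketched roughly as follows. This is an illustration rather than the code that actually ran: the `Account` type, the `LEGACY_PREFIX` value and the function names are all made up for the example, and the real prefix was simply some long arbitrary string.

```python
from dataclasses import dataclass, replace

# Hypothetical prefix; the text only says it was long and arbitrary.
LEGACY_PREFIX = "legacy-7f3a9c-"


@dataclass(frozen=True)
class Account:
    login: str         # sign-in name, never shown publicly
    email: str         # never shown publicly
    display_name: str  # shown next to every post


def invalidate(old: Account) -> Account:
    """Return a copy of a migrated account whose unique fields can no
    longer clash with a fresh Skype Name-based account."""
    return replace(
        old,
        login=LEGACY_PREFIX + old.login,
        email=LEGACY_PREFIX + old.email,
        display_name=old.display_name + "_",
    )


def looks_unclaimed(account: Account) -> bool:
    """Heuristic from the text: a trailing underscore most likely
    marks an old account that nobody has re-claimed yet."""
    return account.display_name.endswith("_")
```

Because the login and e-mail are never displayed, an ugly prefix costs nothing there, while the display name gets the smallest visible change that still keeps it unique.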
Training and go-live
I spent a good part of 2006 figuring out IPB and testing the migration process. Spending many months on it sounds overly long, but it was held up at times by internal issues not relevant to discuss here, and I was dealing with it as sort of a side project. We tested the migration and re-claim and everything else there could possibly be. Got questions answered. Played around with layout.
Among other things, we got held up by database encoding trouble. We've forever been using UTF8 as the presentation layer, but for some reason, some of the databases internally still used Latin1 (ISO-8859-1). And when you try to move data from a Latin1-encoded database to a UTF8-encoded one, all hell breaks loose. So watch those encodings.
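To show the kind of trouble this causes, here is a minimal sketch of one common failure mode: UTF-8 bytes stored in a Latin1 column and then re-encoded during migration. I'm assuming this pattern for illustration; it is not necessarily exactly what went wrong in our databases, and the sample name and the `repair` helper are made up.

```python
text = "Jürgen"

# What the presentation layer sends: UTF-8 bytes.
utf8_bytes = text.encode("utf-8")

# A Latin1 column treats every byte as one character.
stored_as_latin1 = utf8_bytes.decode("latin-1")

# A naive migration re-encodes the already misread text as UTF-8,
# so the new database faithfully preserves the garbage.
garbled = stored_as_latin1.encode("utf-8").decode("utf-8")


def repair(s: str) -> str:
    """Undo one round of UTF-8-read-as-Latin1 misinterpretation."""
    return s.encode("latin-1").decode("utf-8")


print(garbled)          # the two-character mess in place of the umlaut
print(repair(garbled))  # the original name again
```

The repair only works if every row was damaged the same way, which is why it pays to test the conversion on real data before the actual switch.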
Finally, everything was ready for "zero hour", or the actual switch. We announced the downtime a few days before. Here was the checklist for the day…
- disable old forum
- run phpBB->IPB converter (it automatically clears old data from tables, so all test junk got nuked)
- run the code to invalidate old accounts
- switch the virtual hosts around so that new forum starts responding on forum.skype.com
- confirm all looks good
- reopen the new forum and confirm again that everything works
- normal operations resume. Hurrah. Start processing the reclaims.
I'm quite happy with how the actual switch went. All the training and testing paid off and the actual downtime was just around three hours.
Follow-up and actual identity reclaims and validation
We weren't really sure what would happen with the reclaims. Would we be overwhelmed? Would we get any requests at all? Would anyone understand what's going on? Did it actually work in a live setting? How about validating the identities?
It all has worked out fine, even though my Private Message inbox looked like this for the first few days…
I came up with three identifying criteria: IP address, e-mail registered with the old forum account, or old forum password. At least one of them must be supplied, together with the old forum name and Skype Name, for me to make the identity re-claim and migrate the content. It looks like it worked, because all of these are non-public data that the public cannot retrieve from the old forum. And most people use the same account name in Skype and the forum, so in these cases it's really simple.
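The check itself was done by hand, but the rule can be written down as a small sketch. Everything here is hypothetical: the `OldAccount` and `ReclaimRequest` types, the field names, and the placeholder `hash_password` function (the old forum's real password scheme doesn't matter for the point).

```python
import hashlib
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


def hash_password(password: str) -> str:
    # Placeholder hash for illustration only, not the old forum's scheme.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class OldAccount:
    forum_name: str
    email: str
    password_hash: str
    known_ips: FrozenSet[str]


@dataclass(frozen=True)
class ReclaimRequest:
    forum_name: str
    skype_name: str
    ip: Optional[str] = None
    email: Optional[str] = None
    password: Optional[str] = None


def proofs_matched(req: ReclaimRequest, acct: OldAccount) -> List[str]:
    """List which of the three non-public criteria the request satisfies."""
    matched = []
    if req.ip is not None and req.ip in acct.known_ips:
        matched.append("ip")
    if req.email is not None and req.email.strip().lower() == acct.email.lower():
        matched.append("email")
    if req.password is not None and hash_password(req.password) == acct.password_hash:
        matched.append("password")
    return matched


def may_reclaim(req: ReclaimRequest, acct: OldAccount) -> bool:
    """Old forum name and Skype Name are mandatory; one proof is enough."""
    return (
        req.forum_name == acct.forum_name
        and bool(req.skype_name)
        and len(proofs_matched(req, acct)) >= 1
    )
```

One matching proof is enough because none of the three can be looked up by an outsider browsing the old forum.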
One thing I'm happy about is that when people present their old forum passwords, they are actually what I would consider "secure" :) I haven't seen any that are as strong as "gaBqYBZ4" (random sequences of characters and digits), but none are as bad as "banana". They are either words with digits or multiple words concatenated in both upper- and lowercase, say "youareabanana44" or "M4ryh4d4l1ttlel4mb" (not anyone's actual passwords, I just made these up following the pattern). So based on this sample we actually had pretty good forum passwords, and hopefully their Skype passwords are even better :) Then again, it could also be that only those people who had a good password sent it ;)
The actual identity switch, once you've validated the identity, is as simple as running a few SQL statements in IPB's SQL toolbox. Here's what I got... I hope it covers all the content. (Let me know if I missed something.) Both old_id and new_id are the numeric IDs in IPB's user database.
update ibf_topics set starter_id=<new_id> where starter_id=<old_id>;
update ibf_topics set last_poster_id=<new_id> where last_poster_id=<old_id>;
update ibf_posts set author_id=<new_id> where author_id=<old_id>;
update ibf_attachments set attach_member_id=<new_id> where attach_member_id=<old_id>;
update ibf_message_text set msg_author_id=<new_id> where msg_author_id=<old_id>;
update ibf_message_topics set mt_from_id=<new_id> where mt_from_id=<old_id>;
update ibf_message_topics set mt_to_id=<new_id> where mt_to_id=<old_id>;
update ibf_message_topics set mt_owner_id=<new_id> where mt_owner_id=<old_id>;
update ibf_polls set starter_id=<new_id> where starter_id=<old_id>;
update ibf_voters set member_id=<new_id> where member_id=<old_id>;
select joined from ibf_members where id=<old_id>;
update ibf_members set joined=<result of last query> where id=<new_id>;
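If you do this often, the same statements could be wrapped in a small script. The following is only a sketch under assumptions: it uses Python's built-in `sqlite3` module as a stand-in for the real forum database, and the `reclaim` function name is made up. The table and column names are the ones from the statements above.

```python
import sqlite3

# (table, column) pairs taken from the statements above.
OWNERSHIP_COLUMNS = [
    ("ibf_topics", "starter_id"),
    ("ibf_topics", "last_poster_id"),
    ("ibf_posts", "author_id"),
    ("ibf_attachments", "attach_member_id"),
    ("ibf_message_text", "msg_author_id"),
    ("ibf_message_topics", "mt_from_id"),
    ("ibf_message_topics", "mt_to_id"),
    ("ibf_message_topics", "mt_owner_id"),
    ("ibf_polls", "starter_id"),
    ("ibf_voters", "member_id"),
]


def reclaim(conn: sqlite3.Connection, old_id: int, new_id: int) -> None:
    """Move all content from old_id to new_id and copy the join date.

    Runs as one transaction: either every statement applies or none does.
    """
    with conn:  # commits on success, rolls back if anything raises
        row = conn.execute(
            "SELECT joined FROM ibf_members WHERE id = ?", (old_id,)
        ).fetchone()
        if row is None:
            raise ValueError(f"no member with id {old_id}")
        for table, column in OWNERSHIP_COLUMNS:
            # Identifiers come from the fixed list above, never from user
            # input; the IDs themselves are passed as bound parameters.
            conn.execute(
                f"UPDATE {table} SET {column} = ? WHERE {column} = ?",
                (new_id, old_id),
            )
        conn.execute(
            "UPDATE ibf_members SET joined = ? WHERE id = ?", (row[0], new_id)
        )
```

Running everything in one transaction means a typo in an ID or a missing table leaves the data untouched instead of half-migrated.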
One thing that initially happened was that the new forum performed worse than the old one. This was a bit confusing, since the server stats showed that it actually gave much less trouble to the server and load dropped considerably. With phpBB, sometimes at peak hours it would simply overload the machine and the forum became unresponsive, or "crashed". With IPB, it doesn't happen, but the loading speed was still slow.
IPB has a good SQL debugging mode that we used, only to find out that the SQL queries are just fine and take up hardly any server time. The SQL time was around 0.05 seconds per page, while the perceived page load time was a second or more. So it had to be something in what we call the "web frontend": things that happen after the server spits out the page and it travels through the caches and proxies to your browser, which renders it and requests any additional objects.
Here are the comparative HTTP headers as issued by the "old" and "new" location. I'm not sure which component is critical here. Perhaps it's the Last-Modified header, which keeps the script from being re-loaded on every page load, or the keep-alive connection, which avoids creating a new HTTP connection for every .js file (there are several .js files included per page).
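For this kind of side-by-side comparison, a tiny helper that lists only the headers that differ can save some squinting. This is a generic sketch, not the tool I used; the `diff_headers` name is made up, and you feed it whatever headers you captured from each location.

```python
from typing import Dict, Optional, Tuple


def diff_headers(
    old: Dict[str, str], new: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {header: (old_value, new_value)} for headers that differ.

    Header names are compared case-insensitively, since HTTP header
    names are not case-sensitive. A missing header shows up as None.
    """
    old_l = {k.lower(): v for k, v in old.items()}
    new_l = {k.lower(): v for k, v in new.items()}
    return {
        name: (old_l.get(name), new_l.get(name))
        for name in sorted(old_l.keys() | new_l.keys())
        if old_l.get(name) != new_l.get(name)
    }
```

Headers that show up on only one side, such as a Last-Modified that one location sends and the other doesn't, are the first candidates to look at.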
There may be further speed improvements that we can do, but we've done the most critical part and the forums are again usable now.
The forums are not done yet and work continues. There are many comments in the new "Using the Skype Forums" subforum that we created specifically for this purpose, mainly around sign-ins (maintaining signed-in state, signing in as invisible). We'll get to all these. Feel free to post more comments or questions there. There's also a strange bug where you sometimes can't seem to maintain more than one signed-in session from each IP.
One major thing we will do is to re-skin these forums into something more fun and interesting. This will not be a big switch technically and will not change anything in content or signing in, but hopefully it will make the forums "look more like Skype". Currently we're on a pretty standard skin. So stay tuned for that.