Blog of The SJG

Wednesday, March 29, 2006

OpenBSD's financial situation

In typical form, Theo is once again ranting about for-profit corporations not living up to his personal ethical standards. Wait, wait, wait... WHAT? Right, exactly, let me spell it out for those of you who missed it. For-profit corporations typically do not, nor should they in general, operate under the guidance of some high and mighty code of ethics. That's just not what they do! If you expect them to, you are a lunatic; profit margins are everything. Granted, there are some that do, and kudos to them.

Theo, your pet OpenBSD is in some ways pushing the envelope to the extreme when compared to other open source operating systems: OpenSSH, bgpd and pf, to name a few. You're a bright guy, Theo, I know you are. Get with your guys and figure out how to monetize those good bits. Selling CDs and DVDs isn't the way. Selling support services is a step in the right direction, but isn't going to get you too far. Offer custom development services, customization and so on? Ok, ok, sure, not bad. But you can do better than that; it all depends on how far you want to take it. Oh wait, there's a thought. A $10,000/year subscription service that gets your corporation fed security patches 72 hours before everyone else? How about a $30/year subscription service for individuals that gets you a login to download new releases 2 weeks before everyone else? Oh, right, but then one guy would grab it and set up a torrent or something. Think about it this way, though. At least then you'll have something legitimate to rant about.

Sunday, March 26, 2006

The Complete FreeBSD

One month ago, on February 26, 2006, I strong-armed Greg "grog" Lehey over IRC into letting me provide him an account on a machine to act as a download mirror for his book, The Complete FreeBSD. This was while I was drafting the article submission to Slashdot, just in case. The announcement that the book was now available for free under the Creative Commons Share-Alike license had been made several days prior and syndicated on a number of other, more niche geek news sites. After the /. article went live on the 27th, Greg decided to make use of the mirror and switched the primary download site to Evilcode.net. Shortly thereafter traffic leveled off at around 7Mbit and slowly tapered off over the next few days. All this from a post that did not even hit the main page.

Now, one month later, Evilcode.net is still acting as the primary download mirror for The Complete FreeBSD. By my rough count it has consumed so far in the neighborhood of 75GB of transfer, which breaks down as, roughly mind you... 7500 copies of the PDF version, 3400 copies of the PostScript version and 1000 copies of the book sources. Not too shabby!

I would like to extend my most sincere thanks to grog, not just for disseminating this valuable resource openly as he has done, but also for its existence in paper form in the first place. I still have fond memories of receiving and reading my copy of the second edition some years ago when I was a FreeBSD novice. Not only that, but thanks also for years of valuable contributions to the FreeBSD project. The developer community would not be the same without you, Greg. You are one of the good guys.

Direct link to The Complete FreeBSD page on lemis.com

Saturday, March 04, 2006

Receiving asynchronous notifications of database changes, part 2

As I trudge forward with this explanation please keep in mind that Epidemic is a proof of concept. I'm not saying that it wouldn't work fine in production as-is, but odds are it will bring down your entire infrastructure AND club all of your pet baby seals.

Let's quickly step through the code, starting with a simple entry point, mail_server.py. After pulling in the required classes and whatnot, the first thing this piece of code does is instantiate SQLNotifyDispatch, which inherits from SQLNotify. These two classes form the core of the code that does all the heavy lifting we wanted to avoid in our frontend. It cannot even fathom what to do, however, without a little help. Rather than telling it what tables to watch, and also implementing code to take action when something happens to a watched table, the framework has been designed to allow one to simply implement the code that takes action, and let that code inform SQLNotifyDispatch what exactly needs to be watched.

This brings us to mail/server.py, in which I will focus on the mail_users class. This class performs actions when modifications are made to the mail_users table. The class name does not necessarily need to mimic the table name, as can be seen if you look back at mail_server.py. First, the mail_users class is instantiated, and then it is registered with the dispatch object we created before. You will notice the first argument to the Register method is 'mail_users'; this is where the name of the table to watch is defined. Back to the mail_users class: now that we've let the Dispatch end of things know what table to watch, we need to let it know what columns we are interested in. This is where the GetCol method of the registered class comes in. As you can see, its implementation is very simple: return ['gid', 'dba', 'username', 'quota_bytes']. This tells Dispatch that we are interested in changes to those 4 columns. Changes to other columns in the table are fantastic, but we don't need to be notified of them.
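To make the registration handshake concrete, here is a minimal sketch of the idea, not the actual Epidemic code. The Register and GetCol names and the column list come from the description above; DispatchSketch, MailUsers and the watched dictionary are stand-ins I made up for illustration, and the real SQLNotifyDispatch does a great deal more.

```python
class MailUsers:
    """Hypothetical worker for the mail_users table."""

    def GetCol(self):
        # The columns we want to hear about; changes to any other
        # column in the table are ignored.
        return ['gid', 'dba', 'username', 'quota_bytes']


class DispatchSketch:
    """Stand-in for SQLNotifyDispatch; it only records registrations."""

    def __init__(self):
        self.watched = {}

    def Register(self, table, worker):
        # The caller names the table; the worker itself says which
        # columns matter, so that knowledge lives in one place.
        self.watched[table] = (worker, worker.GetCol())


dispatch = DispatchSketch()
dispatch.Register('mail_users', MailUsers())
```

The point of the design is that adding a new watched table means writing one worker class and one Register call, with nothing else to configure.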

The real meat is contained in backend/sql.py. Digging down through SQLNotifyDispatch and into SQLNotify, you will notice a number of static definitions in the SQLNotify class, such as TableCreate, Function, Trigger and Rule. This is the heart of the whole operation. When SQLNotify is told the table and columns to watch, it crafts a table to log changes to, and a function/trigger/rule trio specific to the table being watched that serves two purposes. First, it ensures that any operations happening on the watched table get stuffed into the log table. Second, it fires off PostgreSQL's NOTIFY to let us (SQLNotify) know that a change has happened. That's right, it lets "US" know, because once this is all registered with the database it is out of our hands. This means that SQLNotify only has to make these changes the first time it watches a database. It also means, and much more importantly, that no matter what happens to "US", the application which set up the watching and is acting on changes, no changes will be lost. The application could crash, no matter. As soon as we come back up, we can poke into the log tables and see what has happened while we were away. It is possible in some cases to receive notification of changes a bit later than one would like, but you will always, always know if changes were made.
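For a feel of what such a generated log table and function/trigger/rule set might look like, here is a rough sketch that builds the statements as strings. This is my own simplified guess at the shape, not the contents of TableCreate, Function, Trigger or Rule in backend/sql.py. The build_watch_sql name, the _log / _log_fn / _log_trg / _notify_ naming scheme and the choice to store every watched column as text are all assumptions. It also logs every row change rather than only changes to the watched columns, and it assumes the table and column names are trusted identifiers.

```python
def build_watch_sql(table, columns):
    """Return DDL (as strings) that logs and announces changes to `table`.

    Assumes `table` and `columns` are trusted identifiers, not user input.
    """
    log = f"{table}_log"      # hypothetical naming scheme
    func = f"{table}_log_fn"  # hypothetical naming scheme
    col_list = ", ".join(columns)
    col_defs = ", ".join(f"{c} TEXT" for c in columns)
    new_vals = ", ".join(f"NEW.{c}::text" for c in columns)
    old_vals = ", ".join(f"OLD.{c}::text" for c in columns)

    statements = [
        # 1. The log table: one row per change, watched columns as text.
        f"CREATE TABLE {log} (id SERIAL PRIMARY KEY, op TEXT NOT NULL, {col_defs})",
        # 2. The trigger function: copy the changed row into the log table.
        #    A DELETE has no NEW row, so it logs OLD instead.
        f"CREATE FUNCTION {func}() RETURNS trigger AS $$\n"
        "BEGIN\n"
        "    IF TG_OP = 'DELETE' THEN\n"
        f"        INSERT INTO {log} (op, {col_list}) VALUES (TG_OP, {old_vals});\n"
        "        RETURN OLD;\n"
        "    END IF;\n"
        f"    INSERT INTO {log} (op, {col_list}) VALUES (TG_OP, {new_vals});\n"
        "    RETURN NEW;\n"
        "END;\n"
        "$$ LANGUAGE plpgsql",
        # 3. The trigger: run the function for every changed row.
        f"CREATE TRIGGER {table}_log_trg AFTER INSERT OR UPDATE OR DELETE "
        f"ON {table} FOR EACH ROW EXECUTE PROCEDURE {func}()",
    ]
    # 4. The rules: a rule covers one kind of statement, so there is one
    #    NOTIFY rule per event, all on a channel named after the table.
    for event in ("INSERT", "UPDATE", "DELETE"):
        statements.append(
            f"CREATE RULE {table}_notify_{event.lower()} AS ON {event} "
            f"TO {table} DO ALSO NOTIFY {table}"
        )
    return statements


statements = build_watch_sql("mail_users", ["gid", "dba", "username", "quota_bytes"])
```

Because all of this lives in the database, the logging keeps happening whether or not the application that created it is running, which is exactly what makes the crash recovery story work.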

When a change does happen, PostgreSQL fires off NOTIFY as dictated by a rule the SQLNotify class created for us. As a result, our application is asynchronously notified of the change (that's right, no polling!). When SQLNotifyDispatch receives one of these notifications, it calls the appropriate method in our worker class, INSERT(), UPDATE() or DELETE(), depending on which operation happened in the database. As you can see in mail/server.py, those three methods do various operations, such as creating directories on the filesystem, setting up or changing quotas, or deleting directories.
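The dispatch step itself boils down to something like the following sketch. The INSERT(), UPDATE() and DELETE() method names come from the worker class described above; dispatch_changes, RecordingWorker and the (operation, row) tuples are hypothetical, and the worker here only records what it was asked to do instead of touching the filesystem.

```python
class RecordingWorker:
    """Hypothetical worker that records calls instead of doing real work."""

    def __init__(self):
        self.calls = []

    def INSERT(self, row):
        self.calls.append(('INSERT', row))  # e.g. create a mail directory

    def UPDATE(self, row):
        self.calls.append(('UPDATE', row))  # e.g. change a quota

    def DELETE(self, row):
        self.calls.append(('DELETE', row))  # e.g. remove a mail directory


def dispatch_changes(worker, log_rows):
    """Call the worker method named after each logged operation, in order."""
    for op, row in log_rows:
        if op not in ('INSERT', 'UPDATE', 'DELETE'):
            raise ValueError('unexpected operation: %r' % (op,))
        getattr(worker, op)(row)


worker = RecordingWorker()
dispatch_changes(worker, [
    ('INSERT', {'username': 'alice', 'quota_bytes': '1000'}),
    ('UPDATE', {'username': 'alice', 'quota_bytes': '2000'}),
    ('DELETE', {'username': 'alice', 'quota_bytes': '2000'}),
])
```

In the real thing the rows would be read from the log table once a notification arrives, or at startup to catch up on whatever was missed.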

It is all just a bit more complicated than what is typically done when this type of functionality is needed, I admit. My personal experience dictates, however, that this type of approach actually ends up being much simpler and easier to maintain down the road when one starts to grow an infrastructure.

Those who are tied to MySQL are not prevented from using a nearly identical solution. With the introduction of MySQL 5.0, it is all very possible, save for the asynchronous notification features. Those, while elegant and conducive to good performance, are certainly not required. Keep in mind that polling log tables which are consistently pruned is in general going to be faster than checking a datestamp or even an indexed "updated" boolean column on a very large table.
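A polling loop over a pruned log table can be sketched as below. To keep it runnable anywhere I am using Python's built-in sqlite3 module as a stand-in for MySQL, so treat this strictly as an illustration of the polling and pruning pattern. The drain_log function, the mail_users_log table and its columns are hypothetical, and a real deployment would call drain_log on a timer.

```python
import sqlite3


def drain_log(conn, handle):
    """Process pending log rows in order, then prune them.

    Returns the number of rows handled.
    """
    rows = conn.execute(
        "SELECT id, op, username FROM mail_users_log ORDER BY id"
    ).fetchall()
    for row_id, op, username in rows:
        handle(op, username)
    if rows:
        # Prune only what we actually processed, so rows logged in the
        # meantime are picked up by the next poll.
        conn.execute("DELETE FROM mail_users_log WHERE id <= ?", (rows[-1][0],))
        conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mail_users_log ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, op TEXT, username TEXT)"
)
conn.executemany(
    "INSERT INTO mail_users_log (op, username) VALUES (?, ?)",
    [("INSERT", "alice"), ("UPDATE", "alice")],
)

seen = []
first = drain_log(conn, lambda op, user: seen.append((op, user)))
second = drain_log(conn, lambda op, user: seen.append((op, user)))
```

Since the log table only ever holds unprocessed changes, each poll stays cheap no matter how large the watched table grows.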