this is totally gonna work… » Technology

Launch Day!!!

June 23rd, 2008

I’ve spilled a lot of (virtual) ink in this blog, but almost none of it about what I do all day. That’s because I’ve been working at a startup in “stealth mode” for darn near two years and haven’t been able to really say much about it. Until today.

Today at Evri we’re launching the beta version of our site. If you head to the home page and sign up, you’ll get yourself a free set of brand-new shiny credentials that will give you the keys to data-surfing heaven.

homepage

The company blog post does a good job of highlighting what we have available, but for the truly lazy I’ll give you the quick highlights.

First is the home page, which gives you a look at entities and their relations as we understand them currently. We start out with lists ranking the top people, places and things. In addition to popularity, you can also see who is rising and falling in popularity over time. All of these lists are clickable and enable our super-whizzy widget, which provides a nice way of pivoting between entities, all the while getting related content.

This is one of my favorite parts of the product, as it’s really easy to just get lost wandering around from link to link to see how things are related. It’s like a big six-degrees-of-Kevin-Bacon game, except that you can do it with just about anything. Part of that link-hopping experience is visiting specific pages about each entity.

Here you can find more detailed content about an entity. Currently these details come from Wikipedia, but we anticipate adding several other specific sources of structured content. And of course, these pages link you to other pages, so between the hub-and-spoke visualization and the detail pages you can spend quite a bit of time just data-grazing.

evri profile page for bon iver

So take it for a spin and have some fun exploring! Give us your feedback (a link is at the top of each page) and let us know what you like and don’t like. Most importantly, stay tuned. This is just the beginning for us; we have some pretty exciting stuff in the works and you won’t want to miss out.

Things have been a bit nutty the last few days–as they are for any release–but it will be worth it for the satisfaction of finally lighting this candle.

Enjoy!

S3 Sync Update

April 5th, 2008

A while back I wrote about my strategy for synchronizing between two machines using Amazon’s S3 and the JungleDisk tool. I just wanted to post a quick update that refines that strategy a bit. First, let me describe what needed improvement. I sync the ~/Documents directory between my home and work MacBooks. However, on my home machine I have some extra files that really don’t belong on my work machine (like Quicken files), so I have a small text file (called sync_files) that enumerates which sub-directories and files in ~/Documents are to be synchronized between the two machines.

This all worked pretty well until I noticed duplicates of files appearing in different places. I realized that what had happened was that I had moved the files on one of the disks and then sync’d with S3. With my current scripts this resulted in copying the file to the new location, but not removing the old one.

So with a quick glance at the rsync man-page, I found the --delete option. I refined my scripts and ran them. It all looked good–until I got home. Oops, I just lost a whole bunch of files. Uh-oh. It turns out I forgot to use the sync_files file for both directions. This was an easy tweak but reminded me of the Golden Rule of Rsync:

Always run rsync with --verbose and --dry-run to make sure it’s doing what you think it’s doing

So I decided it was time to re-write the script to support this. While you can do command-line options with bash, it quickly gets kinda oogy, so I fell back on Ruby instead. I’ve also collapsed the synchronizing down into a single script–one that goes both ways. So without further ado, you can download the script here. This should work with a stock Ruby install, no special gems required.
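
For readers who can’t grab the download, here is a minimal sketch of the idea, not the actual script. It assumes a hypothetical --live flag, made-up paths (the post doesn’t say where JungleDisk mounts the S3 volume) and helper names invented for illustration. The rsync flags themselves are real; --recursive is spelled out because rsync’s documentation notes that --archive does not imply it when --files-from is used. The sketch only prints the commands, and it does nothing to resolve conflicts between the two sides.

```ruby
require "optparse"
require "shellwords"

# Parse command-line flags. Dry-run is the default, so a live run must be requested.
# The --live flag is a hypothetical name chosen for this sketch.
def parse_options(argv)
  options = { live: false }
  parser = OptionParser.new do |opts|
    opts.banner = "Usage: sync.rb [--live]"
    opts.on("--live", "Actually copy and delete files") { options[:live] = true }
  end
  parser.parse(argv)
  options
end

# Build the rsync argument list for one direction.
# The same list file is passed no matter which way the sync is going.
def rsync_command(source, dest, files_from:, live: false)
  cmd = ["rsync", "--archive", "--recursive", "--verbose", "--delete",
         "--files-from=#{files_from}"]
  cmd << "--dry-run" unless live
  cmd + ["#{source}/", "#{dest}/"]
end

# Hypothetical paths, for illustration only.
local  = "/Users/me/Documents"
remote = "/Volumes/JungleDisk/Documents"
list   = "/Users/me/Documents/sync_files"

options = parse_options([]) # swap in ARGV in a real script
[[local, remote], [remote, local]].each do |source, dest|
  cmd = rsync_command(source, dest, files_from: list, live: options[:live])
  puts Shellwords.join(cmd) # print only; run with system(*cmd) once the output looks right
end
```

Making the dry run the default means the dangerous behavior has to be asked for explicitly, which is the Golden Rule turned into code.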

Update 4/8/2008: Okay, I still don’t know what the hell I’m doing. There is a bug with this script in that if you create a new file locally then try to sync from S3, your new file will get obliterated. Well guess what kids? Synchronization is hard. I’ve been noodling on a variety of hacks to get around this but none are terribly satisfying. Anyway, my Golden Rule (see above) still stands: make sure you test the thing out before you run it “live”.

Green Fields

March 4th, 2008

Ahhh…a fresh new server install. It’s like getting the first squeeze of toothpaste or the first scoop of peanut butter. It feels especially good because it’s all yours. It brims with potential and has no marks of anyone else’s will upon it. Setting up a server for yourself is liberating because there are no constraints, no corporate policies to adhere to. Right or wrong, you get to call all the shots and take all the responsibility. You finally get to do what you always dreamed of when that sneering, omnipotent system administrator shot down all your ideas. Petty bureaucracy will not stand in your way!

I’m waxing poetic because I just acquired my first Slicehost account for a side-project I’m working on. I’ve done dozens upon dozens of Linux installs over the years, so why should this be special? Perhaps it’s worth a trip down memory lane first…

If you discount those early years writing BASIC for the Apple II and TRS-80, I’ve been living Unix/Linux longer than I’ve been programming. In that time I’ve gotten rather particular about my distros and how I like to configure them. I first cut my teeth on HP-UX working at a wireless telco. Some co-workers there introduced me to Linux, which was like having an HP-UX-like interface on a 486 under my desk at home.

Like many, my first distro was the tower of floppies known as Slackware. Ah, the good old days. When setting up X might have entailed destroying your monitor if you didn’t get the parameters right. When I started doing actual development I moved over to Red Hat because it was the most polished distro at the time. After that I had a brief flirtation with SuSE, but found the configuration frustrating, although I did have a major success getting dial-up internet working via a number of arcane Hayes modem commands and some network scripts I found on the web.

Then I discovered Mandrake, which I stuck with for a number of years. It was Red Hat-based, but had a much better installer and better package manager. In the end though, even the package management improvements could not overcome the inherent flaws of RPM. On countless occasions I would upgrade and the entire sweater of package dependencies would unravel, and suddenly I would need to upgrade to a new version of glibc just to get a decent RSS reader working in KDE. Ugh.

It was then that my most Linux-geeky friend turned me on to Gentoo. You will learn more about Linux kernels and configuration running Gentoo than you ever imagined. Mind you, I didn’t get into Gentoo because I wanted to hyper-optimize my install of emacs. I switched because Gentoo’s package management and configuration beat the snot out of Mandrake and I found the ebuild system rather elegant.

Unfortunately Gentoo took a toll on my patience with Linux on the desktop. Gentoo required far more care and feeding than I could give it. I spent far more time building and re-building my system than actually doing anything with it. I will point out though, that doing a bare-metal Gentoo install all the way to a full-blown desktop manager like KDE or Gnome has to be about the best hardware burn-in test there is.

So then I hopped on the Ubuntu band-wagon. While there are far too many ways to install packages (apt-get? dpkg? adept? wirble?), the Debian package management is a nice compromise between pre-built package systems like RPM and configurability (a la Gentoo). So that’s what I’ve put on my new Slicehost host. It’s well-documented, has tremendous community support behind it and is much more up to date than its older, conservative sibling, Debian.

So back to Slicehost. So far the experience has been tremendous. I would say that less than a minute elapsed between the time I decided to give them a credit card number and when I got a console login. They have several pre-built server images you can use to provision your slice. And now it’s mine…all mine. I get to set it up just the way I like.

So what’s in the soup, you may ask?

Web Server: nginx

I’ve done a lot of Apache over the years and it’s hard to overestimate the impact that it has had on the web. However, its configuration has become nightmarish. In this case my needs are pretty narrow, so I’ll go with something easier to work with. Plus it has a rad name.

App Server: Rails!

I hope I never have to do Java-based web development again. As far as dynamic languages go, I’ve thrown my hat in with Ruby rather than Python, although I hear great things about Django and I’d like to play with it. There are certainly other viable Ruby web frameworks (Merb, Camping, …) but for this project I’m more interested in getting things done than learning a new web-stack.

Database Server: MySQL

I really really really wanted to do this with Postgres since it has just about the best command-line tool I’ve seen for any RDBMS. I also find the permission model in MySQL to be byzantine and difficult to debug. But alas, scaling out to multiple nodes with read/write master/slave and replication is much more do-able in MySQL than Postgres. There’s a ton of existing literature out there on it and I’ve had experience doing it.

Mail: Postfix

Eventually this app will need to support incoming mail so simple MTAs like exim or esmtp weren’t going to cut it. Since I’m deathly afraid of Sendmail, I figured I’d go with Postfix. This is probably the tool I know the least about right now, but it should be interesting to learn.

Monitoring: monit or god

I’ve used other commercial and open-source monitoring tools and they’ve all felt really heavy-weight and invasive. I don’t so much mean “heavy-weight” in a CPU-sense, but in a process and procedure sense. Looking at the sample configurations for both god and monit put a smile on my face. god looks particularly interesting because it’s in Ruby.

Text Search: solr, ferret, sphinx

I’ve spent a little time looking at all of these solutions for inverted-index, full-text search. I’m not 100% sure where (or if) this app will need full-text search so this area is a little bit hazy. I’m not committed to a particular implementation technology here (i.e. running Solr which is in Java is totally viable).

Message Queue: …..

I’m more sure that we’ll eventually need to have asynchronous, queue-based work processing to handle things like image resizing and storage or document generation. Like the text search, running something like a Java-based message queue (like ActiveMQ) is entirely possible. A major piece of the selection criteria will be based on ease of integration.
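
To make the shape of that concrete, here is a small in-process sketch using Ruby’s built-in thread-safe Queue. It is only a stand-in: a real message queue such as ActiveMQ runs as a separate service and survives restarts, which this does not. The job types, file names and the process_jobs helper are all hypothetical, and the “work” is just building a string.

```ruby
# Run jobs on a small pool of worker threads and collect what they report.
# process_jobs and the job hashes are hypothetical names for this sketch.
def process_jobs(jobs, worker_count: 2)
  queue   = Queue.new # pending work, safe to share between threads
  results = Queue.new # finished work, also thread-safe

  workers = Array.new(worker_count) do
    Thread.new do
      # Queue#pop blocks until something arrives; :done tells a worker to stop.
      while (job = queue.pop) != :done
        results << "#{job[:type]} finished for #{job[:file]}"
      end
    end
  end

  jobs.each { |job| queue << job }      # the caller enqueues and moves on
  worker_count.times { queue << :done } # one stop signal per worker
  workers.each(&:join)

  # Drain the results; sorted because completion order is not deterministic.
  Array.new(results.size) { results.pop }.sort
end

finished = process_jobs([
  { type: "resize", file: "photo.jpg" },
  { type: "render", file: "invoice.pdf" }
])
puts finished
```

The point is the hand-off: the code that accepts an upload only enqueues a job, and the slow part happens elsewhere.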

So we’ll see how this goes. I run my own Linux box at home to host this blog, a Subversion repo and a couple of other things. It’s nice to have a virtual equivalent sitting on much fatter pipes. I’m excited to see where this goes.

High Resolution

February 2nd, 2008

This week I tripped across Edward Tufte’s review of the iPhone interface. His focus in the video, as it is in his workshops and books, is on data density and high resolution. In short, when you have a high resolution surface, you should take advantage of it by providing high data density.

In this vein, I had a small epiphany yesterday around this very feature of the iPhone. I had a paired-programming interview with a candidate yesterday who needed ‘net access. At work, we have a public access-point for this very situation. However, getting on the network requires credentials, which were posted on the wall in a different room. So I walked over to the other meeting room and tried to take a mental snapshot of the information that was posted on the wall. Uh-oh, the password was all in l33t-speak and I was not going to be able to retain that in my short-term buffer on the walk back to the other room.

So I pulled my iPhone out of my pocket to write the SSID and password down. But then I thought to myself, “Wait a sec. Why go through that many layers of translation?” What I really wanted was to take the paper off the wall and back to the other room, but then I’d have to carefully pull the tape off the wall and then remember to put the paper back. Yuck. Too much effort.

So I took a picture. That’s it. So simple. The most direct translation of the information from one location to another. I snapped the photo, made the twenty-foot journey back to the interview room and just laid the phone down on the table. The interviewee was able to easily read the words off of the iPhone’s high-res screen and we were off and running.

Perhaps this is an indictment of the poor quality of your typical embedded camera. Perhaps it’s an indictment of our low expectations of that technology. We’ve simply come to expect crappy photos that, at best, outline an image but could never actually provide any detail. This was different. You could actually read the words clearly. It was such a simple thing to transport that information with a picture, but powerful all the same.

Daughters in Computing

July 4th, 2007

I’m a big podcast fan. There aren’t many I listen to, but the few I enjoy accompany me and my dog on a near-daily basis. One of my favorites is Geoffrey Grosenbach’s excellent Ruby On Rails Podcast. In particular the last two episodes, about women in computing, really got me thinking about education and technology. If you haven’t had a chance to check them out, they are worth listening to.

RailsConf 2007 had the best female turnout that I’ve observed by far of any conference I’ve attended—and sadly, it was still too little. I’m not talking about having gender diversity for the sake of diversity, but that the field of software is truly missing out on a lot of talented women who have been turned away for one reason or another. Why is this happening?

I believe that it’s a combination of education and culture that has made it this way. The curriculum that has formed around computers in schools is frankly appalling. By the time students leave high school very few have had the stomach to persevere through restricted access to machines, social stigma and bad teaching (no, learning MS Word does not count as computer education).

Culturally, computing reveres the lonely genius who springs some great idea on us. Social interaction through and with computers has always been awkward and fringey, despite the success of things like IM, Twitter or MySpace. Within the so-called “geek-culture”, independence is prized over group interaction. Frankly I think this turns a lot of girls off. The cultural fabric of computing has become so rooted in shielded individuals that it simply doesn’t occur to many girls that getting involved with computing is something they might enjoy. I hold out hope that various agile methods will be adopted more widely with their emphasis on collaboration.

So what can we do? I may get hate-posts for this but I’m not sure there is much we can do for the current high-school aged female population and beyond. Each person (male and female) has acclimated to gender norms in our culture and has enough experience that changing many attitudes would be difficult. I think we have to improve the curriculum, access and social attitudes at a much younger age.

We are obligated to remove the cult of priesthood, masculinity and mystery from computing and put these wonderful tools in front of kids as soon as we can. So here’s my proposal: since the field of computing is so male-dominated right now, fathers who work with computers owe it to their daughters to make extra-sure that they get exposed to computers. This means when you’re working at home on some deathly-important thing and your little girl wants to type on the keyboard you stop what you’re doing and let her type in a word-processor or text editor. This means that you let her play around with and click on things without telling her what to do. This means you don’t scare her out of interacting with it. This means when she asks questions you try to answer them the best way you can.

My four-year-old daughter likes to “type some letters” and “play the drums that I like” on my MacBook. This translates to saying and typing letters into TextMate and playing drums via the keyboard in GarageBand. I never say no. I always haul her up in my lap, close whatever else I’m doing and let her play, explore and ask questions. Yes, this once resulted in crashing the window manager on my Linux box, but that was okay. Just the simple tactile experience of touching the machine is important.

Do I hope she follows in dad’s footsteps? Not necessarily. I only hope that she’s as fortunate as I am to have discovered what I like to do and can make a career out of it. What I certainly don’t want is to have this avenue closed off to her as was the common experience of my wife and most of my female friends. It’s going to take some time and effort to get girls engaged. We (computer nerds) desperately need them if we’re to continue to innovate, learn and grow.

Getting Leverage

May 31st, 2007

Right after getting back from RailsConf I headed off to a family trip to Southern California. I only just got back this weekend and I feel like I haven’t quite had the time to fully digest the conference. As a result I’ve had a lot of odd thoughts rattling around in my head, most of which have failed to form into anything cogent. Well, except for one..

One of the important takeaways I got from the conference wasn’t specific to Ruby or Rails, but rather a larger concept of leverage. It occurred to me that the people and organizations that are really shaping our technological lives are doing so by a judicious application of force. Our world is simply too complex to blindly apply raw force to problems. Instead it seems to me that the thought-leaders are finding and using force-multipliers of various types to build the future. Three examples of this pop into my head. Whether or not they constitute the best examples is debatable, but I believe that they all share a common underlying principle.

The first is Ruby on Rails. I’m not here to sing the praises of Rails and tell you that you should do all your web dev in it. I happen to like Rails and think it’s got a lot going for it. But for the purposes of this post I want to talk about the choice by its creator in using the Ruby programming language as the foundation for Rails. For the Rails team, the choice to use Ruby was crucial to the success of the framework. Not everyone likes Ruby and that’s fine. But it is a high-leverage language. I’m not aware of anyone who has adopted its use and not become more productive.

The Rails team and David Heinemeier Hansson deserve a lot of credit for developing Rails, but a lot of that credit should go towards the decisions they made about the tools they built and their choice of technologies to rely on. Rails isn’t worth looking at because it’s an ingenious piece of engineering; Rails is worth looking at because it applies maximum force with minimal effort.

The second example is Google. It’s hard to imagine web-life without Google. But what makes Google so special is not simply that they apply sheer mass of brain-power to sheer mass of computing-power (though it doesn’t hurt). Instead it seems to me that Google has made the major effort up-front to build the world’s most powerful clustered system (i.e. their core search engine), then they have leveraged both the data and the infrastructure from that core for their other services. The things they are doing with Docs & Spreadsheets or Google Gears wouldn’t be possible without leveraging that first asset and these types of products are clearly part of the next generation of services.

My final example is similar to Google, and that is Amazon Web Services. Much like Google, Amazon has built an impressive infrastructure to power their core retail business. At first glance AWS seems totally unrelated to that core business. But a second look reveals some pretty smart thinking.

What Amazon has on its hands is a massive amount of computing power set up across the globe. Amazon’s chief asset is the very infrastructure that its retail business is built on. AWS is simply the commercially-available version of that infrastructure. Whether Amazon has excess computing power that it’s selling or it has figured out how to grow that power is largely irrelevant to this discussion. I’m assuming that Bezos and his minions are smart enough to figure out a way to sell this service and not make it a loss-leader. The important point here is that AWS could be a very disruptive technology. The flexibility and scalability that AWS provides is something that simply doesn’t exist at a price-point where anyone can use it.

I don’t think Amazon’s retail business is going to shrivel and blow away, but I think the creation of AWS reveals some shrewd long-term planning for the overall health of the company. Clearly Amazon is attempting to diversify and it is doing so by taking advantage of its greatest assets. Amazon’s server infrastructure is the force-multiplier behind AWS.

So while these three examples may seem a little disparate and scatter-brained I think they reflect an overall emerging trend in high-level thinking. Think of job interviews you may have had where someone wanted you to implement a merge-sort algorithm or write a database connection pool. I understand the mechanical intent behind these questions (does the candidate have basic technical skills?), but the answers that are typically sought miss a more important point. If I were to give that question to a prospective candidate I would hope they would answer with one line of code that calls a built-in sorting routine that comes with whatever platform we’re using. I simply don’t care that they can write a merge-sort. I can’t imagine working on a project where its success was dependent on us writing sorting algorithms. So I’m much more concerned with whether or not they know how to use it and what it means in context.
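
In Ruby, for instance, that one-line answer might look something like the sketch below. The file records are made-up data for illustration; sort_by is part of Ruby’s standard Enumerable module.

```ruby
# Hypothetical data: file names with sizes in kilobytes.
files = [
  { name: "notes.txt",  size_kb: 12 },
  { name: "photo.jpg",  size_kb: 2048 },
  { name: "budget.xls", size_kb: 340 }
]

# The "one line" answer: hand the work to the platform's built-in sort.
# Negating the key sorts from largest to smallest.
largest_first = files.sort_by { |f| -f[:size_kb] }

puts largest_first.map { |f| f[:name] }.join(", ")
```

What is worth probing in an interview is the part around that line: which key to sort on, which direction, and what the ordering means for whoever consumes the result.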

The sheer amount of information to master in our technical fields is daunting. Frederick Brooks acknowledges this in the 20th anniversary edition of “The Mythical Man-Month” by lamenting the fact that he has simply had to stop following a number of developments in the field. We can’t keep worrying about bit-shifting and low-level algorithms and hope to make any progress. While there has to be somebody who worries about this stuff, we need a much larger population that leverages these tools to build something truly wonderful. In short, let’s find those things that let us get maximum force with minimal effort. For me right now, those things are tools like Ruby and my beloved MacBook Pro. As time goes on I’m sure I’ll find new tools to supplant these. The important thing is to keep evolving and adapting to find that leverage.