Is Google the general-purpose Napster?

Napster has become an icon in the war between entertainment industry giants on one side, and fans, legal activists, and computer programmers on the other. The Recording Industry Association of America (RIAA) technically “won” that battle, but they seem to be losing the ongoing war. And the Motion Picture Association of America (MPAA) is intensely interested in that outcome.

It’s hard to discuss the issue, though, because there are so many players involved, they’re actually fighting for or against different things, and they’re frequently using arguments that have more to do with politics than with their actual goals. For instance:

  • Fans talk about sharing and mix tapes, while admittedly lots of them just like free stuff.
  • Legal activists talk about First Amendment and Fourth Amendment rights and fair use, but often in the larger context of resisting the corporatization of government.
  • Computer programmers talk about non-infringing uses of technology and the technical impossibility of implementing certain regulations, while often echoing the legal activists’ resistance to corporatization.
  • The recording industry talks about piracy, theft, and the rights of the artists, while their real concern is control of the marketing and distribution channels.

This last point is the one I want to look at.

The official position

The RIAA has long held that any duplication of an audio recording made without the express permission of the copyright holder is, and should be, illegal. This is not technically correct.

There is, at a minimum, the fair use doctrine, which holds that works can be copied and reproduced for purposes such as comment, news reporting, teaching, scholarship, and research. So the RIAA is already stretching the truth. They have been fairly direct in their efforts to limit the scope of fair use and restrict its application, but so far these exemptions exist.

Based on creative accounting

Have they stretched any other truths? Courtney Love and Janis Ian both argue pretty persuasively that protecting the artists is not what this is all about. Even a million-selling album barely makes the artists any money, while various industry segments take all the profit.

But repeated studies, some of which Ian mentioned, have shown that Napster users actually spent more money on new CDs. So why would the RIAA oppose it? For that you have to look at what music Napster users were buying. It was more likely to be back catalog or from a minor label. And the RIAA is designed to promote blockbuster releases from Top 40 artists.

So killing Napster was really about restricting the number of choices listeners have, and making sure the record labels own all of the remaining choices.

It’s not the music, it’s the information

Seen in this light, it’s clearer that the RIAA is not in the music business per se. They are in the marketing and distribution business. Napster performed both of these functions for free, and did it better than the RIAA could, because it directly reflected the preferences of the users. It’s no wonder the RIAA attacked them.

So here is the iconic battle: The status quo industry controls information and access. The upstart provides information from the users, to the users, without industry input. Industry uses legal means to squash the upstart. But the genie is out of the bottle.

Who does information now?

In the generic terms I just used, you can see echoes of this battle in online search and advertising. Google, through their PageRank algorithm, represents the preferences of their users. Marketers coming from the broadcast model, where money equals attention, claim a lack of “fairness” in the system, saying that the content producers should control how their sites appear in indexes.
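
If you’re curious, the core idea behind PageRank is simple enough to sketch. What follows is a toy power-iteration version of the link-voting idea, with made-up page names and the commonly cited 0.85 damping factor. It illustrates the concept; it is not Google’s actual implementation.

```python
# Toy power iteration over a link graph: pages "vote" for the pages they
# link to, and votes from highly ranked pages count for more. This is an
# illustration of the idea, not Google's production algorithm.

def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            targets = outlinks or pages   # dangling page: spread rank evenly
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank

# "home" is linked by both other pages, so it ends up ranked highest.
print(pagerank({"home": ["about"], "about": ["home"], "blog": ["home", "about"]}))
```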

The battle is playing out differently this time, for a few key reasons.

First, Google got very big, very fast. By the time traditional marketers realized what was happening Google was already dominant.

Second, traditional marketing was fragmented and highly competitive. They didn’t have a single trade group dictating the market the way the RIAA controlled the bulk of the music industry.

Third, Google had a business model before they came under attack, which meant they had money for legal defense.

Fourth, and possibly most important, Google knew the fight was about information and access. They weren’t distracted by arguments about artists and creators.

Which side are you on?

“You’re not promoting the right thing … Stop listening to what users say they like, I’m the one paying money, I’ll tell you what people should like … It’s not fair, I’ve spent all this time and money on market research and your system is saying that my product still isn’t popular …”

It sounds like the same arguments the RIAA made. Companies with a vested business interest are unhappy that people don’t like their product, and they’re trying to shoot the messenger.

Where are you in this fight? Insisting that Google should list you because you paid more? Or analyzing what PageRank says about how people view your content, and giving users what they want?

How To Triple Your Output By Cutting Your Output In Half

The fastest typist I’ve ever seen was a guy I used to work with. When he got going it sounded like a machine gun. Our standard line when people asked the rest of us about him was, “Yeah, he types over 150 words per minute … and about 40 of them are spelled right.”

Even after you ran his stuff through a spell checker, you’d have to proof-read very carefully to catch the places where “there” and “their” were mixed up. Where single letters showed up at random because the spell checker skips one-letter “words”. Where he left out words or just plain didn’t make any sense.

It usually took longer to fix his stuff than it would have taken to do it yourself from scratch. So why did we put up with it? Well, he was the boss. Keen.

But like anything else in life, you can at least learn something from the situation. What I learned from him was that it doesn’t matter how much you produce if no one wants it. Or put another way: Anything you do for someone else that isn’t up to their standards doesn’t count.

For an example of an entire industry that’s working as hard as possible to ignore this simple truth, compare television today to what it looked like in the 1950s.

The golden age

Back then there were three networks. How many prime-time shows did they collectively produce per week? Counting 8-11 p.m. Monday through Friday, that’s three 1-hour slots x five days x three networks = 45 hours of programming. Let’s say a third of those hours were broken up into half-hour shows (30 half-hours plus 30 hour-long shows), for a total of 60 shows per week.

Let’s assume that to have a consistently great show you needed six extremely talented writers and actors. (Yes, there’s a lot more to it. This is just an example.) The fewer good people you have, the less often your show will be good. To fill 60 shows, you might have the following mix:

Show quality         Talented staff per show   # of shows   Total talented staff
Consistently great             6                    1                 6
Excellent                      5                    2                10
Very good                      4                    2                 8
Good                           3                   18                54
Sometimes good                 2                   35                70
Generally poor                 1                    2                 2
Total                                              60               150

Obviously every network would like their shows to be better. But for whatever reasons there are only 150 people with the level of talent needed to produce a weekly show.

Double the output

Fast-forward a couple of decades. Now there are six networks. The talent requirements to produce a good show are the same, but there aren’t suddenly twice as many talented people available to do the work. The table above might now look a little something like this:

Show quality         Talented staff per show   # of shows   Total talented staff
Consistently great             6                    1                 6
Excellent                      5                    1                 5
Very good                      4                    1                 4
Good                           3                    6                18
Sometimes good                 2                   21                42
Generally poor                 1                   75                75
Unwatchable – bad              0                   15                 0
Total                                             120               150

Notice how many shows had to drop into the lower categories to make this work. The contrast is really obvious when you look at the two outcomes side-by-side.

[Chart: the two show-quality distributions compared side by side]

With twice as many total shows, there are barely half as many that are even sometimes good (30 versus 58), and only about a third as many that are usually good (9 versus 23).

These numbers are obviously a gross oversimplification, but they illustrate a point: by increasing output without increasing a limited but required resource, overall quality declines faster than total output increases.
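
If you want to play with the assumptions, the whole model fits in a few lines of Python. This sketch just re-tallies the two tables above; the thresholds (three talented people for a usually-good show, two for a sometimes-good one) are my reading of the tables, nothing more rigorous.

```python
# Re-tally the two scenarios from the tables: a fixed pool of 150 talented
# people spread across more and more shows. Each tier is (label, talented
# staff per show, number of shows).

def summarize(tiers):
    shows = sum(count for _, _, count in tiers)
    staff = sum(per_show * count for _, per_show, count in tiers)
    usually_good = sum(count for _, per_show, count in tiers if per_show >= 3)
    sometimes_good = sum(count for _, per_show, count in tiers if per_show >= 2)
    return shows, staff, usually_good, sometimes_good

three_networks = [("consistently great", 6, 1), ("excellent", 5, 2),
                  ("very good", 4, 2), ("good", 3, 18),
                  ("sometimes good", 2, 35), ("generally poor", 1, 2)]

six_networks = [("consistently great", 6, 1), ("excellent", 5, 1),
                ("very good", 4, 1), ("good", 3, 6),
                ("sometimes good", 2, 21), ("generally poor", 1, 75),
                ("unwatchable", 0, 15)]

for label, tiers in (("3 networks", three_networks), ("6 networks", six_networks)):
    shows, staff, usually, sometimes = summarize(tiers)
    print(f"{label}: {shows} shows from {staff} talented people; "
          f"{usually} usually good, {sometimes} sometimes good")

# 3 networks: 60 shows from 150 talented people; 23 usually good, 58 sometimes good
# 6 networks: 120 shows from 150 talented people; 9 usually good, 30 sometimes good
```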

500 channels and nothing’s on

Now fast-forward to today. There are literally hundreds of networks trying to fill 24 hours every day. Sure, with the amount of money available in the industry today there may be more people willing to work there.

But if my assumption is at all close to reality, then we would expect to see shows that clearly don’t have anyone talented working on them. We might even see shows where they simply send a camera crew out to film people without a script. We might call this a “reality” show.

So you’re not in the TV business. How does this apply to you? Everywhere in the example above, replace “talented staff” with “attention”. How much undivided attention do you have each day? How many ways are you dividing it? By trying to do more things, are you doing fewer things well?

Why no one wants software: a case study

No one wants software.

Really, no one.

What they want is documents … pictures … loan applications … insurance claims … Software is just another tool they can use to maybe produce more of the things they really want, better or cheaper.

What this means to legions of unhappy, cynical programmers is that no one cares about the quality of the code. Nope. They don’t. And odds are, they shouldn’t.

Here’s a little story to illustrate why. (By the way, this is the kind of thing you’ll see in an MBA course. If you don’t already do this kind of thinking, you should stop telling yourself there’s no value in getting an MBA.)

The pitch

I’m in charge at an insurance company. I have a manual, paper-based process that requires 100 people working full time to process all the claims.

Someone comes in and offers to build me a system to automate much of the process. He projects the new system could reduce headcount by half. It will take six months for a team of four people to build.

If you’re a programmer, and you think this probably sounds like a winner, try looking at the real numbers.

The direct cost

100 claims processors working at $8/hour, call it $16k a year each. That’s $1.6M per year in salaries. (Let’s leave benefits out of it. The insurance company probably does.)

Four people on the project:

  • architect/dev lead, $100/hour
  • junior dev, $60/hour
  • DBA, $80/hour
  • analyst/UI designer, $75/hour

Total $315/hour, or about $325k for six months’ work (call it 1,040 billable hours).

Still sounds like a winner, right? $325k for an $800k/year savings!

Except the savings don’t start for six months. So my first-year savings are at best $400k. Damn, now it’s barely breaking even in the first year. That’s OK though, it’ll start paying off nicely in year two.

The hidden costs

Oh wait, then I need to include training costs for the new system. Let’s figure four weeks of training before processors are back to their current efficiency. Maybe a short-term 20% bump in headcount through a temp agency to maintain current throughput during the conversion and training. Add the agency cut and you’re paying $15/hour for the temps. 20 temps x $15/hour x 40 hours x 4 weeks = $48k one-time cost. Now my first-year cost is up to $373k.

And don’t forget to add the cost of hiring a trainer. Say two weeks to create the training materials plus the four weeks of on-site training. Since this is a high-skill, short-term gig (possibly with travel) I’ll be paying probably $150/hour or more. $36k for the trainer.

So if everything goes perfectly, I’ll be paying $409k in the first year. And actually, I don’t get even the $400k savings. I can’t start cutting headcount until efficiency actually doubles. Generously assume that will be three months after the training finishes. Now I’ve got three months of gradually declining headcount, and only two months of full headcount reduction. Maybe $200k in reduced salary.

Of course you need to add a percentage for profit for the development company. Let’s go with 30%. So …

The balance sheet

Software ($325k + 30%)    $422.5k
Trainer                    $36k
Training (temps)           $48k
Total Y1 cost             $506.5k
Projected Y1 savings      $200k
Shortfall                 $306.5k
Y2 savings                $64k/month

The project breaks even near the end of the fifth month of year 2. And that’s if NOTHING GOES WRONG! The code works on time, it does exactly what it’s supposed to, I don’t lose all my senior processors as they see the layoffs starting, etc. etc. etc.

The other pitch

Then a lone consultant comes in and offers to build me a little Access database app. A simple data-entry form to replace the paper version, a printable claim form, and a couple quick management reports. Two months’ work, and I’ll see a 10% headcount reduction. The consultant will do the training, which will only take a week because the new app will duplicate their current workflow.

Software ($200/hour x 8 weeks)   $64k
Training ($200/hour x 1 week)    $8k
Total Y1 cost                    $72k
Savings                          $12.8k/month (starting in the fourth month)

The project breaks even six months after the project is done, so early in the ninth month of Y1. Since the scope was much less ambitious, the risk is also lower.
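
If you want to check my math, or argue with my assumptions, the whole comparison fits in a few lines of Python. This sketch treats a month as four 40-hour weeks, so 50 processors at $8/hour save $64k a month and 10 processors save $12.8k a month; the month counts are rough approximations, not a real financial model.

```python
# Rough break-even arithmetic for the two pitches, using the numbers above.
# Assumes a month of four 40-hour weeks: 50 processors at $8/hour save
# $64k/month, 10 processors save $12.8k/month.
import math

# Big pitch: $506.5k total Y1 cost against $200k of Y1 savings leaves a
# $306.5k shortfall, recovered at $64k/month once year 2 starts.
shortfall = 506_500 - 200_000
print(f"Big pitch: breaks even {shortfall / 64_000:.1f} months into Y2")   # ~4.8

# Small pitch: $72k up front, $12.8k/month in savings starting in month 4.
months_of_savings = 72_000 / 12_800                                        # 5.625
print(f"Small pitch: breaks even during month {3 + math.ceil(months_of_savings)} of Y1")
```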

The obvious choice

Which sales pitch do you think I will go with? Does that mean I don’t respect “proper” software development practices? And, the bottom line: should I spend more money on the “better” solution? And why?

The Digital Dark Ages

I’ve been paying my mortgage for about three years now. Unless I change something, I’m going to keep paying on it for another 27 years. I try not to think about the fact that although I have an actual physical copy of the mortgage agreement, with real pen-and-ink signatures, I don’t have any proof that I’ve ever made a payment.

At the risk of sounding like a Luddite, it bothers me that I have to trust the bank’s computer system to keep track of all 360 payments I’ll have made by the time it’s over. I’m not just being paranoid. I had an issue where a bank said my wife still owed money on a loan we had paid off three years earlier. We didn’t have anything in writing for each payment. The bank couldn’t even tell us the history of the loan; just that the computer showed we still owed money. And if a bank says you owe money, unless your lawyers are bigger than their lawyers, you owe them money.

If you go to museums, you’ll see ledgers from banks in the 1800s and earlier. Two hundred years later, we still know who paid their bills and when. But a payment from five years ago … the record doesn’t exist.

This could change with new regulations and retention requirements. But the big difference is what is standard vs. what you have to work at. A hundred years ago everything was written down. If you wanted to get rid of records you had to make an effort to identify what you wanted to delete, somehow separate it from the rest, and physically destroy it. Today, we only keep data as long as we have to. We only bother with long-term storage when the law or financial necessity makes us.

Let’s assume we have some data that we really want to keep “forever”. What is that going to take?

First, you’ll want to store it on something that doesn’t degrade quickly. Burning it to a CD or DVD seems to offer better longevity than VHS. Well, maybe. Second, you want to store it in a format that you’ll be able to read when you want to. This might be a harder problem than the physical longevity, when you start to consider how much data goes into a modern file format.

Look at the problem from the user’s perspective: the document format (the same applies to music and video) is just a way of saving the document so it can be opened and look the same way at a later time, maybe on the same computer, maybe not. When Word 97 handles table formatting and text reflow around images a certain way, for instance, the document format has a way of capturing the choices the user made.

If I open that Word 97 document in Word 2003, either the tables, text, and images look the same or they don’t. If they look the same, it’s because there’s an import filter that understands what the old format means, and Word 2003 has a way of representing the same layout. If I then save as Word 2003, the specific way of representing the layout has changed, but the user neither sees nor cares about the difference.

If, on the other hand, that Word 97 document doesn’t look the same in Word 2003, it really doesn’t matter to the user whether the problem is a bad import filter or whether Word 2003 doesn’t support the same features Word 97 did. (Maybe they used flame text.) So a format that technically captures all the information needed to exactly recreate a document is utterly useless without something that can render it the same way.

Okay, so we need long-term media, and we need to choose a format that is popular enough that there will still be import filters for it in the foreseeable future. Eventually we’ll still reach the end of those paths. Either the disks will degrade, or the file format will be so out of date that no one makes import filters any more. When that happens, the only way to keep our data will be to copy it to new media, and potentially in a new format.

What should that format look like? We’ve already got PDF, which is based on how something looks in print. We’ve got various audio and video formats, which deal with playing an uninterrupted stream. But what about interactive/animated documents designed for online viewing?

Believe it or not, I’m going to suggest a Microsoft solution, though it’s one they haven’t thought to apply this way: PowerPoint. Today nearly everyone has a viewer, but not so long ago most of the slideshows I got were executables. If you had PowerPoint installed you could open the executable and edit the slideshow the same way you can edit a PDF if you have Acrobat.

As much as people complain about the bloat that Word adds to simple files, I think the future of file distribution will be to package the viewer along with the file. At some point storage becomes cheaper than the hassle of constantly updating all those obsolete file formats. The only question is how low a level the viewers will be written to: OS family, processor architecture, anything that runs C, etc.

The day I got a lot smarter

One sign of intelligence is the ability to learn from your mistakes. An even better sign is the ability to learn from someone else’s mistakes. Unfortunately, we don’t always have the luxury of watching someone else learn a valuable lesson, and we have to do it ourselves. But if we pay attention, sometimes we get to learn multiple lessons from one mistake. (Lucky us.)

Case in point: Dealing with a crisis. I was managing a group of web developers, and the project lead on an integration with our largest client was going on vacation. He assured me his backup was fully trained, and would be able to deal with any issues. He left on Friday, and we deployed some new code on Monday. Everything looked good.

Time passes …

On Wednesday at about 4 p.m., we got a call asking about an order. We couldn’t find it in our system. From what we could tell, the branch that placed the order wasn’t set up to use our system yet, so we shouldn’t have the order. At 5 I let the backup go home for the day while I worked on writing up what we’d found. I sent an internal email explaining what I believed had happened. I said that I would call the client and explain why we didn’t have the order, and that they should check their old system.

While double-checking the deployment plan, I discovered that the new branch actually was on our new system … as of that Monday. That’s part of what was included in the new code. That’s when I got the shiver down my spine. By that time the backup, whose house was conveniently in a patch of bad cell coverage, was gone. The lead was on vacation. “Okay,” I thought, “I’ve seen most of this code, in fact I’ve written a good bit of it. I can figure this out.”

Stop laughing. It sounded good at the time.

To make a long story short (Too late!) we hadn’t been accepting orders for three days from several branches, but had been returning confirmations for them. It was somewhere around 3 a.m. when I finally thought I knew exactly how many orders we had dropped, though I hadn’t found the actual bug in the code yet. I created a spreadsheet with the list of affected orders. At one point I used Excel’s drag-to-copy feature to fill a range of cells with the branch number for a set of orders.

Did you know Excel will automatically increment a number if you drag to copy? Yes, I know that too. At 11:30 in the morning today, I know it. At 3 a.m. that night, I apparently didn’t. So I sent it to the client with non-existent branch numbers that I didn’t double-check. “Oops” apparently doesn’t quite cover it.

The reveal

The next morning on a conference call with the client, my boss, his boss, and several other people, we were going over the spreadsheet when someone noticed the problem. To me, it seemed obvious that it was a simple cut-and-paste error on the spreadsheet. But someone — a co-worker, believe it or not — decided to ask, “Are you sure? Because I don’t see those other two branches on here either.” After dumbly admitting that I didn’t know anything about any other two branches, I ended the call so I could go figure out what was happening.

Now I had apparently demonstrated that I didn’t actually know what was wrong, that I had no idea of the scope of it, and that I was trying to cover it up. Yay me. We called in the lead (whose vacation was at home doing renovations) and started going through the code. I finally found the bug, and it produced exactly the list of errors I had sent out early that morning, except for the cut-and-paste mistake. The “other two branches” turned out to be from the previous night’s email, where I had specifically said those branches were not affected by the problem.

Within two hours, we had the code fixed and all the orders recovered. So everyone’s happy, right? If you think so, then you haven’t yet learned the lessons I did that day.

  1. No matter how urgently someone says they need an answer, the wrong answer won’t help.
  2. If it looks like the wrong answer, it might as well be the wrong answer. This doesn’t mean counter-intuitive answers can’t be right. It means that presentation and the ability to support your conclusion count.
  3. If you didn’t create the problem, always give the person who did the first chance to fix it.
  4. If someone knows more about a topic than you do, have them check your work.
  5. Don’t make important decisions on too little sleep.
  6. Before making a presentation to a client, review the materials with your co-workers.
  7. Don’t make important changes when key people are unavailable.

Looking at that list, I realize I already knew several of those lessons. So why did it take that incident to “learn” them? Because there’s a difference between knowing something, and believing it.