Wednesday, June 24, 2009

Startups as Functions

I used to worry about the efficiency of my code. I still do, but now that I'm working on a startup I have additional concerns. Now I have to worry about money as well.

I pay money to get web hosting for my site. I'm using AWS, which charges by the CPU-hour. For the smallest box, they charge $0.10/hr. Each of those boxes can support some number of users, and I make some amount of money off each user. Now I have an equation:

`"input money" - ($0.10 / "CPU-hr") * "CPU-hr" / "user" * "users" / "month" + "users" / "month" * "money" / "user" = "output money" / "month"`

Wait a minute. I have money on both sides of the equation. But the input money and the output money can be different. And hopefully, output money > input money. What would it look like if I took the output money and fed it into the other side?

Obviously at this point we don't have a true equation, because there's time involved and that isn't modeled here. What we have now is more like a recursive function: the output of one run through the function is the input of the next call.

If output money - expenses > input money, I have a successful company.
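Here's a rough sketch of that recursive function in Clojure. Only the $0.10/CPU-hr price comes from above; the names (run-month, params) and every other number are made up for illustration, not real figures from my site.

```clojure
;; Price of the smallest AWS box, per CPU-hour (the only real number here).
(def cpu-hr-price 0.10)

(defn run-month
  "One pass through the company: takes input money, returns output money."
  [input-money {:keys [users cpu-hrs-per-user revenue-per-user]}]
  (let [hosting (* cpu-hr-price cpu-hrs-per-user users) ; money spent on boxes
        revenue (* revenue-per-user users)]             ; money earned from users
    (+ (- input-money hosting) revenue)))

;; Hypothetical parameters, purely illustrative.
(def params {:users 1000 :cpu-hrs-per-user 0.5 :revenue-per-user 0.25})

;; Feed each month's output back in as the next month's input.
(def first-year (take 13 (iterate #(run-month % params) 1000.0)))

(defn successful?
  "True when a pass through the function returns more money than went in."
  [input-money params]
  (> (run-month input-money params) input-money))
```

Changing a value in params and re-running iterate is the cheap version of asking "what happens if I tweak this input?"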

There's another input to the function here, because you need input users (and you want to output happy users). If your users are happy, they'll return or recommend you to other users.

So what are the benefits of this mental model? If there are no benefits, there's no reason to bother with it. Well, I thought it was interesting. But more significantly, it turns company creation and analysis into an engineering problem, and we're good at those. It becomes easier to see the whole picture, so we can identify problems. There's also a large body of Engineering/CS/technical knowledge that can be applied to this model. Debugging, profiling, pipeline stalls, complexity analysis, all of these tools are easily applicable to this model.

Look at the function again. Each input is a parameter that can be tweaked. What happens when I improve the code to make users happier? What happens when I raise my advertising budget, attempting to acquire more users?

All successful companies have a similar function; it's just easier to see in a web startup because the cycle is simple. Retail stores have policies for what employees are supposed to do and when to re-order goods. Some stores have even automated the process.

There's one more input to the recursive function. Eric Ries has impressed on me the value of using A/B split testing and Google Analytics. What this means is that each time a user goes through the system, I collect data on their trip. I can use that data to analyze the system's performance and make decisions about how to make the system more profitable. I can improve the code to make the users happier, increase monetization per user, or reduce costs.

It turns out, like startups, people are functions too. If the company is a function, then founders are higher-order functions. Every day, we take the existing function (company) and improve it. And now, I've derived the Y Combinator, 4 years after Paul Graham.
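As a toy illustration of the "founders are higher-order functions" idea, here's a minimal sketch. The company and founder functions, and the margin and improvement numbers, are all hypothetical; a real company is obviously not a single multiplication.

```clojure
(defn company
  "Hypothetical company: returns a function from money in to money out."
  [margin]
  (fn [money] (* money margin)))

(defn founder
  "Higher-order function: takes a company function and returns an improved one."
  [improvement]
  (fn [company-fn]
    (fn [money] (* improvement (company-fn money)))))

;; One day of work: wrap the existing company in a slightly better one.
(def work-day (founder 1.01))

;; Thirty days of work is the same improvement applied thirty times.
(def company-after-a-month
  (nth (iterate work-day (company 1.0)) 30))
```

Calling (company-after-a-month 100.0) runs the money through all thirty layers of improvement.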

People reading this article are part of the loop too. Some percentage of readers will visit reasonr. They'll either use the site, or not, and like it, or not. All of that feedback is useful for making the company succeed, and all I had to do was write this blog post.

I was originally going to write an article about how my site isn't in a position to do A/B tests effectively yet, because I don't have enough traffic to make statistically sound judgments. This idea of "companies as functions" had been banging around in my head for a while, and the two ideas merged into a strange Douglas Hofstadter meta post, where talking about my idea results in more data to do A/B testing.
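To show why traffic matters, here's a sketch of a two-proportion z-test, one common way to check an A/B result. This is my choice of test for illustration, not something from Eric's material, and the visitor and conversion counts below are invented.

```clojure
(defn z-score
  "Two-proportion z-test statistic for variant B versus variant A.
   conv-* are conversion counts, n-* are visitor counts."
  [conv-a n-a conv-b n-b]
  (let [p-a    (/ (double conv-a) n-a)
        p-b    (/ (double conv-b) n-b)
        pooled (/ (double (+ conv-a conv-b)) (+ n-a n-b))
        se     (Math/sqrt (* pooled (- 1.0 pooled)
                             (+ (/ 1.0 n-a) (/ 1.0 n-b))))]
    (/ (- p-b p-a) se)))

(defn significant?
  "True when |z| exceeds 1.96, the usual two-sided 5% threshold."
  [z]
  (> (Math/abs (double z)) 1.96))

;; Same 10% vs 12% conversion rates, very different amounts of traffic.
(def small-site (z-score 10 100 12 100))         ; 100 visitors per variant
(def big-site   (z-score 1000 10000 1200 10000)) ; 10,000 visitors per variant
```

With these made-up numbers, the difference passes the threshold at 10,000 visitors per variant but not at 100, which is the position my site is in.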

Wednesday, June 17, 2009

Preventing tests (or, 5 Whys in action)

I was working on some Clojure for my site, Reasonr, today. I develop on my laptop, using Emacs, and I have a production box on AWS. I recently set up swank-clojure on the production web server, so I can SSH tunnel onto the box and get a REPL. Very cool.

There is one downside, however. I had tunneled into the box to check something out, and then forgot about it. I went back to developing, and when I was ready to test my changes, I ran (run-all-tests). Normally this would be fine, except my Slime buffer was still connected to the production box, and run-all-tests does more than run unit tests: it also inserts (fake) data into the database and makes sure it comes back out correctly. So now that fake data is in my production DB.


After deleting all the crap data out of the DB, I thought of Eric Ries' 5 Whys, and set about trying to make sure this never happens again.

Step 1: Don't allow tests in production

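;; with-ns lives in clojure.contrib.with-ns
(require '[clojure.contrib.with-ns :as with-ns])

;; Accepts (and ignores) any arguments, so it can stand in for test-ns and
;; test-var, which are normally called with an argument.
(defn no-tests-for-you [& _]
  (println "You're in production, dumbass. No tests for you!"))

;; Redefine the usual test entry points inside the test-is namespace.
(defn prevent-tests []
  (with-ns/with-ns 'clojure.contrib.test-is
    (def run-all-tests user/no-tests-for-you)
    (def test-ns user/no-tests-for-you)
    (def test-var user/no-tests-for-you)))

;; Only disable tests on the production box.
(when (= winston.env/environment :production)
  (prevent-tests))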

Here, I have a dummy function that reminds the user we're talking to the production box. I also have a function that redefines most of the ways to run Clojure tests. with-ns, from clojure.contrib.with-ns, evaluates its body in the specified namespace. Thankfully, (run-all-tests) is the most common way for me to start tests, and the remaining ways all take more effort to run, so this net will likely catch most of my goofs. I also have a var, winston.env/environment, which specifies whether we're in production or development. If we're in production, we obviously don't want to run tests.
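For anyone on a newer Clojure, where clojure.contrib.test-is became clojure.test, the same guard can be sketched with alter-var-root instead of with-ns. This is a swapped-in technique, not the code I'm running. The environment var below is a hypothetical stand-in for winston.env/environment, and the :tests-blocked return value is only there to make the effect easy to check.

```clojure
(require 'clojure.test)

;; Hypothetical stand-in for winston.env/environment.
(def environment :production)

(defn no-tests-for-you
  "Replacement test runner: ignores its arguments and refuses to run anything."
  [& _]
  (println "You're in production. No tests for you!")
  :tests-blocked)

(defn prevent-tests!
  "Swap the root value of the common clojure.test entry points."
  []
  (doseq [v [#'clojure.test/run-all-tests
             #'clojure.test/run-tests
             #'clojure.test/test-ns
             #'clojure.test/test-var]]
    (alter-var-root v (constantly no-tests-for-you))))

;; Only disable tests when we're on the production box.
(when (= environment :production)
  (prevent-tests!))
```

Like the original, this only catches calls that go through those vars, so it's a safety net rather than a guarantee.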

Step 2: Fix the prompt

A big part of my confusion was because I wasn't sure which box I was connected to. Let's fix that by modifying the REPL prompt to print the hostname. I already had code that started a REPL manually, rather than using the default one in clojure.main. All I had to do was supply a new definition for repl-prompt.

(def hostname (.trim (sh "hostname")))

(defn repl-prompt []
  (printf "%s:%s=> " winston.env/hostname *ns*))

(defn start-repl []
  (binding [clojure.main/repl-prompt winston.repl/repl-prompt]

I used the sh function to determine the hostname, and stored that in a variable. Then I modified the REPL prompt to print "hostname:namespace=>" rather than just "namespace=>". binding is Clojure's way to safely monkey patch. Inside the scope of binding, all calls to clojure.main/repl-prompt will be replaced with calls to winston.repl/repl-prompt.
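Another way to get the same prompt, sketched here as an alternative rather than what I actually run, is to pass a :prompt function straight to clojure.main/repl. In this version the hostname comes from java.net.InetAddress instead of shelling out, and the "unknown-host" fallback is my own assumption for boxes where the lookup fails.

```clojure
(require 'clojure.main)

;; Ask the JVM for the hostname; fall back to a placeholder if lookup fails.
(def hostname
  (try
    (.getHostName (java.net.InetAddress/getLocalHost))
    (catch java.net.UnknownHostException _ "unknown-host")))

(defn repl-prompt
  "Prints hostname:namespace=> instead of the default namespace=> prompt."
  []
  (printf "%s:%s=> " hostname (ns-name *ns*)))

(defn start-repl
  "Starts a console REPL that uses the custom prompt."
  []
  (clojure.main/repl :prompt repl-prompt))
```

Calling (start-repl) at the console should show something like mybox:user=> as the prompt, where mybox is whatever the machine reports as its hostname.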

Step 3: Fix Slime?

I sort of cheated on step 2. Everything I wrote works great, but it doesn't fix Slime. Apparently Slime hardcodes the definition of the prompt. You'll see the modified prompt when using the REPL at the console, but not via Slime. And I don't know of a good way to fix the Slime prompt. Anyone out there have ideas?