Friday, April 3, 2015

Toolkits for the Mind

When the Japanese computer scientist Yukihiro Matsumoto decided to create Ruby, a programming language that has helped build Twitter, Hulu, and much of the modern Web, he was chasing an idea from a 1966 science fiction novel called Babel-17 by Samuel R. Delany. At the book’s heart is an invented language of the same name that upgrades the minds of all those who speak it. “Babel-17 is such an exact analytical language, it almost assures you technical mastery of any situation you look at,” the protagonist says at one point. With Ruby, Matsumoto wanted the same thing: to reprogram and improve the way programmers think.

It sounds grandiose, but Matsumoto’s isn’t a fringe view. Software developers as a species tend to be convinced that programming languages have a grip on the mind strong enough to change the way you approach problems—even to change which problems you think to solve. It’s how they size up companies, products, their peers: “What language do you use?”

That can help outsiders understand the software companies that have become so powerful and valuable, and the products and services that infuse our lives. A decision that seems like the most inside kind of inside baseball—whether someone builds a new thing using, say, Ruby or PHP or C—can suddenly affect us all. If you want to know why Facebook looks and works the way it does and what kinds of things it can do for and to us next, you need to know something about PHP, the programming language Mark Zuckerberg built it with.

Among programmers, PHP is perhaps the least respected of all programming languages. A now-canonical blog post on its flaws described it as “a fractal of bad design,” and those who willingly use it are seen as amateurs. “There’s this myth of the brilliant engineering that went into Facebook,” says Jeff Atwood, co-creator of the popular programming question-and-answer site Stack Overflow. “But they were building PHP code in Windows XP. They were hackers in almost the derogatory sense of the word.” In the space of 10 minutes, Atwood called PHP “a shambling monster,” “a pandemic,” and a haunted house whose residents have come to love the ghosts.

Things reviewed

Babel-17, by Samuel R. Delany (1966)
Real World OCaml, by Yaron Minsky et al. (O’Reilly, 2013)
PHP
Hack
Scala

Most successful programming languages have an overall philosophy or set of guiding principles that organize their vocabulary and grammar—the set of possible instructions they make available to the programmer—into a logical whole. PHP doesn’t. Its creator, Rasmus Lerdorf, freely admits he just cobbled it together. “I don’t know how to stop it,” he said in a 2003 interview. “I have absolutely no idea how to write a programming language—I just kept adding the next logical step along the way.”

Programmers’ favorite example is a PHP function called “mysql_escape_string,” which rids a query of malicious input before sending it off to a database. (For an example of a malicious input, think of a form on a website that asks for your e-mail address; a hacker can enter code in that slot to force the site to cough up passwords.) When a bug was discovered in the function, a new version was added, called “mysql_real_escape_string,” but the original was not replaced. The result is a bit like having two similar-looking buttons right next to each other in an airline cockpit: one that puts the landing gear down and one that puts it down safely. It’s not just an affront to common sense—it’s a recipe for disaster.
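To make the e-mail form scenario concrete, here is a minimal sketch of that kind of malicious input. It is written in Python against an in-memory SQLite database rather than in PHP against MySQL, and it guards the query by binding the input as a parameter instead of escaping it, which is a different technique from the two PHP functions named above. The table, the sample rows, and the function names are all invented for illustration.

```python
import sqlite3

# Hypothetical in-memory table standing in for a site's user database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, password TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice@example.com", "hunter2"), ("bob@example.com", "swordfish")],
)

def lookup_unsafe(email):
    # Pastes the visitor's input straight into the SQL text, so any quote
    # characters in it can change the meaning of the query.
    query = "SELECT email, password FROM users WHERE email = '" + email + "'"
    return conn.execute(query).fetchall()

def lookup_safe(email):
    # Passes the input as a bound parameter, so the database treats it
    # purely as a value to compare, never as part of the SQL itself.
    return conn.execute(
        "SELECT email, password FROM users WHERE email = ?", (email,)
    ).fetchall()

# What an attacker might type into the "e-mail address" slot.
malicious = "nobody' OR '1'='1"

unsafe_rows = lookup_unsafe(malicious)  # the OR clause matches every row
safe_rows = lookup_safe(malicious)      # no user has this literal e-mail
```

In the unsafe version the attacker's quote marks close the string early and smuggle in a condition that is always true, so every stored password comes back. In the safe version the same text is just an odd-looking e-mail address that matches nobody.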

Yet despite the widespread contempt for PHP, much of the Web was built on its back. PHP powers 39 percent of all domains, by one estimate. Facebook, Wikipedia, and the leading publishing platform WordPress are all PHP projects. That’s because PHP, for all its flaws, is perfect for getting started. The name originally stood for “personal home page.” It made it easy to add dynamic content like the date or a user’s name to static HTML pages. PHP allowed the leap from tinkering with a website to writing a Web application to be so small as to be imperceptible. You didn’t need to be a pro.

PHP’s get-going-ness was crucial to the success of Wikipedia, says Ori Livneh, a principal software engineer at the Wikimedia Foundation, which operates the project. “I’ve always loathed PHP,” he tells me. The project suffers from large-scale design flaws as a result of its reliance on the language. (They are partly why the foundation didn’t make Wikipedia pages available in a version adapted for mobile devices until 2008, and why the site didn’t get a user-friendly editing interface until 2013.) But PHP allowed people who weren’t—or were barely—software engineers to contribute new features. It’s how Wikipedia entries came to display hieroglyphics on Egyptology pages, for instance, and handle sheet music.

The programming language PHP created and sustains Facebook’s move-fast, hacker-oriented corporate culture.

You wouldn’t have built Google in PHP, because Google, to become Google, needed to do exactly one thing very well—it needed search to be spare and fast and meticulously well engineered. It was made with more refined and powerful languages, such as Java and C++. Facebook, by contrast, is a bazaar of small experiments, a smorgasbord of buttons, feeds, and gizmos trying to capture your attention. PHP is made for making—for cooking up features quickly.

You can almost imagine Zuckerberg in his Harvard dorm room on the fateful day that Facebook was born, doing the least he could to get his site online. The Web moves so fast, and users are so fickle, that the only way you’ll ever be able to capture the moment is by being first. It didn’t matter if he made a big ball of mud, or a plate of spaghetti, or a horrible hose cabinet (to borrow from programmers’ rich lexicon for describing messy code). He got the thing done. People could use it. He wasn’t thinking about beautiful code; he was thinking about his friends logging in to “Thefacebook” to look at pictures of girls they knew.

Today Facebook is worth more than $200 billion and there are signs all over the walls at its offices: “Done is better than perfect”; “Move fast and break things.” These bold messages are supposed to keep employees in tune with the company’s “hacker” culture. But these are precisely PHP’s values. Moving fast and breaking things is in fact so much the essence of PHP that anyone who “speaks” the language indelibly thinks that way. You might say that the language itself created and sustains Facebook’s culture.

The secret weapon

If you wanted to find the exact opposite of PHP, a kind of natural experiment to show you what the other extreme looked like, you couldn’t do much better than the self-serious Lower Manhattan headquarters of the financial trading firm Jane Street Capital. The 400-person company claims to be responsible for roughly 2 percent of daily equity trading volume in the United States.

When I meet Yaron Minsky, Jane Street’s head of technology, he’s sitting at a desk with a working Enigma machine beside him, one of only a few dozen of the World War II code devices left in the world. I would think it the clear winner of the contest for Coolest Secret Weapon in the Room if it weren’t for the way he keeps talking about an obscure programming language called OCaml. Minsky, a computer science PhD, convinced his employer 10 years ago to rewrite the company’s entire trading system in OCaml. Before that, almost nobody used the language for actual work; it was developed at a French research institute by academics trying to improve a computer system that automatically proves mathematical theorems. But Minsky thought OCaml, which he had gotten to know in grad school, could replace the complex Excel spreadsheets that powered Jane Street’s trading systems.

OCaml’s big selling point is its “type system,” which is something like Microsoft Word’s grammar checker, except that instead of just putting a squiggly green line underneath code it thinks is wrong, it won’t let you run it. Programs written with a type system tend to be far more reliable than those written without one—useful when a program might trade $30 billion on a big day.

Minsky says that by catching bugs, OCaml’s type system allows Jane Street’s coders to focus on loftier problems. One wonders if they have internalized the system’s constant nagging over time, so that OCaml has become a kind of Newspeak that makes it impossible to think bad thoughts.

The catch is that for the type checker to do its job, the programmers have to add complex annotations to their code. It’s as if Word’s grammar checker required you to diagram all your sentences. Writing code with type constraints can be a nuisance, even demoralizing. To make it worse, OCaml, more than most other programming languages, traffics in a kind of deep abstract math far beyond most coders. The language’s rigor is like catnip to some people, though, giving Jane Street an unusual advantage in the tight hiring market for programmers. Software developers mostly join Facebook and Wikipedia in spite of PHP. Minsky says that OCaml—along with his book Real World OCaml—helps lure a steady supply of high-quality candidates. The attraction isn’t just the language but the kind of people who use it. Jane Street is a company where they play four-person chess in the break room. The culture of competitive intelligence and the use of a fancy programming language seem to go hand in hand.

Google appears to be trying to pull off a similar trick with Go, a high-performance programming language it developed. Intended to make the workings of the Web more elegant and efficient, it’s good for developing the kind of high-stakes software needed to run the collections of servers behind large Web services. It also acts as something like a dog whistle to coders interested in the new and the difficult.

Growing up

In late 2010, Facebook was having a crisis. PHP was not built for performance, but it was being asked to perform. The site was growing so fast it seemed that if something didn’t change fairly drastically, it would start falling over.

Switching languages altogether wasn’t an option. Facebook had millions of lines of PHP code, thousands of engineers expert in writing it, and more than half a billion users. Instead, a small team of senior engineers was assigned to a special project to invent a way for Facebook to keep functioning without giving up on its hacky mother tongue.

One part of the solution was to create a piece of software—a compiler—that would translate Facebook’s PHP code into much faster C++ code. The other was a feat of computer linguistic engineering that let Facebook’s programmers keep their PHP-ian culture but write more reliable code.

Startups can cleverly use the power of programming languages to manipulate their organizational psychology.

The rescue squad did it by inventing a dialect of PHP called Hack. Hack is PHP with an optional type system; that is, you can write plain old quick and dirty PHP—or, if you so choose, you can tie yourself to the mast, adding annotations to let the type system check the correctness of your code. That this type checker is written entirely in OCaml is no coincidence. Facebook wanted its coders to keep moving fast in the comfort of their native tongue, but it didn’t want them to have to break things as they did it. (Last year Zuckerberg announced a new engineering slogan: “Move fast with stable infra,” using the hacker shorthand for the infrastructure that keeps the site running.)
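The idea of an optional type system can be sketched by analogy. The example below is Python, not Hack, and it is only meant to show the shape of the bargain: the same function written quick and dirty, and again with annotations that a separate checking tool (mypy is one such tool for Python) could verify before the code ever runs. Python itself does not enforce these annotations at run time, and the function names here are invented.

```python
from typing import List, get_type_hints

def total_quick(prices):
    # No annotations: a type checker has nothing to verify here,
    # and any mistake shows up only when the code runs.
    return sum(prices)

def total_checked(prices: List[float]) -> float:
    # Annotated: an external checker can compare every caller against
    # this signature and complain before the program is run.
    return sum(prices)

# Both versions behave identically when they run.
quick_result = total_quick([1.0, 2.5])
checked_result = total_checked([1.0, 2.5])

# The annotations are ordinary metadata that tools can read.
quick_hints = get_type_hints(total_quick)      # empty: nothing declared
checked_hints = get_type_hints(total_checked)  # parameter and return types
```

The two functions can sit side by side in the same codebase, which is the point of making the types optional: a team can tighten the code file by file instead of rewriting everything at once.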

Around the same time, Twitter underwent a similar transformation. The service was originally built with Ruby on Rails—a popular Web programming framework created using Matsumoto’s Ruby and inspired in large part by PHP. Then came the deluge of users. When someone with hundreds of thousands of followers tweeted, hundreds of thousands of other people’s timelines had to be immediately updated. Big tweets like that would frequently overwhelm the system and force engineers to take the site down to allow it to catch up. They did it so often that the “fail whale” on the company’s maintenance page became famous in its own right. Twitter stopped the bleeding by replacing large pieces of the service’s plumbing with a language called Scala. It should not be surprising that Scala, like OCaml, was developed by academics, has a powerful type system, and prizes correctness and performance even at the expense of the individual programmers’ freedom and delight in their craft.
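The arithmetic behind those overloads is easy to see in a toy model. The sketch below assumes the simplest possible design, in which each post is copied into every follower's timeline at the moment it is written. It is an illustration of why one popular account can generate so much work, not a description of how Twitter's system was actually built, and every name in it is invented.

```python
from collections import defaultdict

followers = defaultdict(set)   # author -> set of follower names
timelines = defaultdict(list)  # user -> list of (author, text), newest last

def follow(follower, author):
    followers[author].add(follower)

def post(author, text):
    # Copy the post into every follower's timeline and report how many
    # separate writes that one post required.
    writes = 0
    for user in followers[author]:
        timelines[user].append((author, text))
        writes += 1
    return writes

# A hypothetical account with 1,000 followers and one with a single follower.
for i in range(1000):
    follow("fan%d" % i, "celebrity")
follow("fan0", "friend")

celebrity_writes = post("celebrity", "hello, world")
friend_writes = post("friend", "lunch?")
```

One short message from the popular account costs a thousand writes here, while the same message from the friend costs one. Scale the follower count up to hundreds of thousands and the burst of work from a single post becomes the kind of load the paragraph above describes.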
