Friday, February 25, 2011

Why do they call it the internets?

If you're like most people, you know to leave well enough alone and not ask prying questions like the one in the title. However, some people have a nagging demon in the back of their minds that wonders what exactly happens when they type in a Google search, send an email, or stream music on Pandora. Well, maybe you'll finally get your answer... it's magic!

Not actually. I'm going to try and de-magicify some part of the operation of the Internet, but the topic spans several college semesters, so I'm not going to get to everything in this particular post.


When you plug your computer into an ethernet cable or connect to a wireless network, it sends and receives several messages that announce its presence on the network and allow it to learn the location of certain key machines on the network. These messages are obviously sent as electrical or radio signals, but I'm not going to delve into the alchemy of electrical engineering. Once a computer is connected to a network, it can often see much of the traffic flowing on that network, especially on a shared medium like open WiFi. This is why you should be wary of sending sensitive information over public WiFi networks. Chances are that your bank encrypts its communication, but many other websites do not. You're connected to the network, and you can send and receive messages on that network (this network, by the way, may span one wireless access point or a whole institution). Now what?

Connecting to your local network is useful in some cases, but Google, Facebook, Yahoo, and Pandora aren't on your local network. This is where the Internet or "Inter-network" comes in. In order to communicate between insular networks (all of which may be speaking a different language), a new standard is needed--the Internet Protocol. You've heard of an IP address. Well, this is where the name comes from. Whenever you join a network, you are assigned an IP address that uniquely identifies your computer in the entire world of computers on the Internet. All the other computers on the Internet have their own unique IP addresses which act like mailing addresses.

The Internet is a "packet switched" network. This means that data flies around in quantized amounts. Contrast this with a telephone network where the signal flows back and forth continuously. The quanta of data are called packets, and they carry all of the traffic on the Internet: email, web pages, streaming music, and downloads.
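To make "quantized amounts" concrete, here is a rough sketch in Python of chopping a message into packets and stitching it back together. The `Packet` fields, the 8-byte chunk size, and the example addresses are all made up for illustration; real IP packets carry many more header fields and much larger payloads.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    src: str        # sender's IP address
    dst: str        # recipient's IP address
    seq: int        # position of this chunk in the original message
    payload: bytes  # the slice of data this packet carries


def packetize(message: bytes, src: str, dst: str, size: int = 8) -> list[Packet]:
    """Cut a message into chunks of at most `size` bytes, one per packet."""
    return [
        Packet(src, dst, seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(packets: list[Packet]) -> bytes:
    """Put the chunks back in order, even if they arrived shuffled."""
    return b"".join(p.payload for p in sorted(packets, key=lambda p: p.seq))


message = b"why do they call it the internets?"
packets = packetize(message, "192.0.2.10", "198.51.100.7")
print(len(packets))               # how many packets the message needed
print(reassemble(packets[::-1]))  # the original message, rebuilt from reversed order
```

The sequence number is what lets the receiver rebuild the message no matter what order the pieces show up in.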

Circuit switched telephone network

When you type in a Google search, your computer sends and receives a minimum of around 20 packets. Each of these packets carries with it the IP address of both the sender and the recipient. This information allows all of the intermediate computers on the information's path to forward the packets appropriately. These intermediate computers, which you've heard called routers, maintain an address book that tells them where to send each packet based on its destination.
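The "address book" idea can be sketched in a few lines of Python. The table entries and next-hop names below are invented for illustration, and a real router's forwarding logic is far more involved, but the core rule shown here (pick the most specific matching entry for the destination) is how IP forwarding tables are generally consulted.

```python
import ipaddress

# Hypothetical forwarding table: destination network -> next hop.
ROUTES = {
    "0.0.0.0/0": "isp-gateway",       # default route: anything not matched below
    "198.51.100.0/24": "router-b",
    "198.51.100.128/25": "router-c",  # a more specific slice of the network above
}


def next_hop(destination: str) -> str:
    """Return the next hop for the most specific route containing the address."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(net), hop)
        for net, hop in ROUTES.items()
        if addr in ipaddress.ip_network(net)
    ]
    # Longest prefix wins: /25 beats /24, which beats the /0 default.
    network, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop


print(next_hop("198.51.100.200"))  # falls inside the /25
print(next_hop("198.51.100.7"))    # falls only inside the /24
print(next_hop("203.0.113.9"))     # only the default route matches
```

Each router along the path repeats this lookup using nothing but the destination address stamped on the packet.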

All of this communication via sending and receiving packets over the Internet allows your computer to access content stored on remote computers--servers. That's the real purpose of the Internet, anyway.

When you pull up a browser you, like me, probably don't consider the incredible complexity hiding behind the YouTube video you're watching or the New York Times article you're reading. I tried my best to rescue your browsing from the realm of magic and reclaim it for the world of technology. I don't know if I've demystified things or muddled them. Thankfully you can blissfully disregard everything I've just said and continue your merry surfing.

Wednesday, February 23, 2011

The Mechanical Turk: Using Humans to do Computers' Work

People commonly believe that May 11, 1997 is the day when computers officially conquered the human race with respect to the game of chess. That was the day that the IBM computer Deep Blue bested world champion Garry Kasparov in a six-game battle of wits. What most people don't realize is that in the late 18th century, a chess-playing robot routinely defeated human opponents. The robot, known as the Mechanical Turk, toured Europe for decades. Crowds raved over the chess-playing robot... that is, until it was exposed as a complete fraud. The contraption was an elaborate hoax operated by a human hiding inside. Though the Mechanical Turk did not accomplish its original chess-playing purpose, it ignominiously birthed the practice of using humans to do work too hard for computers of the day.

Mechanical Turk
Some may be more familiar with the term Mechanical Turk as it refers to Amazon's crowdsourcing Internet marketplace. Mechanical Turk in this context is a service with which clients can accomplish tasks that are simple for humans but difficult for computers (termed Human Intelligence Tasks, or HITs). For example, humans can easily recognize whether or not a picture contains a person, but computers have much more difficulty. Requesters can submit tasks to Mechanical Turk, and workers are paid a low per-unit bounty for completing the tasks. Examples of HITs on the site are podcast transcribing and image tagging. While paying humans small amounts for performing simple tasks may scale in some domains, in other applications the model is unsustainable. One example where the process does not work is book digitization.

Google embarked on the massively ambitious goal of digitizing all the world's books in 2004. When approaching such a monumental task, developing new, specialized tools is often a good investment. And invest Google did. They poured resources into optical character recognition (the software that converts images into text), scanning techniques, and an interface by which the books could be accessed. Google's scanners can "read" 1000 pages an hour. No reasonable number of human workers could match that pace. As of 2010, the company estimates that they have digitized over 10% of the world's books.


Allow me one last piece of background. I swear it's worth it. In your travels over the Internets, you have certainly encountered a form requesting that you type the distorted letters that you see in a picture. These simple tests, called captchas, are designed to be easy for humans and difficult for computers. Using captchas prevents malicious software from signing up for millions of free email accounts (the ultimate triumph... just think about how much email you could send). Captchas are tests that computers can generate and know the answer to but cannot solve themselves.

And... the punch line. Even after all of the investment and research devoted to optical character recognition, the software still stumbles in difficult cases such as smudged or distorted words. Enter reCAPTCHA. Acquired by Google in 2009, reCAPTCHA is a system that uses captchas to digitize the parts of books that are too difficult for computers to read. The system is genius in its simplicity. Instead of displaying an image of one word distorted by a computer, the system shows an image of two distorted words side by side--one of them a regular captcha and one a difficult-to-read word from the expanse of the Google Books project. Because the two types of images are indistinguishable, the user has to transcribe both carefully, unwittingly contributing to the digitization of the world's literature. The result of the captcha is transported back to the original source and inserted in the place of the smudged word in the digital version of the book.
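Here is a toy version of that two-word trick in Python. Every name in it (`submit`, `accepted_reading`, the `guesses` store) is hypothetical, and the rule of waiting for two matching guesses before accepting a reading is my own assumption for the sketch, not a description of how reCAPTCHA actually decides.

```python
from collections import Counter, defaultdict
from typing import Optional

# Hypothetical store of human guesses for each unreadable scanned word.
guesses = defaultdict(Counter)


def submit(control_answer: str, typed_control: str,
           unknown_id: str, typed_unknown: str) -> bool:
    """Grade the control word; record the unknown word only if the user passed."""
    passed = typed_control.strip().lower() == control_answer.lower()
    if passed:
        guesses[unknown_id][typed_unknown.strip().lower()] += 1
    return passed


def accepted_reading(unknown_id: str, min_votes: int = 2) -> Optional[str]:
    """Return the most common guess once enough people agree (assumed rule)."""
    if not guesses[unknown_id]:
        return None
    word, votes = guesses[unknown_id].most_common(1)[0]
    return word if votes >= min_votes else None


submit("upon", "upon", "scan-17", "morning")    # passes, guess recorded
submit("river", "rivet", "scan-17", "mourning")  # fails the control, guess discarded
submit("stone", "Stone", "scan-17", "morning")   # passes, second matching guess
print(accepted_reading("scan-17"))
```

The computer only ever grades the word it already knows; the human's answer for the other word comes along for free.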

Example of a captcha

reCAPTCHA is a runaway success. It digitizes over 200 million words per day. Most people are unaware that they are effortlessly accomplishing useful work while they go about their daily lives. I'm sure the conspiracy theorist in you is frantically searching for what other work you're being tricked into performing. Tell him or her to go back to contemplating the Kennedy assassination and the moon landing. That way, the innovator in you can start asking what other difficult problems could be solved with simple, elegant, powerful systems like reCAPTCHA.

Tuesday, February 22, 2011

Why “Bubble 2.0” isn’t going to burst


I'm excited to add a second guest blogger to the already bafflingly strong roster. Our intrepid guest blogger, AK47, has been gifted with Rain Man's intellect and Elvis Presley's charm. He can effortlessly squat thousands of pounds and conduct theoretical algorithmic research... at the same time. Long story short, you're going to want to listen to what he has to say.

- 44 Maagnum


There’s been some interesting speculation lately as to whether we’re in the midst of a second Internet bubble, one some have labeled “Bubble 2.0”. There seem to be a lot of naysayers questioning the $50 billion valuation of Facebook, or more recently the $10+ billion valuations of Twitter, Zynga, and Groupon. Some have claimed that these companies simply don’t have the cash flows to support this kind of valuation, that they aren’t on par with brick and mortar companies with similar market caps, and that greedy investors are simply repeating their mistakes of the late 1990s.

Internet bubble

But this “bubble” is of quite a different style than the first. To me, these current valuations seem much more reasonable than their counterparts from the late 90s. We’re not talking about Pets.com (which investors sank $300 million into for a whopping first-year revenue of $619,000); we’re talking about Internet behemoths like Facebook, which already has 500 million+ users, and which saw annual revenue of around $800 million in 2009. Internet gaming mogul Zynga saw $250 million in revenue in ‘09. The biggest difference between the dot-com bubble and what’s occurring right now is that these current companies are actually generating revenue. These companies are household names, and are providing services that tens or even hundreds of millions of people worldwide use on a daily basis. The unbelievable followership that these companies hold means they certainly aren’t the type of company that will simply vanish into thin air like many dot-coms did in the early 2000s.



I don’t mean to say that these valuations are spot-on--it’s probably next to impossible to predict how much Zynga is actually worth, or if Groupon can continue its current growth rate--but I will say confidently that these valuations won’t be off by the orders of magnitude that valuations were during the first dot-com bubble. There’s very little risk of Facebook simply going bankrupt overnight, or Twitter just closing up shop tomorrow. If their valuations are wrong following their IPOs, their prices will shift slowly, similar to how the stocks of most normal companies vary.


That said, it’s still important to be wary of the kind of hysteria that was the downfall of so many investors during the first dot-com bubble. The fact is that for every successful new tech company, there are many, many more failed ones. Though some of the few that survived the dot-com era are now thriving companies (see eBay, Amazon, etc.), there were far more failures.  A key characteristic of bubbles is this type of hysteria, and so far at least, we simply haven’t seen that. It’s certainly debatable whether Facebook is worth $50 billion, $25 billion, or $100 billion, but it’s probably not reasonable to value it with a figure in the millions. That’s the major difference between this “bubble” and the last.


When the dot-com bubble burst, the number of new websites being founded plummeted, but that didn’t mean Internet usage or interest plummeted. The Internet was still growing, and we’re reaching a time now where more people are more comfortable with the Internet than they’ve ever been in the past. Average web users continue to be more willing to open up their wallets and dish out their credit card number to spend money online, and more businesses continue to see the benefit and feasibility of advertising online. And as long as that trend continues, there will be room for companies like Facebook, Twitter, Zynga, Groupon and others to be worth their billions.

Saturday, February 19, 2011

From Beer Summit to Tech Summit: Obama Clinking Glasses with Tech Elite

President Obama has a long history of sitting down with warring parties to discuss matters in a civilized fashion. On July 30, 2009, President Obama sat down with Sgt. James Crowley of the Cambridge police department and Henry Louis Gates of Harvard to smooth over an alleged case of racial profiling that resulted in Gates's arrest. You can read plenty about that emotionally charged case all over the internet because it turned into a media circus. Encouragingly, Obama has moved beyond the role of babysitting adults and teaching them how to treat each other with respect. This week he sat down to dinner at the home of legendary venture capitalist John Doerr (Netscape, Google, Amazon, enough said) with leaders of seemingly every important tech behemoth in the nation. Now that's a summit worth writing about.



The guest list reads like a TechCrunch headline reel. Mark Zuckerberg, Steve Jobs, Eric Schmidt, Carol Bartz (Yahoo!), John Chambers (Cisco), Dick Costolo (Twitter), Larry Ellison (Oracle), Reed Hastings (Netflix), and Art Levinson (Genentech) were in attendance. What could Obama possibly need to discuss with this technological cabal that necessitated bringing them all together under one roof?

Obama has spent a good portion of his time lately parrying criticism over the laggard economy. Add America's lingering paranoia over its struggle to remain the world's sole superpower (as evidenced by the Amy Chua fiasco), and you can see that he's had a lot on his plate. What's one industry that has been both a reliable engine for economic growth and an area where the US of A remains the undisputed leader? You guessed correctly if you guessed technology (software would have been extra credit).

"Promise to behave?"

The President's willingness to reach out to Silicon Valley, the home of some of the most vehement free market proponents, is an encouraging departure from partisan stereotypes. Combine this gesture with the comments he made during the State of the Union and the opinion piece he wrote for the Wall Street Journal, and Obama is in danger of becoming a pro-business Democrat. In his editorial, Obama writes, "America's free market has not only been the source of dazzling ideas and path-breaking products, it has also been the greatest force for prosperity the world has ever known... But throughout our history, one of the reasons the free market has worked is that we have sought the proper balance." He suggests that America's economy works because it encourages risk takers while providing the regulatory support for those entrepreneurs to succeed and reap the benefits.

I'm more than a little skeptical of anyone who "orders a government-wide review of the rules already on the books to remove outdated regulations" (Obama's words). Excuse me if I doubt that a simple pronouncement can fix 200 years of ossified bureaucracy. Talk is cheap, but I'm certainly willing to give him a chance. Acknowledging the centrality of innovative companies like Facebook, Google, Apple, and Twitter to the economic recovery is a great first step. Obama, here's a toast to you.

P.S. For some hilarious tech satire, check out the first comment here.

Friday, February 18, 2011

Watson Wrap-up: Why Humankind's Inferiority Complex is Unfounded

Ben Templeton wraps up the Robot Jeopardy Fiasco of 2011 with a refreshingly sober dose of anti-hysteria.

- 44 Maagnum

Well, days 2 and 3 were significantly more sensational than day 1, and drew much more of a reaction. I scanned some of the comments from the public on online articles from some major news outlets. Widely regarded as the lowest-quality thinking that can be found anywhere in the world, these comment threads provided me with a few points that seemed worth discussing.

Meatbag concedes to Watson

First, the huge margin of victory for Watson: A common reaction (online and live) was ‘of course Watson won, he gets the questions in text, and he can press the button so much faster!’

Both of these things are true, but also not the real point. The reality is that actually appearing on Jeopardy is basically a publicity stunt for IBM, to capture the imagination of a society that is increasingly focused on Farmville, and decreasingly on using technology to resolve problems and conflicts. But while IBM definitely cares about engaging the public (otherwise they wouldn’t have bothered with the whole thing at all), it’s also concerned with demonstrating a huge scientific breakthrough to the technological community. To that end, the proof that Watson can answer questions posed in Natural Language is that it produces the correct response in 3 seconds, not the $70,000 that it accumulates (which is, incidentally, about the price of 2 of the 90 Power 750 servers, with 2,880 POWER7 processor cores between them, that make up the supercomputer, available for purchase on IBM’s website). And while the public is largely concerned with the perception of the supremacy of the computer, the truth is that Jeopardy is, in and of itself, trivia.

Worthwhile Computing Challenge

But while people seemed quite passionate about Watson’s victory over Mr. Jennings and Mr. Rutter, they seemed thoroughly unimpressed with the technological achievement. So I’m going to talk, very briefly, about the challenge that Watson faces, and the significance of its level of success.

Computers only speak languages which are unambiguous. In other words, when a computer programmer writes a program, the computer can only understand it if there is exactly one possible interpretation. An example of such a sentence in English might be: “If x is equal to one, then set y equal to two.” This sentence is very straightforward, and essentially only has one interpretation, under the definitions of all the involved words.
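For comparison, here is that same sentence written in Python, wrapped in a small function (the name `apply_rule` is just something I picked for this example). There is exactly one way for the computer to read it.

```python
def apply_rule(x: int, y: int) -> int:
    """'If x is equal to one, then set y equal to two.'"""
    if x == 1:
        y = 2
    return y


print(apply_rule(1, 0))  # x is one, so y becomes 2
print(apply_rule(5, 0))  # x is not one, so y is left alone
```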

The English language (and every human language, although to interestingly varying degrees) is not a completely unambiguous language. Humans generally (although not always) can pick the correct meaning of a sentence through experience with the language, accumulated over several years. Watson lacks this experience.

Consider the following sentence: “I’m out of the building.” This could take several different meanings, all of which are logical to varying degrees. The most obvious one is that the speaker is not located in the building. Another interpretation that would make sense but for the definition of “building” is that the speaker is using “out” as if he or she were a waiter telling a customer that “we’re out of the salmon tonight.” There are plenty more, as each word has lots of definitions, and many of them work from a strictly grammatical sense.

The fact that people don’t keep a stock of buildings is what provides the computational difficulty of Natural Language Processing (or NLP, not to be confused with Neuro-Linguistic Programming, a controversial psycho-therapeutic technique that shares the same acronym). Although everybody knows that one can’t be “out of the building” in that sense, there is absolutely no indication of that in the definition of any of the words. What IBM accomplished over 4 years in Watson was getting a computer to figure out which interpretation to use with remarkable accuracy. It does this by searching through massive amounts of data and performing statistical analyses, connecting words to their contexts, seeing which words go with which other words. In this way, Watson is able to figure out which interpretations to use and what the proper context is, without the human experience of learning language.
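To give a feel for "seeing which words go with which other words," here is a deliberately tiny sketch in Python. It is not Watson's method, which the post describes only in broad strokes; the example sentences, the two sense labels, and the scoring rule (count shared words) are all invented for illustration.

```python
from collections import Counter, defaultdict

# Made-up labeled examples standing in for the massive text collections described above.
EXAMPLES = [
    ("we are out of the salmon tonight", "none left in stock"),
    ("the kitchen is out of the soup", "none left in stock"),
    ("sorry we are out of the bread", "none left in stock"),
    ("she walked out of the building", "not inside a place"),
    ("everyone ran out of the building", "not inside a place"),
    ("he stepped out of the office", "not inside a place"),
]

# For each sense, count how many example sentences contain each word.
word_counts = defaultdict(Counter)
for sentence, sense in EXAMPLES:
    word_counts[sense].update(set(sentence.split()))


def best_sense(sentence: str) -> str:
    """Pick the sense whose examples share the most words with the sentence."""
    words = set(sentence.lower().split())
    scores = {
        sense: sum(counts[w] for w in words)
        for sense, counts in word_counts.items()
    }
    return max(scores, key=scores.get)


print(best_sense("I am out of the building"))
print(best_sense("we are out of the soup"))
```

Nothing in the code knows what a building is. The word "building" simply shows up alongside the "not inside a place" examples and never alongside the restaurant ones, and that is enough to tip the score.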

Given that bit of insight into how Watson functions, let’s take a quick look into the significance of IBM’s achievement. There is a lot of talk of Skynet, and the robot revolution, etc. An important thing to recognize is that Watson is a breakthrough in Computational Linguistics, not really Artificial Intelligence in general.


A Few Steps Short of Skynet

At the same time, the process of connecting words into their contexts and figuring out what a sentence truly means is an incredibly important tool. The most commonly discussed application is diagnostic medicine. In this situation, symptoms are analogous to the ambiguous words of a sentence (in that symptoms, like words, can have a variety of different meanings depending on the context), and Watson could synthesize a diagnosis from the combination of a set of symptoms in their context.

Medicine is one of many cases where the ability to synthesize natural language data into a context could be game-changing. But ultimately, while extraordinary, the technological breakthrough of IBM is not something that is yet able to replace human intelligence. CNN sums it up well by positing that “Watson’s eventual commercial incarnation will be a tool, not a human replacement.”

There are lots of other concerns and issues, but the three above seemed the most important. While yes, the publicity stunt aspect of Watson was exactly that, the technology is undeniably impressive. While the physical hardware is fairly cool, Watson falls short of the most powerful computer in the world. The breakthrough of Watson is the algorithmic development in statistical analysis which allows Watson to be incredibly effective in one of the most challenging fields of computer science, Natural Language Processing.

Wednesday, February 16, 2011

Mark Zuckerberg: The Friend Everyone Loves to Hate Part 2

And..... exhale! The second part of my pro-Zuckerberg rant is here. The first part, posted earlier this week can be found here. If you haven't read it yet, do so now because I'm going to dive right in.

Facebook's origins are undeniably intertwined with a project that Mark Zuckerberg agreed to work on with his Harvard classmates, the Winklevoss twins. After the fact, the twins claim that they had the idea for the site. Zuckerberg holds that their site was more of a dating site than a social network. The truth undoubtedly lies somewhere in between. Regardless, the fact that the twins had an idea in the same realm as Facebook in no way implies that they would have built an online empire worth $70 billion--that creation is uniquely Zuckerberg's. The twins were fortunate to settle for $65 million though they claim vehemently that they sued out of principle. Technology journalist Kara Swisher echoes my sentiments: "They got paid $65 million for one, medium idea that they never could have made into anything." This from someone who uncharitably refers to Zuckerberg as the "toddler CEO."

The Social Network movie poster


Zuckerberg's treatment of his friend and cofounder Eduardo Saverin is another focal point for public acrimony. This conflict, which features prominently in the movie The Social Network, was almost certainly another one of his mistakes. However, like most of the movie, it is doubtful that the conflict played out in real life like it did on the silver screen. First of all, Saverin spent the pivotal summer when Facebook moved to Palo Alto working a financial internship in New York. This was a critical period in which Facebook built its competitive advantage, and Saverin was decidedly not as committed as other members of the team. After Saverin's departure from Facebook, he won a 5% stake in the company through legal action. This slice is now valued at $3.5 billion--ample compensation for his contribution.

Perhaps the most telling indicator that Zuckerberg isn't the callous monster genius the media shows us is the fact that profiles such as the Time Person of the Year piece portray him as a well-intentioned, well-balanced leader. The main difference between the Time piece and the vast majority of other coverage is the lack of agenda. The Time Person of the Year award goes to the person or idea that "for better or worse, has most influenced the events of the preceding year." With a declared purpose such as this one, the magazine has no requirement to portray the winner in anything but an accurate light.



The author of the extensive article spent more time close to Zuckerberg than perhaps any other journalist. The author describes his elevated carriage of his chin and writes, "In the movie, this played as him looking down his nose at you, but in real life it's more like he's standing on his tiptoes, trying to see over something." Clearly another example of some innocuous detail becoming the genesis of some negative character trait--in this case condescension--trumpeted in the media.

I have no access to Mark Zuckerberg other than what I can get through the media and film (I know it's surprising. The Carpe Daemon name doesn't carry that much clout... yet). However, by considering portrayals of the Facebook CEO and evaluating the motivation each has for pushing a particular agenda, I have come to believe that he's much more the awkward, genuine, ambitious entrepreneur and much less the conniving, greedy megalomaniac. Hopefully, after this post, I don't stand alone.

Tuesday, February 15, 2011

Watson vs. Mankind: Round 1

Ben Templeton's take on Round 1 of the knock down, drag out fight between artificially and genuinely intelligent beings.


- 44 Maagnum


Night 1 of Watson vs. Ken Jennings vs. Brad Rutter (anybody else think Brad looks a bit like Hans Gruber from Die Hard?) was exciting and entertaining. Watson started off well, jumping out to an early lead of $5200 while the next closest was $1000. As time went on, the humans pulled a bit back, as Watson made some amusing errors. The most laughs came when Mr. Jennings gave “the 20’s” as the incorrect response to a clue, and Watson then buzzed in and suggested “What is the 1920’s?” The night ended with Watson and Mr. Rutter tied at $5000, and Ken trailing distantly with $2000.


I’m going to do a little bit on each night, providing a little commentary, “Carpe Daemon”-style. Maybe some people will tune in tomorrow night who missed tonight, after seeing the frenzy of media coverage (does this blog post technically make me a part of the media?). And they certainly should, if you ask me. It’s difficult to overstate the significance of this achievement, for society and technology. (If you want a quick rundown of the importance of this shindig, check out this post from a few days ago.)


Brad Rutter

I’m going to talk a little bit about what was probably the moment that most people will remember, which was Watson’s repetition of Ken’s incorrect response. The “answer” came from a category about “the decade in which X, Y, and Z occurred”, and Ken’s question was “what is the 20’s?” As the TV audience, we got to see Watson’s confidence values displayed on the screen. Without confidence above a certain threshold, Watson will keep doing calculations until it gets high enough. And in this case, we saw that Watson was certain that the answer was the 1920’s. There was a moment of pause as we waited to see if it was going for the buzzer. And then it did, making a fool of itself to anybody who managed to anthropomorphize Watson sufficiently (which one author certainly did, writing an article that consisted entirely of criticism of the aesthetic presentation of the supercomputer).
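The buzz-or-stay-quiet decision can be sketched roughly like this in Python. The threshold value, the function name `decide`, and the confidence numbers are all assumptions on my part (the broadcast showed confidence bars, but I don't have the real figures), and the sketch only models the final decision, not the calculations that produce the confidences.

```python
from typing import Optional

BUZZ_THRESHOLD = 0.50  # assumed value; the real threshold is not given here


def decide(candidates: dict[str, float],
           threshold: float = BUZZ_THRESHOLD) -> Optional[str]:
    """Return the top candidate if its confidence clears the threshold, else None."""
    answer, confidence = max(candidates.items(), key=lambda item: item[1])
    return answer if confidence >= threshold else None


# Invented confidence values for illustration only.
print(decide({"the 1920s": 0.57, "the 1910s": 0.30, "the 1930s": 0.13}))
print(decide({"answer A": 0.14, "answer B": 0.11}))  # nothing is confident enough
```

Notice that nothing in this decision looks at what another contestant just said, which is exactly how a confident machine can repeat a wrong answer.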

Hans Gruber

This moment was interesting and important for two reasons. First, I think it epitomized the difficulties of artificial intelligence. Watson’s capacity to err in a situation that is so trivial to human intelligence shows us how difficult the littlest things can be. In this case, perhaps the engineers could have included a microphone and some voice recognition technology. But the problem of turning sound into words and the symbolic logic that represents them is far from trivial. The point is that to a computer, nothing comes easy, even things so simple as to make the audience chuckle a bit.

The second important idea to take from that moment is that while Watson got the wrong answer, it got the same wrong answer as Ken Jennings, who won Jeopardy 74 times in a row. I personally think that it’s very cool and pretty impressive that through 4 years of hard work and piles of electronic logic, the engineers and programmers at IBM managed to produce a machine that thinks in the same way as one of the human minds that is most adept at synthesizing information into knowledge, that of Mr. Jennings. And that accomplishment is far from trivial.

Tune in tomorrow at 7:00 for the next installment in this historic saga, on ABC. Seriously.

Monday, February 14, 2011

Mark Zuckerberg: The Friend Everyone Loves to Hate Part 1

I won't patronize anyone by explaining that a movie titled The Social Network came out last year. It is a small independent film that played at some local venues and generated minimal buzz. The plot features the plodding rise of a software company responsible for the creation of a website with little purpose and no realistic future prospects--think Pets.com. The main character, Mark Zuckerberg, is a nice, if somewhat uninteresting, nerd without diabolical plans or egomania--a character assessment that stirred insignificant amounts of controversy and failed to set off a media firestorm.

Just let me know if I get carried away with myself.



Of course I'm actually talking about the blockbuster that became one of the most controversial films of the year, won four Golden Globes, and resulted in Zuckerberg's appearance alongside Jesse Eisenberg and Andy Samberg on SNL.

The Social Network portrays Zuckerberg as a lonely, bitter egomaniac obsessed with popularity and power. Perhaps the movie's most pointed example is his ouster of his friend and cofounder Eduardo Saverin. In a dramatic scene, Saverin storms into the Facebook offices and berates Zuckerberg for betraying him, throws his computer to the ground, and threatens him. Zuckerberg's relationship with his Harvard classmates, the Winklevoss twins, is perhaps the subject of even more controversy. In 2004, the twins approached Zuckerberg about helping them develop an idea that they had been working on for a year. The movie alleges that Zuckerberg stole this idea and turned it into Facebook. These portrayals, combined with mania over privacy issues in the media, conspire to produce a generally negative public opinion of the Facebook CEO. I'd like to dispel some of these misconceptions.

Portrayals of Zuckerberg such as The Social Network and this one on 60 Minutes are regrettably sensational and push a clear agenda. We don't have to look very far to find the incentive for sensationalizing the admittedly questionable origins of Facebook. The Social Network was created to sell tickets--not disseminate the truth. The movie makes no apology for glaring factual inaccuracies. For example, the movie implies that Zuckerberg's breakup with his girlfriend was a major motivation for his creation of the site when in reality he has been dating his current girlfriend since before he created Facebook. The fact that the creators of the blockbuster made their movie without regard for Zuckerberg's character is unfortunate.

Winklevi


The 60 Minutes special linked above is one of the purest examples of the media's baffling Zuckerberg witch hunt. Throughout the interview, Lesley Stahl, the interviewer, pelts Zuckerberg with banal privacy questions and awaits his response with an annoyingly self-satisfied smirk. She rests her case on tenuous gotchas such as, "He vowed to never see the movie. On opening night, he changed his mind." Oh, Zuckerberg, you just can't muzzle your insatiable vanity, can you?

Watch the interview for yourself, but in my opinion Zuckerberg shows his sincerity by admitting mistakes and insisting that Facebook errs because of inexperience and enthusiasm rather than malice and greed. The interview reveals another reason the public doesn't trust Zuckerberg--he's awkward. He has trouble conveying his arguments in layman's terms, but since when has this crime carried the sentence of public character defamation?

Plenty of questions remain--especially concerning the murky origins of the company. I'll address these in Part 2 of my pro-Zuckerberg rant later this week.

Update: Part 2 is posted here.

Friday, February 11, 2011

Text editor? You mean, like Notepad?

By now it's no secret that I spend a great deal of my time programming computers. The problem is this: when I tell people what I'm working on, I see eyes glaze over and my heart sinks because I know that I've spooked the Luddite within. I want to be clear on this--I pass no judgement when this happens. I readily acknowledge that most of computer science is unspeakably arcane for most people. I haven't been able to figure out why I find it so interesting, but I can easily understand why so many others find it utterly boring. I want whoever I'm talking to to "get" it only because I think it's so great, not because I think they should be interested or "smart" enough to get it.

I mean no condescension. I could benefit from someone holding my hand and walking me through plenty of topics outside my area of expertise. To this end, I am embarking on the task of demystifying just what we geeks mean when we say "computer programming"--a task I am almost certain is impossible. We're going to start slow. Subsequent posts will build on this foundation, and before you know it we'll all be experts on how programming is done, if not on actual programming itself.

A lightweight, flexible, powerful text editor is a programmer's best friend. Think of a text editor as a stripped-down version of Microsoft Word without the lingering odor of decaying software giant. Most are only capable of editing "plain text." This is the biggest difference between a word processor and a text editor. It also means no bold, no italics, no fonts--just the characters themselves. This is useful because programming languages don't care about any of these things. All that matters is the characters. Leaving out all of the fancy formatting controls at the top just makes the text editor easier to use and more lightweight.
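
To make "just the characters themselves" a little more concrete, here is a minimal Python sketch. It is my own illustration and not tied to any particular editor; the file name notes.txt and the helper plain_text_bytes are made up for the example. It saves one line of plain text and then reads back the raw bytes on disk. Assuming the line uses only basic English characters, the file should hold exactly one byte per character and nothing else.

```python
import os
import tempfile


def plain_text_bytes(text, filename="notes.txt"):
    """Save text as a plain-text file and return the raw bytes stored on disk.

    The helper name and the file name are hypothetical, chosen for illustration.
    """
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, filename)
        # newline="" tells Python not to translate line endings behind our back.
        with open(path, "w", encoding="utf-8", newline="") as f:
            f.write(text)
        # Read the file back in binary mode to see exactly what was stored.
        with open(path, "rb") as f:
            return f.read()


line = "print('hello')"
raw = plain_text_bytes(line)

# For a line made of basic English characters, these two counts match.
print(len(line), "characters,", len(raw), "bytes on disk")
# The bytes on disk are the characters and nothing more.
print(raw == line.encode("utf-8"))
```

A word processor saving the same line would typically produce a much larger file, since it also has to record fonts, styles, and page layout alongside the text.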

When I say a program is lightweight, I mostly mean that it doesn't do anything that you don't need it to do. It doesn't have a lot of features cluttering up the screen. As a desirable side effect, this means that the program loads super fast. Word can take anywhere from a couple of seconds to tens of seconds to load, while Vim's load time is almost imperceptible. This is important because programmers need to jump into and out of files often, and loading a word processor each time would be exceedingly annoying.

Deemphasizing formatting allows text editors to focus on what they do best--edit text. Good text editors like Vim and Emacs offer the user myriad keyboard shortcuts so that even complex editing operations can be completed quickly. Learning keyboard shortcuts might sound bothersome, but when you spend a ton of time using a piece of software, it becomes second nature.

Just as programming isn't for everyone, text editors aren't for everyone. Documents produced with text editors look like crap. They are monospaced and don't include any formatting. This problem is exacerbated by the fact that the only text editor with which most people are familiar is Notepad--not a perennial favorite.

Hopefully I took you one small step toward familiarity with what computer programming is all about. Keep an eye out as I sprinkle more indispensable (dispensable) nuggets of knowledge in with future posts.

Thursday, February 10, 2011

It's Elementary, Watson: The Robot Uprising Begins (Ken Jennings is First on the List)

First in a two-part series brought to you by Ben Templeton.


- 44 Maagnum


“We’re going to revolutionize industries at a level that has never been done before.” That quote came from the Senior Vice President of a little company called IBM. You might have heard of it. Forbes lists IBM as the 33rd largest company in the world.

So what fantastic product was Dr. John Kelly discussing? Perhaps some exciting new social media technology, all the rage in the tech industry today? Or some fancy multi-touch screen that can double as a coffee table (nope, Microsoft has that one covered (quasi-pun intended)).

No, IBM (admirably, in my opinion) stays away from the trendy topics in “pop technology”, the Facebooks and the Smartphones of the tech world. They stick to their strengths, and devote their billions of dollars to weighty computing problems on the scale of mainframes and supercomputers. Past game-changers have included System/360, RISC computer architecture, and the invention of DRAM, amongst countless others.

IBM’s new innovation which will, in their words, surpass all of these is the hardware and software necessary to perform one task: play Jeopardy. Yes, that’s right, the unprecedentedly revolutionary technology that IBM is debuting on Monday (7:00PM EST) is designed to play a game show. “Watson”, as the computer is named (after the company’s founder, and as a reference to the sidekick of the famous information synthesizer Sherlock Holmes) operates without the internet or any human interaction. It hears the clues as its opponents do and responds in a clearly computerized but acceptably non-creepy voice. It mechanically presses a button to buzz in, like all of the other contestants.



You (as well as IBM’s investors, I imagine) are probably curious as to why playing a game show is worth the mountains of money and manpower that were devoted to this project. In today’s post, I’ll discuss the significance of this breakthrough, and in the next one we’ll look into the technology that powers it.

The IBM engineers describe Jeopardy as “a playing field upon which we could do some science.” The significance of Jeopardy is the way in which the clues are presented. Computer scientists and AI experts call it the problem of “natural language”. The challenge is to understand hints that are posed in normal English. The style of Jeopardy’s clues is especially difficult, with its subtle puns, jokes, and categories. Consider the following clue, from the ironic (to a supercomputer, at least) category “Chicks Dig Me”: “This mystery author and her archeologist hubby dug in hopes of finding the lost Syrian city of Urkesh.” The difficulty posed by this clue is apparent. Watson has to parse the category and the clue to determine that the response is a female mystery author. Given the tangential connection between Urkesh and the female mystery author, running a quick check of the Encyclopedia Britannica entry on Urkesh is unlikely to yield especially useful results. In his human brain, a trivia-master like Ken Jennings has a neurological database of information. If Mr. Jennings knows of a female mystery author married to an archaeologist, that is almost certainly sufficient for him to answer the question. Watson doesn’t have the capacity to make this connection so easily. Despite those linguistic pitfalls, Watson buzzes in confidently and, with a hint of digitally syncopated sass, responds “Who is Agatha Christie?” in this bit of released video.



In the words of Dr. Kelly (Senior VP), “The rate of growth of information is surpassing our ability to understand it and extract knowledge from it.” The breakthrough of Watson is that it can synthesize true knowledge from a mass of information the way that the human mind can and (if it proves superior to the greatest Jeopardy champions) possibly even better. In this day and age of information overload (mostly due to the internet), a computer that can process excess information and provide insight is an unbelievably powerful tool.

This capacity for synthesis has incredibly far-reaching implications. For example, a fairly high percentage of medical errors come at the diagnostic stage, and many happen simply because the diagnosis is too slow. A computer like Watson could plausibly listen to a description of symptoms, analyze its database of information, and provide a diagnosis. Teamed with a human doctor, this system could avoid time-consuming consultations and provide a concrete statistical analysis of certainty rather than simply a doctor’s gut feeling.

The possibilities are endless. A computer with Watson's capacity to synthesize insight from so much information could detect the economic trends that lead to recession in a way that a human couldn’t, or (less glamorously) it could power the customer relations process.

So how does this technological marvel perform the magic of natural language processing? Tune in next time for a quick tour of the computing techniques used to make Watson a Jeopardy champion. Coming some time before Monday, to a blog which is stored on a server to which you may or may not be geographically proximal.

Tuesday, February 8, 2011

Retweeting Revolution: Wael Ghonim and the Egyptian Conflict

I wrote last week about the internet blackout in Egypt. Some responded that the government's action may have been defensible on the grounds that it prevented the organization of angry, violent mobs and protected innocent citizens. While I think that explanation has its flaws, the Egyptian government's subsequent detention of Google executive Wael Ghonim provides a much more clear-cut example of its suppression of dissent.



Wael Ghonim rose to prominence for organizing opposition to President Hosni Mubarak's incumbent Egyptian administration. His rise to star status began with the death of Khaled Said, allegedly due to police brutality, on June 6, 2010. In response to the death, Ghonim created an Arabic-language Facebook page whose title translates to "My Name is Khaled Said." Though this page was shut down by Facebook, another quickly sprang up in its place--"We are All Khaled Said." This was the page that, on January 15th, announced the January 25th Tahrir Square protest.

Ghonim attended the protests, which lasted for several days. He tweeted often throughout the 26th and 27th of January before the government shut down the Internet and cell networks. His tweets took on a tone that was both defiant and increasingly ominous.

From January 27th: "We want Facebook Twitter & SMS back. Blocking free speech is a crime."

Then: "Pray for Egypt. Very worried as it seems that government is planning a war crime tomorrow against people. We are all ready to die."

Wael Ghonim disappeared later that day.

The Wall Street Journal reported yesterday that Ghonim was released from government detention. Today he gave a speech to the protesters who have filled Tahrir Square for the past two weeks. His release has become a rallying point for the opposition, and Ghonim himself has become an unofficial spokesman for the movement. Ghonim's rise to fame isn't shocking in its conclusion but rather in its humble beginning.

Perhaps the single event most responsible for Ghonim's current trajectory is his creation of the Khaled Said Facebook page. Let that sink in--he started a Facebook page that so threatened the Egyptian government that it decided to detain an employee of one of the most powerful companies in the world. They must have known that the Western press would milk this story for days (it quickly became the most popular article on the Wall Street Journal's website and appeared in other news outlets). Using only Facebook, Twitter, and a BlackBerry, he so threatened a national government that they detained him at great cost to their credibility.

As might be expected from a Googler, Ghonim both used social media to great effect and had great faith in its power. On January 26th, he tweeted, "I said 1year ago that Internet will change the political scene in Egypt and some friends made fun of me :)" Nobody, least of all President Mubarak, seems to be making fun anymore.

I remember a time when MySpace, Facebook's precursor, was "just a music sharing site." I remember when Facebook was "just a place to show off pictures of the weekend." And I remember when Twitter was "just your Facebook status." These attitudes (some of which were mine) cannot persist in a world where governments are toppled by 140-character blog posts. Perhaps Ghonim, responding to this sentiment, put it most concisely at 5:23 PM on January 26th: "Jan25 proved you wrong. Revolution can be a Facebook event that is liked, shared & tweeted."

Monday, February 7, 2011

Have You Heard of Anonymous?

Pleased to bring you another guest post by blog-star Benjamin Templeton.

- 44 Maagnum




“Have you ever heard of Anonymous?” Sounds like a dumb question, not the title of an entry in a blog. But a group of so-called “hacktivists” (so-called by somebody who has clearly never been enlightened here at Carpe Daemon) who style themselves exactly that are gaining headlines and notoriety around the world.

In December, the Anonymous group launched a “Distributed Denial-of-Service” attack on several banks and other services (I won’t go into the technical details. Stephen Colbert handles that quite well in this 7-minute bit, which I highly recommend). This attack was termed “Operation Payback”, and was launched in response to the deactivation of financial accounts belonging to controversial Wikileaks founder Julian Assange by credit card companies after they received government pressure. The websites of MasterCard and Visa both came down on December 8th.


More recently, Anonymous attacked Egypt and various other countries in the Middle East, as the BBC reported. The message is clear: Anonymous is dedicated to the freedom of information, and anybody who messes with that will face the wrath of the Internet.


Mr. Assange’s dedication to transparency is clear and his sincerity admirable. Wikileaks exposes government secrets indiscriminately without clear bias. Moreover, Mr. Assange has never been secretive about his identity; from the time that Wikileaks became a high-profile issue, he was linked to it (which is extraordinary because, by the nature of his activity, almost every government probably hates him).


But is Anonymous really about the noble ideal of free speech, or is it something else? Is Visa likely to reverse its decision (a decision in which they very possibly had little or no real choice) regarding Mr. Assange because its website goes down for a few hours? And who exactly is harmed (or at least inconvenienced) by an attack like that? The answers to those questions make Anonymous seem more like a cyber bully than a democratic paragon. After all, it seems unlikely that the affected Visa and MasterCard customers were responsible for Assange’s legal troubles or even felt particularly strongly about his endeavors.


Anonymous is a group based around the message board “4chan”, a site founded on the premise of anonymity. Anybody can post anything without registering or sharing any information about themselves whatsoever. This xkcd comic captures a bit of the absurdity of a group premised on freedom of information being so secretive. Their website instructs members: “Consider protecting your name, face and identity for safety's sake.” Who decides what information falls under the exemption of “safety’s sake”?


As somebody who thinks that the Internet is a wonderful thing, I’m pained to see groups like Anonymous abuse their expertise. There is a stereotype of the non-conformist “hacker” (see Maagnum, February 4, 2011) embodied by Matthew Broderick’s character in WarGames, who could be construed as a little bit malicious, but is mostly just playing around. These characters don’t want to hurt people; they just believe that the Internet is a free place. However, this image is a bit of an illusion. Groups like Anonymous seem more focused on the power of being able to impose their will on the technological community (which, nowadays, includes almost all of us), and Free Speech is merely a convenient justification.


My point is: think twice about endorsing Internet Free Speech, or we may all be Denied of Service before too long by the cyber-megalomaniacs who claim to be protecting it.