Hacker News
Stories from February 25, 2007

Yahoo has been dying ever since Google took off.

Since the NYT demands your info:

THE World Wide Web is awash in digital video, but too often we can’t find the videos we want or browse for what we might like.

That’s a loss, because if we could search for Internet videos, they might become the content of a global television station, just as the Web’s hypertext, once it was organized and tamed by search, became the stuff of a universal library.

What we need, says Suranga Chandratillake, a co-founder of Blinkx, a start-up in San Francisco, is a remote control for the Web’s videos, a kind of electronic TV Guide. He’s got just the thing.

Videos have multiplied on social networks like YouTube and MySpace as well as on news and entertainment sites because of the emergence of video-sharing, user-generated video, free digital storage and broadband and Wi-Fi networks.

Today, owing to the proliferation of large video files, video accounts for more than 60 percent of the traffic on the Internet, according to CacheLogic, a company in Cambridge, England, that sells “media delivery systems” to Internet service providers. “I imagine that within two years it will be 98 percent,” says Hui Zhang, a computer scientist at Carnegie Mellon University in Pittsburgh.

But search engines — like Google — that were developed during the first, text-based era of the Web do a poor job of searching through this rising sea of video. That’s because they don’t search the videos themselves, but rather things associated with them, including the text of a Web page, the “metadata” that computers use to display or understand pages (like keywords or the semantic tags that describe different content), video-file suffixes (like .mpeg or .avi), or captions or subtitles.

None of these methods are very satisfactory. Many Internet videos have little or obscure text, and clips often have no or misleading metadata. Modern video players do not reveal video-file suffixes, and captions and subtitles imperfectly capture the spoken words in a video.

The difficulties of knowing which videos are where challenge the growth of Internet video. “If there are going to be hundreds of millions of hours of video content online,” Mr. Chandratillake said, “we need to have an efficient, scalable way to search through it.”

Mr. Chandratillake’s history is unusual for Silicon Valley. He was born in Sri Lanka in 1977 and divided his childhood among England and various countries in South Asia where his father, a professor of nuclear chemistry, worked. Then he studied distributed processing at King’s College, Cambridge, before becoming the chief technology officer of Autonomy, a company that specializes in something called “meaning-based computing.” This background possibly suggested an original approach to search when he founded Blinkx in 2004.

Mr. Chandratillake’s solution does not reject any existing video search methods, but supplements them by transcribing the words uttered in a video, and searching them. This is an achievement: effective speech recognition is a “nontrivial problem,” in the language of computer scientists.

Blinkx’s speech-recognition technology employs neural networks and machine learning using “hidden Markov models,” a method of statistical analysis in which the hidden characteristics of a thing are guessed from what is known.

Mr. Chandratillake calls this method “contextual search,” and he says it works so well because the meanings of the sounds of speech are unclear when considered by themselves. “Consider the phrase ‘recognize speech,’ ” he wrote in an e-mail message. “Its phonemes (‘rek-un-nise-peach’) are incredibly similar to those contained in the phrase ‘wreck a nice beach.’ Our systems use our knowledge of which words typically appear in which contexts and everything we know about a given clip to improve our ability to guess what each phoneme actually means.”
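
To make the "recognize speech" / "wreck a nice beach" point concrete, here is a minimal sketch of context-based disambiguation. It is not Blinkx's system and it is not a hidden Markov model: it is a toy bigram language model, and every count, name and function in it is made up for illustration. It assumes the acoustic side has already produced two equally plausible word sequences and only shows how the surrounding words can break the tie.

```python
import math

# Hypothetical bigram counts (previous word, next word). A real system would
# estimate these from a large corpus; these numbers are invented.
BIGRAM_COUNTS = {
    ("to", "recognize"): 40,
    ("recognize", "speech"): 30,
    ("can", "wreck"): 30,
    ("wreck", "a"): 20,
    ("a", "nice"): 50,
    ("nice", "beach"): 25,
}

# Hypothetical counts of how often each word appears as the previous word.
CONTEXT_COUNTS = {
    "to": 200, "recognize": 50, "can": 100, "wreck": 40, "a": 300, "nice": 80,
}

VOCAB_SIZE = 8  # to, can, recognize, speech, wreck, a, nice, beach


def avg_log_prob(words):
    """Average add-one-smoothed bigram log-probability per transition.

    Averaging keeps a four-word candidate from losing to a two-word one
    simply because it has more transitions to multiply together.
    """
    pairs = list(zip(words, words[1:]))
    total = 0.0
    for prev, cur in pairs:
        count = BIGRAM_COUNTS.get((prev, cur), 0)
        context = CONTEXT_COUNTS.get(prev, 0)
        total += math.log((count + 1) / (context + VOCAB_SIZE))
    return total / len(pairs)


def pick_transcription(context, candidates):
    """Return the candidate word sequence that best fits the preceding words."""
    return max(candidates, key=lambda cand: avg_log_prob(context + cand))


CANDIDATES = [
    ["recognize", "speech"],
    ["wreck", "a", "nice", "beach"],
]

if __name__ == "__main__":
    print(pick_transcription(["to"], CANDIDATES))
    print(pick_transcription(["can"], CANDIDATES))
```

With these invented counts, a preceding "to" favors the first reading and a preceding "can" favors the second, which is the kind of tie-breaking the quote describes.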

While neural networks and machine learning are not new, their application to video search is unique to Blinkx, and very clever.

How good is Blinkx search? When you visit blinkx.com, the first thing you see is the “video wall,” 25 small, shimmering tiles, each displaying a popular video clip, indexed that hour. (The wall provides a powerful sense of the collective mind of our popular culture.)

To experiment, I typed in the phrase “Chronic — WHAT — cles of Narnia,” the shout-out in the “Saturday Night Live” digital short called “Lazy Sunday,” a rap parody of two New York slackers. I wanted a phrase that a Web surfer would know more readily than the real title of a video. I also knew that “Lazy Sunday,” for all its cultish fame, would be hard to find: NBC Universal had freely released the rap parody on the Internet after broadcasting it in December 2005, but last month the company insisted that YouTube pull it.

Nonetheless, Blinkx found eight instances of “Lazy Sunday” when I tried it last week. By contrast, Google Video found none. Typing “Lazy Sunday” into the keyword search box on Google’s home page produced hundreds of results — but many were commentaries about the video, and many had nothing to do with “Saturday Night Live.”

Blinkx, which has raised more than $12.5 million from angel investors, earns money by licensing its technology to other sites. Although Blinkx has more than 80 such partners, including Microsoft, Playboy, Reuters and MTV, it rarely discloses the terms of its deals. Mr. Chandratillake said some licensees pay Blinkx directly while others share revenue and some do both. Blinkx has revealed the details of one deal: ITN, a British news broadcaster, will share the revenue generated by advertising inserted in its videos.

For all of Blinkx’s level coolness, there are at least three obvious obstacles to the company’s success.

First, because Google Video is not much good now doesn’t mean it won’t get better: after all, when Blinkx was founded, it first applied machine learning to searching the desktops of personal computers, a project that was abandoned when Google and Microsoft released their own desktop search bars.

Second, even if Google improbably fails to develop effective video search, the field will still be crowded: TruVeo, Flurl, ClipBlast and other start-ups are all at work on different subsets of the market.

Finally, Blinkx might not go far enough in searching the content of videos: the company searches their sounds, but not their images.

THIS last objection is the most serious.

“Because Blinkx emphasizes speech recognition, there is a great amount of multimedia content that they cannot address, like photographs,” said John R. Smith, a senior manager in the intelligent information management department of I.B.M.’s T. J. Watson Research Center in Hawthorne, N.Y. “But what’s worse, speech is not a very good indicator of what’s being shown in a video.”

Mr. Smith says he has been working on an experimental video search engine called Marvel, which also uses machine learning but organizes visual information as well as speech.

Still, at least for now, Blinkx leads video search: it searches more than seven million hours of video and is the largest repository of digital video on the Web.

“Search is our navigation, our interface to the Internet,” said John Battelle, chief of Federated Media Publishing and author of “The Search,” an account of the rise of Google. With Blinkx, we may have such an interface for digital video, and be a little closer to Mr. Chandratillake’s vision of a universal remote control.

Jason Pontin is the editor in chief and publisher of Technology Review, a magazine and Web site owned by M.I.T. E-mail: pontin@nytimes.com.


This problem is firefox specific. For whatever reason, IE does The Right Thing. Actually, the whole site just looks better in IE, so maybe this should be a request for better Firefox support :)
34.Wireless: India's Hot, China's Not (redherring.com)
3 points by python_kiss on Feb 25, 2007

50 percent of their traffic is for their mail service at this point, according to Alexa. If they ever get beaten on that, it'll be lights out for them.

See also The Bootstrapper's Bible by Seth Godin. Seth has a much better explanation of the benefits of bootstrapping than anything else I've seen. The eBook version is only three bucks on Amazon too.
37."Remember Me" Feature Would Be Nice
3 points by awt on Feb 25, 2007 | 2 comments
38.Startup School Wiki - Wiki inc videos of startup school 2006 and 2005 (infogami.com)
3 points by danw on Feb 25, 2007 | 1 comment

50 million pounds? That has to be a typo.

I find myself marking up comments of the same 2 or 3 users more often than others. They don't have ultra-high karma or anything; they just are interested in the same articles and discussions I am. It would be nice to learn more about them.

A lot of you would-be entrepreneurs should give this a read. There are many things to consider in the crucial first steps.

I pointed out in an earlier article (http://m4th.com/Articles/Article.php?Article-Title=Anatomy-of-a-Successful-Social-Network) that MySpace owes much of its success to the countless choices it offers to its users. Over the past couple of months, however, MySpace has turned greedy. Rupert Murdoch feels that online widgets are a zero-sum game; in other words, widget companies make profit at the expense of MySpace. This couldn't be further from the truth; the fact is that widgets complement MySpace by giving its users the choice to decorate their pages any way they want. By restricting access to these widgets, MySpace will not only frustrate the users but also generate unprecedented negative publicity. - Jawad Shuaib
43.The Battle for Mobile Search (businessweek.com)
2 points by python_kiss on Feb 25, 2007

All I know is that it works. I tried out a few terms and got what I had in mind every time.

They heavily emphasize speech recognition, I think. For what this is, it's very cool. The technology is there and the product works. I think this is going places.


As far as I can see, there are two tar pits that Digg and now Reddit are stuck in:

1. A lack of focus and quality in the content.
2. No troll guards.

1. Lack of focus and quality: In my experience, users frequent a site because it has quality content and they leave when the quality of the content declines. Digg and, more recently, Reddit are experiencing a loss of focus and quality and as a result are losing their initial users. Digg’s quality is so bad it is now pointless to read and, much to my chagrin, Reddit seems to be following suit. Reddit seems to be drowning in a rising tide of noobs. Apparently, there aren’t enough old users around to down-vote the crap posted by the noobal horde. From a quick read of comments, it seems many long-time users are angry and feel disenfranchised. It’s because of this that those users whose content made Digg and Reddit popular in the first place are now leaving those sites and taking their great ideas with them.

2. No troll guards: Nothing poisons an online community quicker than a few nasty trolls. Another one of the reasons that I’m pulling away from Reddit is because it is getting mean. Both the links that are posted and the article forums are being destroyed by trolls stomping around unchecked. I hope Reddit can fix this problem. If not, I’m going to stop spending my time there.

The impression that I get, Paul, is that your goal is to make this YC News a start-up news site and a community of potential founders; not simply another social news site. The only way that I can see to maintain quality content and to filter out the trolls is to institute some form of moderation. Straight democracy leads to anarchy; that’s why I think a news site needs to be a republic. I don’t think, by any stretch of the imagination, that Slashdot is perfect, but they do have a system where moderators are selected from heavy and moderate users on a rotating basis. The system filters out new and spam accounts and gives preference to high karma users. It seems to keep the trolls in check. It also encourages people to take more ownership and to participate in the community.
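
For what it's worth, the kind of rotating moderation described above could be sketched roughly like this. It is only an illustration of the idea (filter out new and low-karma accounts, then draw moderators with a preference for high karma), not Slashdot's actual algorithm. The User fields, the thresholds and the function names are all assumptions made up for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class User:
    name: str
    karma: int
    account_age_days: int
    visits_last_month: int


def eligible(user, min_age_days=30, min_karma=1, min_visits=5):
    """Hypothetical eligibility rule: no brand-new, low-karma or idle accounts."""
    return (
        user.account_age_days >= min_age_days
        and user.karma >= min_karma
        and user.visits_last_month >= min_visits
    )


def pick_moderators(users, how_many, rng):
    """Draw moderators for one rotation, without replacement, weighted by karma."""
    pool = [u for u in users if eligible(u)]
    chosen = []
    while pool and len(chosen) < how_many:
        weights = [u.karma for u in pool]  # all positive, since eligible() requires karma >= 1
        pick = rng.choices(pool, weights=weights, k=1)[0]
        chosen.append(pick)
        pool.remove(pick)
    return chosen


if __name__ == "__main__":
    users = [
        User("regular", karma=40, account_age_days=200, visits_last_month=25),
        User("lurker", karma=5, account_age_days=400, visits_last_month=8),
        User("newbie", karma=3, account_age_days=2, visits_last_month=10),
        User("troll", karma=-12, account_age_days=90, visits_last_month=30),
    ]
    # Each rotation would use a fresh draw; a seed is passed here only to make the demo repeatable.
    print([u.name for u in pick_moderators(users, 2, random.Random(0))])
```

Re-running the draw every so often gives the rotation, and because the draw is weighted rather than top-N, moderate users still get a turn.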

Slashdot’s FAQ explains their moderation system here: http://slashdot.org/faq/com-mod.shtml#cm520

There is also a brief discussion of their anti-troll rules here: http://slashdot.org/faq/com-mod.shtml#cm2000

Thanks for setting up the site. It scratches an itch that I’ve had for a while.


>[edit: handily, though, there is an option to edit the title of a submission]

well, there's one improvement over reddit...


People trying to delete their posts. Erasing the text == deleting the post.
48.How to handle security bugs in your code (acmqueue.com)
2 points by ashu on Feb 25, 2007

Couldn't they just put Adsense on the sidebar to monetise it?

option to open articles/comments in a new tab by default?

bookmarklet set (I often use my "reddit this" bookmarklet to get back to the comments page for an article I've clicked)


Of course it'd be ideal to create a company with a sustaining business model; there's no question about that.

Is that the only time a company should be formed? That may be akin to only purchasing shares in long-term growth companies. Sometimes it's in your best interest to just ride the short-term explosion and move on.

It's completely viable to get into something with only a foreseeable immediate market. After a certain point, perhaps the company would be better off taking advantage of the economies of scale. It isn't a problem to be true to yourself and realize early on that an acquisition is the best end-target for your new company.


I can see this resulting in mass abuse very soon.
53.Jubii To Launch 10GB Webmail and File Sharing (mashable.com)
2 points by Harj on Feb 25, 2007

Yeah, it's probably not so applicable to consumer facing internet companies since the approach really needs a business to buy into the idea (much more willing to preorder, a single sale could fund the entire development).

Although maybe that depends on what you are creating. I could see something like Hotmail or Skype being bootstrapped with this method, because those services are useful to both individuals and businesses. Back then, email from any PC with the internet or free international voice calls would have been things that many businesses would have paid for (and still would, if there weren't so many free alternatives now). Even Reddit got its NYT deal soon after starting which pushed it into the black, did it not? (not rhetorical - I don't know much about their history so tell me if I'm wrong!)

There is the problem of loss of focus though. If a company like Hotmail started in '95, realised it could make millions selling their product to companies and focused on that (sales, turnkey servers for easy installation), they would probably only get a few years of that income at most before they are dethroned by a company that just focused on making the best web based email possible. So it would be a bit of a local maximum. Is this more dangerous than the loss of focus of doing something totally unhelpful/unrelated to the main product to bring in some cash (contracting, searching for investors) while simultaneously working on the final product? I'd like to know what everyone thinks.

55.A quick place to list your startup and see others (webstartup.info)
2 points by ratlaw on Feb 25, 2007 | 1 comment

So I'm not the only young one pursuing a startup. Good to know.

Good luck.


It was bound to happen. All the known ones will probably have to pay some sort of a toll. I wonder if the community will care enough to fight back?
58.Yahoo! Announces Hosting of YUI libraries on their edge network (ajaxian.com)
2 points by bradn on Feb 25, 2007

Am debugging it now.

I did a search and didn't see this mentioned.

In any case, a very useful feature would be a way to track your comments in the different submissions and the stories that you voted up.

There are a lot of really great stories on here and sometimes I don't have the time to finish reading some. I'd like to be able to find the stories again quickly in my recent history.

