As use of the Internet has grown and changed, so has the ability of users to search for specific content or stories, photos and videos that relate to certain topics of interest. One of the companies trying to harness and expand the power of search is Cuil, which is developing Cpedia, an engine that promises less repetition, an encyclopedia-style summary for each search, results that integrate related topics, and input and recommendations from users’ social networks. Cuil vice president of business development and finance Seval Oz Ozveren talked about the company’s mission, the evolution of search and the creation of Cpedia, which is still in the “alpha” testing phase, with Knowledge at Wharton during the recent Future of Publishing conference in New York City.
An edited transcript of the conversation appears below.
Knowledge at Wharton: Tell me a little bit about Cuil, its mission, some of your current services and maybe some that are under development?
Seval Oz Ozveren: Cuil is a search engine that is differentiated from the other two search engines that crawl the worldwide web — the first being Google, the second being Microsoft — in that our mission is more about keeping the user on the [search] page. [Our goal is] enabling them to discover content — related content — and [creating a] visualization of content so that you can find things serendipitously that you didn’t necessarily know you were looking for.
The second differentiator with Cuil is that Cuil has been searching and mining the worldwide web for intersections and long-tail queries in that we think that the next generation of search ought to be about intersections and finding more specific topics that are related to each other, such as, “osteoporosis, hypertension [and] side effects.” That is also changing the way in which people use search because it is enabling the user to find more specific information about topics that they are querying.
Knowledge at Wharton: How is a contextual search different than if someone were to visit a traditional search engine, type in those three terms you just mentioned and get back a list of links? What’s the difference between that and a more contextual search?
Ozveren: When you type in right now “osteoporosis, hypertension [and] side effects” what Cuil gives you, or tries to give you, is content that you can find on the Web that is percolating to the top, but also related [content, which appears] on the right hand side [of the results page]…. [Contextual search] is categories that come up, maybe the actual nomenclature for the particular disease, maybe specific drugs that are attributed to that disease, maybe side effects in medications. The next generation of what we are doing is creating your social network inputs for that, so that if you are connected to Facebook, for example, you get your friends’ or colleagues’ or peers’ views on those topics or related topics. It may not pick up — someone might not have said something specifically using the word “hypertension,” but they may have said “high blood pressure.” So our ability to mine the deep down Web enables us to draw inferences between the pages that exist on similar concepts and understand that those similar concepts are related. And I think the third part of the contextual search is really peer recommendations, which comes from having integrated search results from your social networks.
Knowledge at Wharton: So contextual search kind of bridges the gap between thinking about a topic and not being able to type the right words into a search engine to get results related specifically to that topic?
Ozveren: Exactly. And [discovering that] “Wow. I didn’t know that there was a bad side effect [from] a particular drug when I took it for hypertension that might be [causing symptoms] like obesity and hair growth.” So a lot of information that you may not expect, but serendipitously find there related to your query.
Knowledge at Wharton: How do you think the question of users’ online identity creates opportunities for a search? How does it create limitations?
Ozveren: I think reputation isn’t going to exist anymore on the Web. I think we are talking about a level playing field pretty much. Every piece of information that is out there on you is susceptible to being recorded and Tweeted about and coming up [in a search]. But I do think that it is important that people claim identities on the Web going forward. Inasmuch as the Web tries to remain agnostic, you also need to try and claim who you are. And it really is a powerful tool for the individual because there is so much information out there. Your ability to [bring to the] surface information about yourself or your friends or your reputation or the work that you are doing is important to being able to control that channel.
Cpedia (I’ll segue into that because it is a very powerful tool) is [Cuil’s] next or third page of search … Now we are talking about content that is surfacing automatically through algorithms that generate Web-based content. So it is a summary engine. It is almost like a Wikipedia but the difference between Wikipedia and Cpedia is that [the results are] not user generated. It is allowing other points of reference, which Wikipedia doesn’t do. Also, right now if you go to the Web and you search for someone like me, I don’t have a Wikipedia page. There are only 200,000 active Wikipedia pages out there. That leaves a huge [opportunity to create profiles] for everyone else out there who you can’t specifically read something about. Let’s say you are going into a [meeting] and you want to read about someone that you are being interviewed by, or talking to. Unless you go to their LinkedIn profile or their Facebook profile, if you are in their network, there is very little information pertaining to them.
But this automatically generated summary [from] Cpedia enables you to find at a blink’s notice all the information out there on a particular person. If [results that come up are] not the right person, [Cpedia] tells you it is not the right person. It disambiguates between that person and people with similar names so that you can find the right person. And that’s also a problem with the Web right now, that there is so much information out there that people don’t really know, “Is that the same Seval Oz that lived in Wilmington, Delaware in 1986?” Well, Cpedia, because it has information access to all that data, tells you that it is. And, in fact, we are finding that enterprise search is a very interesting part of this because now people, including credit report companies and the United States government, are starting to want to have background information on people. So there is other utility that is coming out from being able to mine this data and present this data.
Knowledge at Wharton: Is Cuil’s Cpedia product available now, in beta form or for broader use?
Ozveren: Cpedia is a product of Cuil’s, that is kind of [expanding on] what we have always been doing, which is data mining. We crawl a billion pages a day on a 120-billion page index. That gives us a huge scale of ability of being able to manage this data. It is a pretty powerful tool to be able to just algorithmically create or automatically generate summary pages.
[Right now] we call it alpha Cpedia because it is version one and like any other version one, it leaves something to be desired in terms of polish…. It is a very, very difficult problem that we are trying to solve, so it takes time. It is not something that is clearly going to be out there as a branded ready product in two months. It is going to take time as people use it and come back with feedback. It works better in some verticals than it does in others. For example, its people search is better than its product search…. There is a lot of information on the Web that doesn’t necessarily surface to the top because maybe people don’t want it to. I’m not throwing any jabs at any particular large U.S. conglomerates, but we have a right to know as the public if there are side effects with certain drugs that are being talked about, or if there are certain problems with certain car manufacturers. We want to engage in that conversation. We want to be a part of it. We want to watch what thought leaders are saying about that. We want to somehow feel a part of the experience.
That was the beauty of Facebook. Facebook really created the user experience. People went on to Facebook [because] they just wanted to watch what their friends were doing, feel a part of it and feel connected. And I think the Web going forward is about feeling connected. People use the word “engagement” and people are trying to monetize engagement, but I like to think of it in terms of not necessarily monetization but the connectability of every individual to another.
Knowledge at Wharton: What do you think the evolution of Internet search has shown publishers and companies about what people want to get out of these services? Were there some assumptions about users’ habits that turned out to be incorrect?
Ozveren: I think people are finding the power of groundswell movements, especially in America. Every large civil movement in this country has been a grassroots groundswell, even the election of our President. And if you think about the election of our President, it was mainly done through the Web. His campaign was completely Web-centric…. I also think there ought to be other applications like non-profit foundation search, so you can actually go and search on a search engine that relates to [environmental causes], that relates to the things that you believe in. One of our comments … has always been “Let’s create a blue and a red search engine for Republicans and for Democrats.” Then the Democrats can find things that are related to their vision, and Republicans can find things related to their views. I see that potentially happening in the year 2020. So, yes, absolutely search is a powerful tool that people can aggregate ideas, like-minded ideas, around. But it also has danger in the fact that if it is not used carefully, if it is not used with a bit of concern over privacy, it can be misused. We need to have controls, regulatory controls, and I think you will see Facebook and Google and search engines like ourselves grappling with these tools and making sure that we are not infringing upon people’s privacy.
For more coverage of The Future of Publishing Conference:
Will Newspaper Readers Pay the Freight for Survival?
Changing Times at The Washington Post: Engaging Readers, Enhancing Content
Upended by eBooks: Is This the Last Chapter for the Book Business?