This is a reminder that Raju Balakrishnan will be defending his PhD dissertation tomorrow (7/19 Thursday). Raju's work on deep web source selection and computational advertisements has garnered him a Yahoo! Key Scientific Challenges award (2009) as well as a WWW best poster award (2010), in addition to several archival publications and a lucrative job.
You are all cordially invited to his defense.
Trust and Profit Sensitive Ranking for the Deep Web and On-line Advertisements
Thursday, July 19, 2012
Subbarao Kambhampati, Chair
AnHai Doan (U. Wisconsin, Madison)
Ranking is of definitive importance to both usability and profitability of web information systems. While ranking of
results is crucial for the accessibility of information to the user, the ranking of online ads increases the profitability
of the search provider. The scope of my thesis includes both search and ad ranking.
I consider the emerging problem of ranking the deep web data considering trustworthiness and relevance. I
address the end-to-end deep web ranking by focusing on: (i) ranking and selection of the deep web databases
(ii) topic sensitive ranking of the sources (iii) ranking the result tuples from the selected databases. Especially,
assessing the trustworthiness and relevances of results for ranking is hard since the currently used link analysis
is inapplicable (since deep web records do not have links). I formulated a method---namely SourceRank---to
assess the trustworthiness and relevance of the sources based on the inter-source agreement. Secondly, I extend
the SourceRank to consider the topic of the agreeing sources in multi-topic environments. Further, I formulate a
ranking sensitive to trustworthiness and relevance for the individual results returned by the selected sources.
For ad ranking, I formulate a generalized ranking function---namely Click Efficiency (CE)---based on a realistic
user click model of ads and documents. The CE ranking considers hitherto ignored parameters of perceived
relevance and user dissatisfaction. CE ranking guaranteeing optimal utilities for the click model. Interestingly, I
show that the existing ad and document ranking functions are reduced forms of the CE ranking under restrictive
assumptions. Subsequently, I extend the CE ranking to include a pricing mechanism, designing a complete
auction mechanism. My analysis proves several desirable properties including revenue dominance over popular
Vickery-Clarke-Groves (VCG) auctions for the same bid vector and existence of a Nash equilibrium in pure
strategies. The equilibrium is socially optimal, and revenue equivalent to the truthful VCG equilibrium. Further, I
relax the independence assumption in CE ranking and analyze the diversity ranking problem. I show that optimal
diversity ranking is NP-Hard in general, and that a constant time approximation algorithm is not likely.