Search logs are records of the requests for information that users submit to a search engine. Search engine companies such as Google, Bing, and Yahoo maintain this "database of intentions", the search histories of their users. These search logs are a rich resource for researchers. However, search engine companies are wary of publishing them, because doing so could disclose very sensitive information about their users.
In this paper, we examine algorithms for publishing the frequent queries, clicks, and keywords of a search log. We first show how methods that achieve variants of k-anonymity are vulnerable to active attacks. We then show that the stronger guarantee ensured by epsilon-differential privacy unfortunately does not provide any solution to this problem. Finally, we propose a novel algorithm, ZEALOUS, and show how to set its parameters to achieve (epsilon, delta)-probabilistic privacy.
We contrast our analysis of ZEALOUS with an analysis by Korolova et al. that achieves (epsilon, delta)-indistinguishability. The paper concludes with a large experimental study using real applications, in which we compare ZEALOUS with previous work that achieves k-anonymity in search log publishing.
Our results show that ZEALOUS achieves much stronger privacy guarantees than k-anonymity while providing comparable utility.
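To make the idea of publishing only frequent items with noisy counts more concrete, here is a minimal Python sketch of a two-threshold release in the spirit of ZEALOUS. The abstract does not spell out the algorithm, so the steps shown (capping each user's contribution, a first threshold on true counts, Laplace noise, a second threshold on noisy counts) and every name used here (`publish_frequent_items`, `m`, `tau`, `tau_prime`, `lam`) are assumptions made for illustration. It is not the authors' implementation, and it says nothing about how the parameters must be chosen to reach a particular (epsilon, delta) guarantee.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two independent exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def publish_frequent_items(user_histories, m, tau, tau_prime, lam, seed=None):
    """Illustrative two-threshold release (all parameter names are assumptions).

    user_histories: dict mapping a user id to a list of items
                    (queries, clicks, or keywords).
    m:              maximum number of distinct items kept per user.
    tau:            threshold applied to the true counts.
    tau_prime:      threshold applied to the noisy counts.
    lam:            scale of the Laplace noise.
    """
    rng = random.Random(seed)
    counts = Counter()
    for items in user_histories.values():
        # Keep at most m distinct items per user, in first-seen order,
        # so that no single user can dominate the histogram.
        distinct = list(dict.fromkeys(items))[:m]
        counts.update(distinct)

    published = {}
    for item, count in counts.items():
        if count < tau:
            continue  # first threshold: drop infrequent items
        noisy = count + laplace_noise(lam, rng)
        if noisy > tau_prime:
            published[item] = noisy  # second threshold: keep large noisy counts
    return published


if __name__ == "__main__":
    logs = {
        "u1": ["flights to paris", "weather", "flights to paris"],
        "u2": ["weather", "rare personal query"],
        "u3": ["weather", "flights to paris"],
    }
    print(publish_frequent_items(logs, m=2, tau=2, tau_prime=2.0, lam=1.0, seed=0))
```

In this sketch an item that only one user searched for never passes the first threshold, which is the intuition behind publishing frequent items only. The noise and the second threshold are what a formal privacy analysis would then have to account for.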