TY  - THES
A1  - Völske, Michael
T1  - Retrieval Enhancements for Task-Based Web Search
N2  - The task-based view of web search implies that retrieval should take the user perspective into account. Going beyond merely retrieving the most relevant result set for the current query, the retrieval system should aim to surface results that are actually useful to the task that motivated the query.

This dissertation explores how retrieval systems can better understand and support their users’ tasks from three main angles: First, we study and quantify search engine user behavior during complex writing tasks, and how task success and behavior are associated in such settings. Second, we investigate search engine queries formulated as questions, and explore patterns in a large query log that may help search engines to better support this increasingly prevalent interaction pattern. Third, we propose a novel approach to reranking the search result lists produced by web search engines, taking into account retrieval axioms that formally specify properties of a good ranking.
N2  - Die Task-basierte Sicht auf Websuche impliziert, dass die Benutzerperspektive berücksichtigt werden sollte. Über das bloße Abrufen der relevantesten Ergebnismenge für die aktuelle Anfrage hinaus, sollten Suchmaschinen Ergebnisse liefern, die tatsächlich für die Aufgabe (Task) nützlich sind, die diese Anfrage motiviert hat.

Diese Dissertation untersucht, wie Retrieval-Systeme die Aufgaben ihrer Benutzer besser verstehen und unterstützen können, und leistet Forschungsbeiträge unter drei Hauptaspekten: Erstens untersuchen und quantifizieren wir das Verhalten von Suchmaschinenbenutzern während komplexer Schreibaufgaben, und wie Aufgabenerfolg und Verhalten in solchen Situationen zusammenhängen. Zweitens untersuchen wir Suchmaschinenanfragen, die als Fragen formuliert sind, und untersuchen ein Suchmaschinenlog mit fast einer Milliarde solcher Anfragen auf Muster, die Suchmaschinen dabei helfen können, diesen zunehmend verbreiteten Anfragentyp besser zu unterstützen. Drittens schlagen wir einen neuen Ansatz vor, um die von Web-Suchmaschinen erstellten Suchergebnislisten neu zu sortieren, wobei Retrieval-Axiome berücksichtigt werden, die die Eigenschaften eines guten Rankings formal beschreiben.
KW  - Information Retrieval
Y1  - 2019
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:gbv:wim2-20190709-39422
ER  - 
TY  - JOUR
A1  - Völske, Michael
A1  - Gollub, Tim
A1  - Hagen, Matthias
A1  - Stein, Benno
T1  - A keyquery-based classification system for CORE
JF  - D-Lib Magazine
N2  - We apply keyquery-based taxonomy composition to compute a classification system for the CORE dataset, a shared crawl of about 850,000 scientific papers. Keyquery-based taxonomy composition can be understood as a two-phase hierarchical document clustering technique that utilizes search queries as cluster labels: In a first phase, the document collection is indexed by a reference search engine, and the documents are tagged with the search queries they are relevant—for their so-called keyqueries. In a second phase, a hierarchical clustering is formed from the keyqueries within an iterative process. We use the explicit topic model ESA as document retrieval model in order to index the CORE dataset in the reference search engine. Under the ESA retrieval model, documents are represented as vectors of similarities to Wikipedia articles; a methodology proven to be advantageous for text categorization tasks. Our paper presents the generated taxonomy and reports on quantitative properties such as document coverage and processing requirements.
KW  - Massendaten
KW  - Taxonomie
KW  - Dynamic Taxonomy Composition, Keyquery, Classification Systems, Reverted Index, Big Data Problem
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:gbv:wim2-20170426-31662
ER  - 
TY  - JOUR
A1  - Vakkari, Pertti
A1  - Völske, Michael
A1  - Potthast, Martin
A1  - Hagen, Matthias
A1  - Stein, Benno
T1  - Predicting essay quality from search and writing behavior
JF  - Journal of Association for Information Science and Technology
N2  - Few studies have investigated how search behavior affects complex writing tasks. We analyze a dataset of 150 long essays whose authors searched the ClueWeb09 corpus for source material, while all querying, clicking, and writing activity was meticulously recorded. We model the effect of search and writing behavior on essay quality using path analysis. Since the boil-down and build-up writing strategies identified in previous research have been found to affect search behavior, we model each writing strategy separately. Our analysis shows that the search process contributes significantly to essay quality through both direct and mediated effects, while the author's writing strategy moderates this relationship. Our models explain 25–35% of the variation in essay quality through rather simple search and writing process characteristics alone, a fact that has implications on how search engines could personalize result pages for writing tasks. Authors' writing strategies and associated searching patterns differ, producing differences in essay quality. In a nutshell: essay quality improves if search and writing strategies harmonize—build-up writers benefit from focused, in-depth querying, while boil-down writers fare better with a broader and shallower querying strategy.
KW  - Information Retrieval
KW  - Textproduktion
KW  - Suchverfahren
KW  - Aufsatz
KW  - Suchverhalten
KW  - Pfadanalyse
KW  - Suchmaschine
Y1  - 2021
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:gbv:wim2-20210804-44692
UR  - https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24451
VL  - 2021
IS  - volume 72, issue 7
SP  - 839
EP  - 852
PB  - Wiley
CY  - Hoboken, NJ
ER  -