TY  - THES
A1  - Ajjour, Yamen
T1  - Addressing Controversial Topics in Search Engines
N2  - Search engines are very good at answering queries that look for facts. Still, information needs that concern forming an opinion on a controversial topic or making a decision remain a challenge for them. Because they are optimized to retrieve satisfying answers, search engines might emphasize a specific stance on a controversial topic in their rankings, amplifying societal bias in an undesired way. Argument retrieval systems support users in forming opinions on controversial topics by retrieving arguments for a given query. In this thesis, we address challenges in argument retrieval systems that concern integrating them into search engines, developing generalizable argument mining approaches, and enabling frame-guided delivery of arguments.

Adapting argument retrieval systems to search engines should start by identifying and analyzing information needs that look for arguments. To identify questions that look for arguments, we develop a two-step annotation scheme that first determines whether the context of a question is controversial and, if so, assigns the question one of several question types: factual, method, or argumentative. Using this annotation scheme, we create a question dataset from the logs of a major search engine and use it to analyze the characteristics of argumentative questions. The analysis shows that the proportion of argumentative questions on controversial topics is substantial and that they mainly ask for reasons and predictions. We further use the dataset to develop a classifier that maps each question to exactly one question type, reaching an F1-score of 0.78.

While the web offers an invaluable source of argumentative content for answering argumentative questions, this content spans multiple genres (e.g., news articles and social fora). Exploiting the web as a source of arguments therefore requires argument mining approaches that generalize across genres. To this end, we study how to extract argument units in a genre-robust way. Our experiments on argument unit segmentation show that transfer across genres is rather hard to achieve using existing sequence-to-sequence models.

Another property of text over which argument mining approaches should generalize is topic. Since new topics on which argument mining approaches have not been trained appear daily, these approaches should be developed in a topic-generalizable way. Towards this goal, we analyze the topic coverage of 31 argument corpora using three topic ontologies. The analysis shows that the topics covered by existing argument corpora are biased toward a small subset of easily accessible controversial topics, hinting at the inability of existing approaches to generalize across topics. In addition to corpus construction standards, fostering topic generalizability requires a careful formulation of argument mining tasks. Same-side stance classification is a reformulation of stance classification that makes the task less dependent on the topic. First experiments on this task show promising results in generalizing across topics.

To be effective at persuading their audience, users of an argument retrieval system should select arguments from the retrieved results based on which frame of a controversial topic they emphasize. An open challenge is to develop an approach that identifies the frames of an argument. To this end, we define a frame as a subset of arguments that share an aspect. We operationalize this model via an approach that identifies and removes the topic of arguments before clustering them into frames. We evaluate the approach on a dataset that covers 12,326 frames and show that identifying and removing the topic of an argument helps to identify its frames.
KW  - Computer science
KW  - Search engine
KW  - Argumentation
KW  - Internet
KW  - argumentation
KW  - controversial topics
KW  - natural language processing
KW  - search engines
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:gbv:wim2-20230626-64037
ER  - 
TY  - THES
A1  - Riehmann, Patrick
T1  - Advanced Visual Interfaces for Informed Decision-Making
N2  - This thesis presents new interactive visualization techniques and systems intended to support users in real-world decisions such as selecting a product from a large variety of similar offerings, finding appropriate wording as a non-native speaker, and assessing an alleged case of plagiarism.

The Product Explorer is a significantly improved interactive Parallel Coordinates display that facilitates the product selection process in cases where many attributes and numerous alternatives have to be considered. A novel visual representation for categorical and ordered data with only a few occurring values, the so-called extended areas, in combination with cubic curves for connecting the parallel axes, is crucial for providing an effective overview of the entire dataset and for facilitating the tracing of individual products. The visual query interface supports users in quickly narrowing down the product search to a small subset or even a single product. The scalability of the approach to a large number of attributes and products is enhanced by the possibility of setting constraints on final attributes and thereby reducing the number of attributes and data items under consideration. Furthermore, an attribute repository allows users to focus on the most important attributes first and to bring in additional criteria for product selection later in the decision process. A user study confirmed that the Product Explorer is indeed an excellent tool for its intended purpose for casual users.

The Wordgraph is a layered graph visualization for the interactive exploration of search results for complex keywords-in-context queries. The system relies on the Netspeak web service and is designed to support non-native speakers in finding customary phrases. Uncertainties about the commonness of phrases are expressed with the help of wildcard-based queries. The visualization presents the alternatives for the wildcards in a multi-column layout: one column per wildcard, with the other query fragments in between. The Wordgraph visualization displays the sorted results for all wildcards at once by appropriately arranging the words of each column. A user study confirmed that this is a significant advantage over simple textual result lists. Furthermore, visual interfaces for filtering, navigating, and expanding the graph allow interactive refinement and expansion of wildcard-containing queries.

Furthermore, this thesis presents an advanced visual analysis tool for assessing and presenting alleged cases of plagiarism and provides a three-level approach for exploring the so-called finding spots in their context. The overview shows the relationship of the entire suspicious document to the set of source documents. An intermediate glyph-based view reveals the structural and textual differences and similarities of a set of finding spots and their corresponding source text fragments. Finally, the actual fragments of the finding spot can be shown in a side-by-side view with a novel structured wrapping of both the source and the suspicious text. The three levels of detail are tied together by versatile navigation and selection operations. Reviews with plagiarism experts confirm that this tool can effectively support their workflow and provides a significant improvement over existing static visualizations for assessing and presenting plagiarism cases.

The three main contributions of this research have a lot in common aside from being carefully designed and scientifically grounded solutions to real-world decision problems. The first two visualizations facilitate the decision for a single possibility out of many alternatives, whereas the last two deal with text at varying levels of detail. All visual representations are clearly structured based on horizontal and vertical layers contained in a single view, and they all employ edges for depicting the most important relationships between attributes, words, or different levels of detail. A detailed analysis in the context of the established decision-making literature reveals that important steps of common decision models are well supported by the three visualization systems presented in this thesis.
KW  - Computer science
KW  - Visualization
KW  - Information Visualization
KW  - Preferential Choice
KW  - Text-based Visualization
KW  - Plagiarism Visualization
KW  - Product Search
Y1  - 2015
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:gbv:wim2-20150907-24542
PB  - Patrick Riehmann
ER  - 
TY  - THES
A1  - Kaltenbrunner, Martin
T1  - An Abstraction Framework for Tangible Interactive Surfaces
N2  - This cumulative dissertation discusses, using the example of four consecutive publications, the various layers of a tangible interaction framework that was developed in conjunction with an electronic musical instrument with a tabletop tangible user interface. Based on the experience gained during the design and implementation of that particular musical application, this research concentrates mainly on defining a general-purpose abstraction model for encapsulating the physical interface components commonly employed in interactive surface environments. Along with a detailed description of the underlying abstraction model, this dissertation also describes an actual implementation in the form of a detailed protocol syntax, which constitutes the common element of a distributed architecture for the construction of surface-based tangible user interfaces. The initial implementation of the presented abstraction model within an actual application toolkit comprises the TUIO protocol and the related computer-vision-based object and multi-touch tracking software reacTIVision, along with its principal application within the Reactable synthesizer. The dissertation concludes with an evaluation and extension of the initial TUIO model by presenting TUIO2, a next-generation abstraction model designed for a more comprehensive range of tangible interaction platforms and related application scenarios.
KW  - Computer science
KW  - Human Computer Interaction
KW  - User interface
KW  - Open Source
KW  - Human-machine communication
KW  - Tangible User Interfaces
KW  - Protocols
Y1  - 2018
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:gbv:wim2-20180205-37178
ER  - 
TY  - THES
A1  - Kiesel, Johannes
T1  - Harnessing Web Archives to Tackle Selected Societal Challenges
N2  - With the growing importance of the World Wide Web, the major challenges our society faces are also increasingly affecting the digital areas of our lives. Some of the associated problems can be addressed by computer science, and some of these specifically by data-driven research. Doing so, however, requires solving open issues related to archive quality and to the large volume and variety of the data the archives contain.

This dissertation contributes data, algorithms, and concepts towards leveraging the big data and temporal provenance capabilities of web archives to tackle societal challenges. We selected three such challenges that highlight the central issues of archive quality, data volume, and data variety, respectively:
(1) For the preservation of digital culture, this thesis investigates and improves the automatic quality assurance of the web page archiving process, as well as the further processing of the resulting archive data for automatic analysis.
(2) For the critical assessment of information, this thesis examines large datasets of Wikipedia and news articles and presents new methods for automatically determining quality and bias.
(3) For digital security and privacy, this thesis exploits the variety of content on the web to quantify the security of mnemonic passwords and analyzes the privacy-aware re-finding of the various content a user has seen through private web archives.
N2  - With the growing importance of the World Wide Web, the major challenges facing our society increasingly affect the digital areas of our lives as well. Some of the associated problems can be addressed by computer science, and some of these specifically by data-driven research. To do so, however, open questions concerning the quality of the archives and the large volume and variety of the data they contain must be resolved.

This dissertation contributes data, algorithms, and concepts for using the large data volume and temporal logging of web archives to tackle societal challenges. We selected three such challenges that highlight the central problems of archive quality, data volume, and data variety:
(1) For the preservation of digital culture, this thesis investigates and improves the automatic quality assessment of web page archiving, as well as the further preparation of the resulting archive data for automatic analysis.
(2) For the critical assessment of information, this thesis examines large datasets of Wikipedia and news articles and presents new methods for determining quality and one-sidedness/partiality.
(3) For digital security and privacy, this thesis uses the variety of content on the Internet to quantify the security of mnemonic passwords and analyzes the privacy-aware re-finding of the various content a user has seen with the help of private web archives.
KW  - Computer science
KW  - Internet
KW  - Web archive
Y1  - 2022
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:gbv:wim2-20220622-46602
ER  -