Theses @ NLP Group

The NLP Group is continuously looking for students who would like to write their bachelor's or master's thesis in the area of natural language processing, possibly with connections to information retrieval and general artificial intelligence.

Topics

All thesis topics should be related to the main research directions of the NLP Group, which include computational argumentation, computational sociolinguistics, and computational explanation.

Below, we provide a selection of currently available topics. The details of each topic are discussed and shaped jointly at the beginning of the thesis process. Other topics are possible, including students' own ideas, provided they fit our research interests.

  • Towards Conversational Data Annotation: Investigating the role of explanation generation on annotation agreement

    Existing labeling procedures treat labeling as multiple-choice question answering, with a list of radio buttons to choose from. As a result, current labeling protocols treat all labeling errors alike and do not distinguish between actual errors, misunderstandings, and cultural and personal differences. Large language models (LLMs) show great potential to improve annotation guideline procedures because of their conversational nature and their ability to learn and follow instructions. An open research question is how LLMs can support annotators in tackling difficult annotation tasks. In this thesis, we will investigate a specific setup in which an LLM assists annotators by answering their questions and actively learning how to explain the annotations they provide. In the first iteration, the annotators write the explanations themselves: they have to provide natural, instance-specific explanations for their labels. The LLM learns from these explanations and, in the next iteration, offers the annotators possible instance-specific explanations to choose from. When the LLM detects possible label errors online, it provides explanations and counterarguments to nudge the annotator toward more careful consideration of their choice.

    Supervisor: Dr. Ajjour

  • What Makes an Argument In-/Appropriate?

    The question of what makes an argument in-/appropriate is a complex one. It is not only a question of the content of the argument, but also of the context in which it is made and the style in which it is presented. Appropriateness as a sub-dimension of argument quality has been systematically explored in the literature on computational argumentation. However, a linguistic perspective on appropriateness features and their interplay with argument quality is still missing. This thesis aims to fill this gap through the computational extraction and analysis of linguistic appropriateness features in argumentative texts.

    Advisor: Timon Ziegenbein

  • Explaining Essay Scores for Quality-oriented Argumentative Writing Support

    Learning argumentative writing is challenging. Besides writing fundamentals such as syntax and grammar, learners must select and arrange argument components meaningfully to create high-quality essays. One step to support argumentative writing computationally is to mine the argumentative structure. When combined with automatic essay scoring, one can exploit interactions of the argumentative structure and quality scores for comprehensive writing support. Our analyses suggest that the structure annotations in our recently released corpus help score the essay quality, enabling quality-oriented argumentative writing support. However, the perceived usefulness of such a tool is still to be evaluated. This thesis can utilize our corpus for such writing support by analyzing which exact argumentative structures influence the essay quality and to what extent. We expect interpretable essay quality scoring based on the structure to generate helpful insights that can be used as writing feedback by school students.

    Advisor: Maja Stahl

Working on the outlined and similar topics involves state-of-the-art technologies such as neural transformers, contrastive learning, and multitask learning, among others. Most topics target the development and empirical evaluation of NLP methods for specific tasks.
 

Interested?

Candidates should have very good programming skills (preferably in Python) as well as some experience with machine learning and other AI methods (ideally with NLP). You should be enrolled in one of the computer science programs at Leibniz University Hannover.

If you are interested in a specific topic, please send an email to the advisor of that topic, including information about the prior knowledge and experience you have:

  • What relevant courses did you take?
  • What experience with AI development and evaluation do you have?
  • What other relevant knowledge do you have?

If you are unsure about the topic but interested in writing your thesis with the NLP Group, please send an email to the head of the group.

Evaluation

The grading of a thesis is based on weighted grades for two parts:

  • The developed solution to the problem tackled in the thesis (45%)
  • The written thesis presenting the solution (55%)
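
As an illustration of the weighting above, the following minimal Python sketch combines two part grades into one overall value. Only the two weights (45% and 55%) come from this page. The function name overall_grade, the use of the usual German grade scale (1.0 is best), and the absence of any rounding step are assumptions made for the example and do not describe the official grading procedure.

```python
# Weights as stated on this page.
SOLUTION_WEIGHT = 0.45  # developed solution
THESIS_WEIGHT = 0.55    # written thesis


def overall_grade(solution_grade: float, thesis_grade: float) -> float:
    """Return the weighted average of the two part grades.

    Hypothetical helper: the grade scale and the lack of rounding
    are assumptions, not part of the official procedure.
    """
    return SOLUTION_WEIGHT * solution_grade + THESIS_WEIGHT * thesis_grade


if __name__ == "__main__":
    # Example: solution graded 1.3, written thesis graded 2.0.
    # 0.45 * 1.3 + 0.55 * 2.0 = 1.685 (up to floating-point error)
    print(round(overall_grade(1.3, 2.0), 3))
```

Because the written thesis carries the larger weight, the same difference in grade affects the overall result slightly more when it occurs in the written part than in the solution part.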

The grading of the developed solution takes five criteria into account:

  • Difficulty / Complexity. How difficult was it to develop the solution? How much effort was put into it? Is the complexity justified? ... 
  • Technical quality. Is the design and realization of the solution well-made? Are the experiments systematic and scientifically sound? ...
  • Novelty and own ideas. Does the solution have scientific novelty? Have original ideas been developed and realized in the solution? ...
  • Impact / Publishability. Does the solution improve the state of the art? Are the results worth publishing? Can they be published as is? ...
  • Implementation and data. How easy is it to read and reuse the code? If data has been created, is it well-organized? Are code and data well-documented? ...

The grading of the written thesis takes six criteria into account:

  • Abstract, introduction, and conclusion. Are problem, solution, and results well-introduced? Are the right conclusions drawn? Is the whole story told? ...
  • Background and related work. Are basics well-described and relevant? Is the connection to the thesis clear? Is the state of the art well-discussed? ...
  • Approaches and data. Is the presentation of the developed approaches and data clear, complete, and on the right technical level? ...
  • Experiments, evaluation, and discussion. Are the experiments described systematically? Are the results clearly presented and correctly interpreted? ...
  • Form, layout, and style. Is the structure convincing? Is the writing clear and error-free? Do tables and figures support it? Are citations correct? ...
  • Scientific quality. Does the thesis adhere to scientific standards? Does the presentation follow community principles? ...

Past Theses (since Winter 2022)

  • Evaluating Data-Driven Approaches to Improve Word Lists for Measuring Social Bias in Word Embeddings. Master's thesis, Vinay Kaundinya Ronur Prakash, UPB.
  • Audience Aware Counterargument Generation. Master's thesis, Mahammad Namazov, 2023, UPB.
  • Improving Learners’ Arguments by Detecting and Generating Missing Argument Components. Master's thesis, Nick Düsterhus, 2023, UPB.
  • Gender-inclusive Coreference Resolution using Pronoun Preference. Master's thesis, Jan-Luca Hansel, 2023, UPB.
  • Dialect-aware Social Bias Detection using Ensemble and Multi-Task Learning. Master's thesis, Sai Nikhil Menon, 2022, UPB.
  • Counter Argument Generation Using a Knowledge Graph. Master's thesis, Indranil Ghosh, 2022, UPB.
  • Domain-aware Text Professionalization using Sequence-to-Sequence Neural Networks. Bachelor's thesis, Juela Palushi, 2022, UPB.

Past Theses (Summer 2018 – Summer 2022)

  • Detection and Mitigation of Subjective Bias in Argumentative Text. Master's thesis, Sambit Mallick, 2022, UPB.
  • Cross-domain analysis of argument quality and its connection to offensive language. Bachelor's thesis, Patrick Bollmann, 2022, UPB.
  • Cross-domain Aspect-based Sentiment Analysis with Multimodal Sources. Master's thesis, Pavan Kumar Sheshanarayana, 2022, UPB.
  • Comparative Evaluation of Automatic Summarization Techniques for German Court Decision Documents. Master's thesis, Josua Köhler, 2022, UPB.
  • Computational Analysis of Cultural Differences in Learner Argumentation. Master's thesis, Garima Mudgal, 2022, UPB.
  • Propaganda Technique Detection Using Connotation Frames. Master's thesis, Vinaykumar Budanurmath, 2022, UPB.
  • Contrastive Argument Summarization using Supervised and Unsupervised Learning. Master's thesis, Jonas Rieskamp, 2022, UPB.
  • Mitigation of Gender Bias in Text using Unsupervised Controllable Rewriting. Master's thesis, Maja Brinkmann, 2021, UPB.
  • Assessing Stereotypical Social Biases in Text Sequences using Language. Master's thesis, Meher Vivek Dheram, 2021, UPB.
  • Modeling Context and Argumentativeness of Sentences in Argument Snippet Generation. Master's thesis, Harsh Shah, 2021, UPB.
  • Political Speaker Transfer: Learning to Generate Text in the Styles of Barack Obama and Donald Trump. Master's thesis, Jonas Bülling, 2021, UPB.
  • Quantifying Social Biases in News Articles with Word Embeddings. Bachelor's thesis, Maximilian Keiff, 2021, UPB.
  • Computational Text Professionalization using Neural Sequence-to-Sequence Models. Master's thesis, Avishek Mishra, 2021, UPB.
  • Assessing the Argument Quality of Persuasive Essays using Neural Text Generation. Master's thesis, Timon Gurcke, 2021, UPB.
  • Automatic Conclusion Generation using Neural Networks. Bachelor's thesis, Torben Zöllner, 2020, UPB.
  • Computational Analysis of Metaphors based on Word Embeddings. Bachelor's thesis, Simon Krenzler, 2020, UPB.
  • Semi-supervised Cleansing of Web-based Argument Corpora. Bachelor's thesis, Jonas Dorsch, 2020, BUW.
  • Countering Natural Language Arguments using Neural Sequence-to-Sequence Generation. Master's thesis, Arkajit Dhar, 2020, UPB.
  • Snippet Generation for Argument Search. Bachelor's thesis, Nick Düsterhus, 2019, UPB.
  • Argument Quality Assessment in Natural Language using Machine Learning. Bachelor's thesis, Till Werner, 2019, UPB.
  • Stance Classification in Argument Search. Master's thesis, Philipp Heinisch, 2019, UPB.
  • Towards a Large-scale Causality Graph. Bachelor's thesis, Yan Scholten, 2019, UPB.

Past Theses (Summer 2009 – Winter 2017)

  • Cross-Domain Mining of Argumentation Strategies using Natural Language Processing. Master's thesis, 2017, BUW.
  • Mining Relevant Arguments at Web Scale. Master's thesis, 2017, BUW.
  • Identifying Controversial Topics in Large-Scale Social Media Data. Master's thesis, 2016, BUW.
  • Efficiency and Effectiveness of Multi-Stage Machine Learning Algorithms for Text Quality Assessment. Master's thesis, 2013, UPB.
  • An Expert System for the Automatic Construction of Information Extraction Pipelines. Master's thesis, 2012, UPB.
  • Efficiency and Effectiveness of Text Classification in Information Extraction Pipelines. Master's thesis, 2012, UPB.
  • Efficient Information Extraction for Creating Use Case Diagrams from Text. Master's thesis, 2012, UPB.
  • Heuristic Search for the Run-time Optimization of Information Extraction Pipelines. Master's thesis, 2012, UPB.
  • Aggregation and Visualization of Market Forecasts. Bachelor's thesis, 2011, UPB.
  • Branch Categorization based on Statistical Analysis of Information Retrieval Results. Bachelor's thesis, 2011, UPB.
  • Evaluation of Cooperative Robot Motion Strategies in Simbad. Bachelor's thesis, 2009, UPB.

LUH: Leibniz University Hannover, UPB: Paderborn University, BUW: Bauhaus-Universität Weimar