Python
Python is the language I work with on a daily basis. I am a strong proponent of PEP-8 and typing. I have previously published and maintained my own packages on PyPI.
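As a small, hypothetical illustration of the style I advocate (the class and function names below are made up for this example and not taken from any of my published packages):

```python
from dataclasses import dataclass


@dataclass
class Package:
    """A minimal, illustrative record for a published package."""

    name: str
    version: str

    def pinned(self) -> str:
        """Return a pip-style pinned requirement string."""
        return f"{self.name}=={self.version}"


def format_requirements(packages: list[Package]) -> str:
    """Join pinned requirements, one per line."""
    return "\n".join(pkg.pinned() for pkg in packages)
```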
I am currently a Member of Technical Staff at Cohere, where I focus on improving human evaluation for Large Language Models. Before that, I obtained my PhD from the Data Science Group at Heidelberg University. My academic work primarily focused on automatic text summarization, with a special interest in generating aspect-focused summaries; beyond that, I have a broader appreciation for research in applied Machine Learning for Natural Language Processing (NLP). I previously interned with Amazon Research and SAP, and helped build a search engine for lawyers at Codefy. During my PhD studies, I committed to releasing most of my research artifacts on GitHub. However, my biggest contributions to open source are my questions and answers on Stack Overflow, which have by now reached more than one million people worldwide.
This week, I have officially reached more than 1 million people on the website stackoverflow.com! I wanted to take a moment to reflect on this “achievement”, what it means for my professional career, and why I simultaneously believe that it is sheer luck (and a LOT of procrastination) that got me here.
As always, if you have any questions or comments, feel free to message me at dennis.aumiller@gmail.com or reach out on Twitter.
Two weeks ago, Cohere.ai announced their new dedicated summarization endpoint!
For someone currently doing their PhD on text summarization, this is a worrying, but obviously also rather intriguing development: while recent advancements have focused on broadly applicable models (think ChatGPT), providing more task-specific alternatives seems to be the niche that Cohere is carving out for itself.
Adding to the surprise of seeing a dedicated summarization endpoint is the fact that text summarization is really hard; a lot of progress has been made over the last 50 years, but current state-of-the-art models still struggle with problems such as faithfully retaining the factual content of the input text.
Another problem is the actual definition of “summaries” in different domains. Methods for generating a “good” summary of a news article are generally useless when it comes to generating the summary of a court ruling, or generating radiology reports from doctor notes.
Due to the combination of these (and other) factors, there are comparatively few production settings in which summarization is actively used. To my knowledge, the two main applications right now are news aggregators, which summarize information from multiple news articles (primarily with extractive methods, i.e., directly copying existing sentences from the input documents), and the recently introduced “Document TL;DR” generator in Google Docs (which uses a variant of Google’s own PEGASUS model).
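To make the extractive approach concrete, here is a minimal sketch of a purely extractive summarizer based on naive word-frequency scoring; this is only an illustration, not the method used by any of the products mentioned above:

```python
from collections import Counter
import re


def extractive_summary(text: str, num_sentences: int = 3) -> str:
    """Select the highest-scoring sentences verbatim from the input text.

    Sentences are scored by the summed corpus frequency of their words;
    the selected sentences are returned in their original order.
    """
    # Naive sentence splitting on end-of-sentence punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence: str) -> int:
        return sum(freq[w] for w in re.findall(r"\w+", sentence.lower()))

    # Keep the top-k sentences, preserving document order.
    top = sorted(sentences, key=score, reverse=True)[:num_sentences]
    return " ".join(s for s in sentences if s in top)
```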
Philadelphia, USA (remote)
Sep 2023 - current
As part of the data and evaluation team, my main objectives are to improve Cohere’s primary generative model through a series of automated data transformations, as well as to improve their evaluation harness.
Berlin, Germany
Aug 2021 - Dec 2021
Situated within the organization of Amazon Search, I was part of a newly formed team investigating the applicability of multilingual NLP solutions for search scenarios.
Heidelberg, Germany
Sept 2019 - Jan 2021
During my PhD, I was involved in a local start-up, building a document search engine for lawyers.
Walldorf, Germany
June 2018 - Sept 2018
At SAP, I was part of the Product & Innovations team, working on cloud-based ML solutions.
Heidelberg, Germany
Feb 2015 - Mar 2019
Throughout my studies, I worked in different groups within the Institute of Computer Science.
Oct 2018 - Mar 2019
Apr 2016 - Sept 2017
Feb 2015 - Mar 2016
June 2019 - exp. 2023
PhD in Computer Science
Supervised by: Prof. Dr. Michael Gertz
Focus Area: Text Summarization and NLP
Publications: see the list below
Aug 2017 - May 2019
M.Sc. Applied Computer Science
German GPA: 1.0 (with distinction; equiv. GPA: 4.0)
Minor: Computational Linguistics
Focus Area: NLP and Network Analysis
Thesis: "Implementation of a Relational Document Hypergraph for Information Retrieval"; Grade: 1.0 (with distinction)
Sept 2017 - Apr 2018
Exchange Year, Computer Science Program
CGPA: 3.95 out of 4.0
Focus Area: Machine Learning and Algorithmic Game Theory
Extracurricular Activities:
Oct 2013 - Aug 2017
B.Sc. Applied Computer Science
German GPA: 1.4 (equiv. GPA: 3.6)
Minor: Computational Linguistics
Focus Area: Computer Graphics and Visualization
Thesis: "Mining Relation Networks from University Websites"; Grade: 1.0 (with distinction)
Automatically checking whether generated text is factually consistent with an input segment remains challenging. We present a method that goes against the recent trend of directly utilizing large language models to evaluate factuality and instead takes a more linguistically grounded approach based on Semantic Role Labels.
We find that the quality of German summarization datasets (and models) lags heavily behind that of the English NLP community; oftentimes, not even basic filtering criteria are respected when training and evaluating systems.
We describe our winning submission to the shared task on lexical simplification. In essence, we extract structured predictions from GPT-3 generations and introduce a novel way of aggregating predictions across multiple prompt templates to increase result coverage.
This work introduces a highly multilingual summarization corpus, available in all of the 24 official languages of the European Union. It is based on legal acts published by the EU, and consists of extremely long documents in the legal domain.
We present an interactive interface, unifying access to several temporal annotation frameworks. Aside from a graphical interface, we allow users to programmatically access the various tools through a streamlined API.
We present a high-quality resource of full-text alignments between the German Wikipedia and a German children’s encyclopedia. This yields a dataset that we empirically show to be suited for both summarization and simplification tasks.
Following a series of prior experiments in English, we outline a generalized procedure for training a non-English temporal tagger with weakly supervised data. For this purpose, we utilize existing rule-based taggers to scale up training resources in low-resource settings by several orders of magnitude.
In this (German) article, we outline the challenges that are currently preventing mainstream adoption of recent NLP advancements in the legal industry. Primarily, this can be attributed to a lack of proper domain generalization, as well as limited interpretability and scalability of such models.
We experimented with various transformer-based architectures to see which ones work best for extracting temporal annotations, such as ‘yesterday’ or ‘every week’. However, we have since found a significant flaw in our evaluation setup for seq2seq-based models, so we decided to retract this article. The resulting tagging-based models are still valid, though, and remain available online.
Utilizing existing segmentation tools, which primarily operate at sentence-level granularity, yields poor performance when segmenting long documents, which are prevalent in a legal context. In this work, we address the issue by proposing a weakly supervised paragraph-based segmenter, which we empirically validate on a novel dataset of web Terms of Service documents.
We participated in the workshop’s shared task on extracting relevant paper sections in cited works. Interestingly, we show that our setup based on traditional search heuristics, coupled with improved pre-processing steps, outperforms our BERT-based retrieval setup. Overall, we placed third on the blind shared task test set.
This demonstration illustrates a time-centric approach to content exploration. Extracting and processing temporal mentions in large document collections enables a temporal ordering of events, even when the documents themselves are not ordered chronologically.
Results from my Master’s thesis contributed the experimental section of this work. Primarily, we present a theoretical retrieval model based on hypergraphs and demonstrate that common Information Retrieval operations can be performed more efficiently on co-occurrence-based word networks than on traditional dyadic graphs.
Ordering events in a chronological fashion requires accurate modeling of the temporal hierarchy, which previously was not well-defined for long-term event horizons spanning several decades. Here, we present a temporal model that is shown to work well even without an explicit chronological ordering in the underlying document collections.
In a collaboration with researchers in molecular biosciences, we developed an automated image processing pipeline that improved both the speed and the accuracy of determining the lengths of chromatosome strands in AFM images. This work ultimately showed how mutations in a particular gene can cause different winding patterns.