
Hi, I am Dennis!

Dennis Aumiller

Member of Technical Staff at Cohere

I am currently a Member of Technical Staff at Cohere, where I focus on improving human evaluation for Large Language Models. Before that, I obtained my PhD from the Data Science Group at Heidelberg University. My academic work primarily focused on automatic text summarization, with a special interest in generating aspect-focused summaries. Beyond that, I have a broader interest in applied Machine Learning research for Natural Language Processing (NLP). I previously interned with Amazon Research and SAP, and helped build a search engine for lawyers at Codefy. During my PhD studies, I committed to releasing most of my research artifacts on GitHub. However, my biggest contributions to open source are my questions and answers on Stack Overflow, which have by now reached more than one million people worldwide.

Skills

Python

Python is the language I work with on a daily basis. I am a strong proponent of PEP 8 and type hints. I have previously published and maintained my own packages on PyPI.

Recent Posts

Reflections on Reaching 1 Million People on Stackoverflow

This week, I have officially reached more than 1 million people on the website stackoverflow.com! I wanted to take a moment to reflect on this “achievement”, what it means for my professional career, and why I simultaneously believe that it is sheer luck (and a LOT of procrastination) that got me here.
As always, if you have any questions or comments, feel free to message me at dennis.aumiller@gmail.com or reach out on Twitter.

Discovery of the New Cohere Summarization Endpoint

Two weeks ago, Cohere.ai announced their new dedicated summarization endpoint! For someone currently doing their PhD on text summarization, this is both a worrying and, obviously, a rather intriguing development: while recent advancements have focused on broadly applicable models (think ChatGPT), providing more task-specific alternatives seems to be the niche that Cohere is carving out for themselves.
Adding to the surprise of seeing a dedicated summarization endpoint is the fact that text summarization is really hard; in the last 50 years, a lot of progress has been made, but our current state-of-the-art models still suffer from annoying problems such as failing to preserve the factuality of the input text. Another problem is the actual definition of “summaries” in different domains. Methods for generating a “good” summary of a news article are generally useless when it comes to generating the summary of a court ruling, or generating radiology reports from doctor notes. Due to the combination of these (and other) factors, there are comparatively few production settings in which summarization is actively used. To my knowledge, the two main applications using some form of summarization right now are news aggregators, which summarize information from multiple news articles (primarily with extractive methods, meaning they directly copy existing sentences from the input documents), and the recently introduced “Document TL;DR” generator in Google Docs (which uses a variant of Google's own PEGASUS neural model).

Professional Experience

Member of Technical Staff
Cohere

Philadelphia, USA (remote)

Sep 2023 - current

As part of the data and evaluation team, my main objectives are to improve Cohere’s primary generative model through a series of automated data transformations, as well as improvements to their evaluation harness.


Applied Scientist Intern
Amazon Research

Berlin, Germany

Aug 2021 - Dec 2021

Situated within the organization of Amazon Search, I was part of a newly formed team investigating the applicability of multilingual NLP solutions for search scenarios.

Responsibilities:
  • Investigated sequential recommendation systems for customer query suggestions
  • Built dataset and implemented a training/evaluation pipeline for neural search query generation
  • Found flaws in how the existing live system handled tail queries and reported preliminary improvements in recall with my own sequential recommender tool

Software Engineer (part-time)
Codefy GmbH

Heidelberg, Germany

Sept 2019 - Jan 2021

During my PhD, I was involved in a local start-up, building a document search engine for lawyers.

Responsibilities:
  • Built backend for product prototype, helping the company secure €200,000 in seed funding
  • Primary lead on document processing, developing a pipeline suitable for diverse document types
  • Optimized document database operators, improving ingestion time by over 30%
  • Built prototype for unsupervised keyphrase extraction module on legal documents

Software Engineering Intern
SAP SE

Walldorf, Germany

June 2018 - Sept 2018

At SAP, I was part of the Product & Innovations team, working on cloud-based ML solutions.

Responsibilities:
  • Optimized Machine Learning operations using randomized algorithms
  • Achieved up to 1000x speed-up while maintaining comparable accuracy
  • Extensively benchmarked solutions to highlight performance discrepancies in existing tools

Heidelberg University

Heidelberg, Germany

Feb 2015 - Mar 2019

Throughout my studies, I worked at different groups within the Institute of Computer Science.

Teaching Assistant

Oct 2018 - Mar 2019

  • Teaching Assistant for the graduate lecture “Complex Network Analysis”
  • Additionally, helped design the assignments and final exam
Teaching Assistant

Apr 2016 - Sept 2017

  • Teaching Assistant for the lecture “Databases 1” in Summer 2016 and Summer 2017
  • Lecture Assistant for the graduate course “Computer Graphics” in Winter 2016/17
  • Prepared and held weekly tutorial sessions for students and graded assignments
Student Assistant

Feb 2015 - Mar 2016

  • Tasked with the integration of a group’s website into the new corporate identity template

Education

PhD in Computer Science; Supervised by: Prof. Dr. Michael Gertz
Focus Area: Text Summarization and NLP
Aug 2017 - May 2019
M.Sc. Applied Computer Science
German GPA: 1.0 (with distinction; equiv. GPA: 4.0)
Minor: Computational Linguistics
Focus Area: NLP and Network Analysis
Thesis: "Implementation of a Relational Document Hypergraph for Information Retrieval"; Grade: 1.0 (with distinction)
Sept 2017 - Apr 2018
Exchange Year, Computer Science Program
CGPA: 3.95 out of 4.0
Focus Area: Machine Learning and Algorithmic Game Theory
Extracurricular Activities:
  • Executive Member, UofT eSports Club
  • Executive Member, Undergraduate AI Group
Oct 2013 - Aug 2017
B.Sc. Applied Computer Science
German GPA: 1.4 (equiv. GPA: 3.6)
Minor: Computational Linguistics
Focus Area: Computer Graphics and Visualization
Thesis: "Mining Relation Networks from University Websites"; Grade: 1.0 (with distinction)

Publications

Evaluating Factual Consistency of Texts with Semantic Role Labeling
*SEM @ ACL 2023 July 2023

Automatically checking whether generated text is factually consistent with an input segment is still challenging. We present a method that goes against the recent trend of directly utilizing large language models to evaluate factuality, and instead propose a more linguistically grounded approach, based on Semantic Role Labels.

On the State of German (Abstractive) Text Summarization
BTW 2023 Mar 2023

Relative to the English NLP community, we find that the quality of German summarization datasets (and models) is heavily lacking; oftentimes, not even basic filtering criteria are respected when training and evaluating systems.

UniHD at TSAR-2022 Shared Task: Is Compute All We Need for Lexical Simplification?

We describe our winning submission to the shared task on lexical simplification. In principle, we extract structured predictions from GPT-3 generations, and introduce a novel way of aggregating predictions across multiple prompt templates to increase result coverage.

EUR-Lex-Sum: A Multi- and Cross-lingual Dataset for Long-form Summarization in the Legal Domain
EMNLP 2022 Dec 2022

This work introduces a highly multilingual summarization corpus, available in all of the 24 official languages of the European Union. It is based on legal acts published by the EU, and consists of extremely long documents in the legal domain.

Online DATEing: A Web Interface for Temporal Annotations
SIGIR 2022 July 2022

We present an interactive interface that unifies access to several temporal annotation frameworks. Aside from a graphical interface, we allow users to programmatically access the various tools through a streamlined API.

Klexikon: A German Dataset for Joint Summarization and Simplification
LREC 2022 June 2022

We present a high-quality resource of full-text alignments between German Wikipedia and a German children’s encyclopedia. This yields a dataset that we empirically show to be suited for both summarization and simplification tasks.

Time for some German? Pre-Training a Transformer-based Temporal Tagger for German

Following a series of prior experiments in English, we outline a generalized procedure for training a non-English temporal tagger with weakly supervised data. For this purpose, we utilize existing rule-based taggers as a way to scale up existing training resources in low-resource settings by several orders of magnitude.

Deep Learning und Legal Tech - Eine Bestandsaufnahme

In this (German) article, we outline the challenges that are currently preventing mainstream adoption of recent NLP advancements in the legal industry. Primarily, this can be attributed to a lack of proper domain generalization, as well as limited interpretability and scalability of such models.

BERT got a Date: Introducing Transformers to Temporal Tagging
arXiv Sept 2021

We experimented with various transformer-based architectures to see which ones would work best for extracting temporal annotations, such as ‘yesterday’ or ‘every week’. However, we have since found a significant flaw in our evaluation setup for seq2seq-based models, so we decided to retract this article. The resulting tagging-based models are still valid, though, and are available online.

Structural Text Segmentation of Legal Documents
ICAIL 2021 June 2021

Utilizing existing segmentation tools, which primarily operate on sentence-level granularity, yields poor performance when segmenting long documents, which are prevalent in a legal context. In this work, we address the issue by proposing a weakly-supervised paragraph-based segmenter, which we empirically evaluate on a novel dataset consisting of web Terms of Service documents.

UniHD @ CL-SciSumm 2020: Citation Extraction as Search
SDP @ EMNLP 2020 Nov 2020

We participated in the workshop’s shared task on extracting relevant paper sections in cited works. Interestingly, we show that our setup based on traditional search heuristics, coupled with improved pre-processing steps, outperforms our BERT-based retrieval setup. Overall, we placed third on the blind shared task test set.

TiCCo: Time-Centric Content Exploration
CIKM 2020 Oct 2020

This demonstration illustrates a time-centric approach to content exploration. Extracting and processing temporal mentions on large document collections allows a temporal expression of events, even when the documents themselves are not ordered chronologically.

A Versatile Hypergraph Model for Document Collections
SSDBM 2020 July 2020

Results from my Master’s thesis contributed to the experimental section of this work. Primarily, we present a theoretical retrieval model based on hypergraphs, and demonstrate that its operations can be utilized to perform common Information Retrieval operations more efficiently on co-occurrence based word networks than with traditional dyadic graphs.

Time-centric Exploration of Court Documents

Ordering events in a chronological fashion requires an accurate modeling of the temporal hierarchy, which previously was not well-defined for long-term event horizons spanning several decades. Here, we present a temporal model that is shown to work well, even without explicit temporal ordering in underlying document collections.

DNA accessibility of chromatosomes quantified by automated image analysis of AFM data
Scientific Reports Sept 2019

In a collaboration with molecular biosciences, we developed an automated image processing pipeline that sped up the annotation process and improved the accuracy of determining the lengths of chromatosome strands in AFM images. This work ultimately showed how mutations in a particular gene can cause different winding patterns.