About me

Who am I?

I am a member of the project A3 (“Disambiguation of discourse connectors with corpus-induced semantic relations”) in the Collaborative Research Centre (SFB) 833 “Emergence of Meaning”:

SFB 833 “Bedeutungskonstitution”
Nauklerstr. 35
72074 Tübingen
Germany

where you can find me in room 2.07 or reach me at telephone number +49 (7071) 29 77155.

You can reach me at my email address,

Research Interests

What, Why and How?

I am interested in the question of how the knowledge we have of the entities we talk about influences the way we talk about them; more specifically, which part of our "world knowledge" actually influences the (syntactic and discourse) structure of our language, and how. By extension, or sometimes as an interest of its own, I'm interested in techniques that allow the efficient construction of performant, interpretable, and accurate components for computational text understanding -- preferably without the kind of effort that precludes anyone but a few big companies from using them.

I defended my PhD thesis, on the resolution of nominal anaphora using semantic information derived from large corpora, in June 2010. Check back soon for the final (corrected) version of the thesis.

I have reviewed papers and articles for RANLP (2005), the ESSLLI Student Session (2006, 2007), a journal issue containing selected papers from Konvens'06, the CoNLL shared task on Dependency Parsing (2007), TLT (2008), SemEval (2010), ACL (2008, 2009), EMNLP (2011), and IJCAI (2011).

Nicer than before

but still in neglect

This site was made using (a modernized version of) the terrafirma layout from the Open-source Web Design page. I have changed the colors to match my graphics and replaced the plant image with a view of Tübingen (near the Stiftskirche), a photo I took in January 2006. Templating is done using jinja2 and kelvin.

Selected Publications

an entertaining read?

For a full list, see Google Scholar or Microsoft Academic.

Conference and Workshop papers.

Versley, Y. (2010)
Discovery of Ambiguous and Unambiguous Discourse Connectives via Annotation Projection. Workshop on the Annotation and Exploitation of Parallel Corpora (AEPC), Tartu, Estonia. [pdf]
Versley, Y. and Rehbein, I. (2009)
Scalable Discriminative Parsing for German. International Conference on Parsing Technologies (IWPT'09). [pdf]
Versley, Y. (2009)
Vagueness and Referential Ambiguity in a Large-scale Annotated Corpus. Massimo Poesio and Ron Artstein (eds.): Ambiguity in Anaphora. Special Issue of the Journal on Research in Language and Computation. [SpringerLink] [preprint pdf]
Versley, Y. (2008)
Decorrelation and Shallow Semantic Patterns for Distributional Clustering of Nouns and Verbs. ESSLLI'08 Workshop on Distributional Lexical Semantics. [pdf]
Versley, Y., Moschitti, A., Poesio, M. and Yang, X. (2008)
Coreference Systems based on Kernel Methods. Coling 2008. [pdf]
Versley, Y., Ponzetto, S.P., Poesio, M., Eidelman, V., Jern, A., Smith, J., Yang, X., Moschitti, A. (2008)
BART: A Modular Toolkit for Coreference Resolution. LREC 2008. [pdf]
Versley, Y., Ponzetto, S.P., Poesio, M., Eidelman, V., Jern, A., Smith, J., Yang, X., Moschitti, A. (2008)
BART: A Modular Toolkit for Coreference Resolution. ACL 2008 System demo. [pdf]
Versley, Y. (2007)
Antecedent Selection Techniques for High-Recall Coreference Resolution. EMNLP-CoNLL 2007. [pdf]
Versley, Y. (2007)
Using the Web to Resolve Coreferent Bridging in German Newspaper Text. GLDV-Frühjahrstagung 2007. [pdf]
Versley, Y. and Zinsmeister, H. (2006)
From Surface Dependencies towards Deeper Semantic Representations. Fifth Workshop on Treebanks and Linguistic Theories (TLT 2006). Due to technical issues, the title in the conference proceedings was changed to "Semantic Representations". [pdf]
Versley, Y. (2006)
A Constraint-based Approach to Noun Phrase Coreference Resolution in German Newspaper Text. Konferenz zur Verarbeitung Natürlicher Sprache (KONVENS 2006). [pdf]
Versley, Y. (2006)
Disagreement Dissected: Vagueness as a Source of Ambiguity in Nominal (Co-)Reference. ESSLLI 2006 Workshop on Ambiguity in Anaphora. [pdf]
Versley, Y. (2005)
Parser Evaluation across Text Types. Fourth Workshop on Treebanks and Linguistic Theories (TLT 2005). [pdf] [pdf (slides)]
Schilder, F., Versley, Y., and Habel, Ch. (2004)
Extracting spatial information: grounding, classifying and linking spatial expressions. Workshop on Geographic Information Retrieval, 27th Annual International ACM SIGIR Conference. [pdf]
Schilder, F., Habel, Ch., and Versley, Y. (2003)
Temporal information extraction and question answering: Deriving answers for when-questions. Questions and Answers: Theoretical and Applied Perspectives (2nd CologNet-ElsNet Symposium).

My diploma thesis.

Yannick Versley (2004)
Tagging kausaler Relationen. Diploma thesis. Fachbereich Informatik, Universität Hamburg.
[pdf]
Updated version available as: Tagging kausaler Relationen: Grundlagen kausaler Ereignisrelationen und aktuelle Probleme; VDM Verlag Dr. Müller. ISBN 978-3-8364-3259-7

Blog posts

The brave new world of search engines
In an earlier post, I talked about Google's current search results in terms of personalization, and whether to like it or not. This post looks at another aspect of Google search in 2011: what it does with complex queries. (more...)

Simple Pattern extraction from Google n-grams
Google has released n-gram datasets for multiple languages, including English and German. For my needs (lots of patterns, with lemmatization), writing a small bit of C++ allows me to extract pattern instances in bulk, more quickly and comfortably than with bzgrep. (more...)

Where to buy Music
After spending a disproportionate amount of time searching for nice music that I want to buy, I decided to compile this list of internet shops that sell music in MP3 format to German citizens. (And no, I can't/won't use iTunes unless they make a Linux client.)

Useful links

WCDG parser.
The Weighted Constraint Dependency Grammar parser is one of the best parsers for German that you can get. It's available under an open source license and there is an online demo.

BitPar and SFST.
Helmut Schmid has written several tools that may come in useful in your next NLP application, including the TreeTagger, a decision-tree-based part-of-speech tagger; BitPar, a fast PCFG parsing engine; and SFST, a set of highly useful tools for finite-state morphological analysis.

Conditional Random Fields.
Hanna Wallach has a very useful link collection on Conditional Random Fields. I'd especially recommend her tutorial on CRFs (which is also the introductory part of her MSc thesis) as well as Simon Lacoste-Julien's tutorial on SVMs, graphical models, and Max-Margin Markov Networks (also linked there).

Nice blogs

Language Log
NLPers
hunch.net
Technologies du Langage
Earning my Turns
Leiter Reports