Proseminar: Natural Language Processing and the Web

Summer Semester 2017


Online Registration



Instructor: Michael Wiegand

Location: U15, Building C7.1

Time: Thursdays, 14-16

Begin: April 20th, 2017

Suitable for: B.Sc.


Latest announcements

Please remember that from now on, we will meet in U15 again.


Course Description

In this course, we will examine the impact the Web has on Natural Language Processing (NLP). The Web is much larger than the text corpora traditionally used in NLP. If it is treated as a corpus, it can be used to extract phenomena that are too sparsely represented in traditional corpora. Some specific sites, such as Wikipedia, are useful knowledge bases that can also be harnessed for NLP applications.
However, since the language of the Web, particularly that of social media, differs quite dramatically from the conventional text corpora employed in NLP, software tools also have to be adapted.
Finally, the Web also gives rise to problematic phenomena, such as hate speech or fake reviews, which can be detected with the help of NLP.
Most papers presented by the students in this proseminar will have a linguistic focus. Some basic understanding of machine learning (at the level of "Mathematische Grundlagen III") would be helpful.




Date | Presenter | Topic | Paper | Reviewer
27.04.2017 | MW | Recap on machine learning and evaluation | -- | --
04.05.2017 | MW | How to present a paper | -- | --
18.05.2017 | Stefan Gruenewald | Crowdsourcing | Amazon Mechanical Turk: Gold Mine or Coal Mine? (presentation) | Valentin Kany
25.05.2017 | -- | bank holiday | -- | --
01.06.2017 | David Meier | Distant Supervision | Using Wikipedia for Automatic Word Sense Disambiguation (presentation) | Jana Jungbluth
08.06.2017 | Jana Jungbluth | Deception Detection | Finding Deceptive Opinion Spam by Any Stretch of the Imagination (presentation) | Stefan Gruenewald
15.06.2017 | -- | bank holiday | -- | --
22.06.2017 | Valentin Kany | Hate Speech | Abusive Language Detection in Online User Content (presentation) | David Meier
29.06.2017 | MW | How to write a term paper | reference document (presentation) | --
13.07.2017 | | oral exam | -- | --


Papers to be Discussed


Requirements for Attendance

  • The language of the course is German
  • Students should have passed the courses: Einführung in die Computerlinguistik, Mathematische Grundlagen III


Requirements for Passing the Course

  • oral presentation
  • term paper (Hausarbeit)
  • reviewing a paper presented by another student


Useful Links




Last update: 2017/06/29