6.806/6.864 Projects

Click here to see which topic groups you and others are interested in.

Click here to see students who are also looking for teammates.

FAQ

Q: How do I select a topic for my project?

A: On this page we provide a number of possible projects to choose from. These projects vary in difficulty. Feel free to modify these tasks to fit your interests.

In addition, here you can find project abstracts from last year. You can use them as inspiration for your work.

We also encourage students to come up with their own projects. Please make sure to meet with the instructors to discuss your proposals. Even if you have a vague idea, we can help you to formulate it into a project.

Q: What is the difference between projects for 6.806 and 6.864?

A: Projects by 6.864 students are expected to produce a research-quality paper that addresses a new topic, positions it within the body of related work, and develops, implements, and evaluates the proposed approach. Some of the proposed projects describe new problems and formulations.

Projects for 6.806 students focus on implementing existing methods and exploring and analyzing their performance. A typical project will include reimplementing a method described in a research paper and experimenting with features, model variations, and different datasets. The write-up should clearly describe your explorations and present an insightful analysis.

Q: Can a group be mixed in the sense of including students registered for 6.806 and 6.864?

A: The project group can be mixed. However, 6.864 students are expected to contribute more. See the differences between 6.806 and 6.864 above. For example, a student registered in 6.806 could explore and experiment with an existing method that the new method (primarily developed by a student taking 6.864) would be compared to.

Q: What should be the size of the group?

A: The expected team size is four students. This size enables you to complete a significant piece of work while sharing the burden of implementation, annotation, reading of related work, and data analysis.

In order to grade your effort, we will need to know the exact contribution of each team member. We will assess your participation during meetings with staff.

Q: Can multiple teams work on the same project?

A: Yes, multiple teams may work on the same project. We encourage such teams to collaborate on data collection and evaluation.

Q: How can I find a team?

A: We will organize a matching event where you can meet with potential teammates. You have to fill out this form by 10/7.

Project ideas for 6.864 students (graduate level)

  1. The goal of this project is to generate new recipes by tailoring them to match pre-specified constraints (e.g., number of calories). While there are typically many possible modifications, your substitutions shouldn't destroy the quality of the food. Therefore, as part of the project, you will need to be able to automatically assess the quality of each recipe. For example, you could design and learn a scoring function over recipes. With such a function, you could introduce modifications while ensuring that the quality doesn't suffer (too much). A scaled down version of the project makes this assessment based on the ingredients alone. A more advanced version would take into account the body of the recipe (how ingredients are prepared or combined). Multiple websites provide large collections of readily available recipes. These corpora contain multiple examples of tasty recipes which can be used for training your system.

  2. The goal of this project is to modify a recipe based on user comments. Often users who tried the recipe also provide suggestions that correct or improve upon the original. Your task would be to a) automatically identify suggestions in the review body, and b) modify the original accordingly. In some cases, these modifications include inserting a new step in the recipe body, or altering instructions already there. If you are not into cooking, you can apply your model in other domains where instructions (akin to recipes) can be modified based on user feedback (e.g., do-it-yourself or computer maintenance instructions).

  3. A successful recipe specifies a full plan that transforms ingredients into a final dish. Imagine that you are trying to build a robot that executes a recipe. However, current recipes are written for humans: the instructions are not necessarily ordered chronologically, and coreference and omissions obscure which action applies to which object(s). Your goal is to transform a recipe into a fully executable, machine-readable plan with objects and actions. As before, if you are not interested in cooking, you can apply this idea to Minecraft plans, do-it-yourself instructions, etc.

  4. Dynamic Information Extraction: Most existing information extraction systems operate on a pre-determined set of documents. However, a single article might not contain all the details of a specific event. What if we could intelligently search for more articles on the same event and gather more details? The challenges in this task are in formulating the query to retrieve relevant articles and in identifying which entity mentions are important in the text. Data sources: 1) List of mass murders. 2) ACE 2005 corpus.

  5. Multilingual Unsupervised Morphology: Recent work (http://people.csail.mit.edu/karthikn/pdfs/narasimhan2015tacl.pdf) has shown the effectiveness of using chains in performing unsupervised morphological analysis. Working at the word level provides advantages such as being able to use word embeddings and word-level statistics from a raw corpus. An interesting follow-up would be to perform induction of such chains in a multilingual setting where one can have access to word-level translations between the two languages. For instance, the French words for 'walk -> walker -> walkers' are 'marche -> marcheur -> marcheurs', respectively. Knowing that 'walker' translates to 'marcheur' can help in the accurate induction of chains in both languages.

  6. One of the advantages of neural networks for parsing is that they can easily turn candidate choices or configurations, represented by dense vectors, into scores. The resulting scores can be used to guide the construction of larger pieces of parse trees as part of an overall structured prediction method. For example, we could train a neural network to score the head and child(ren) choices for each word in the sentence. The goal of this project is to formulate rich neural scoring functions and train them end-to-end with the help of approximate inference techniques such as randomized greedy.

  7. A transition based parser is limited by its myopic left-right-shift actions. There are many ways to overcome this limitation. For example, in a recent method the authors use a neural transition based parser simply as a feature generator, making use of its hidden unit activations and action probabilities as input features into a multi-way classifier. The key difference is that this classifier is trained end-to-end as a structured prediction method to get the whole sequence of transitions right. But why would we want to estimate a single complex feature generator rather than a sequence of simple ones, each estimated to complement those included earlier? The goal of this project is to explore boosting methods for parsing.

  8. Reinforcement learning is a natural way to think about transition based parsing. Each configuration represents a state in a complex “game” where rewards following actions are accumulated along the way based on arc choices but are often delayed (e.g., due to shift). We can think of the transition decisions as a policy, a parameterized distribution over actions that depends on the state (configuration). The goal is to train such a policy to maximize the cumulative reward. The goal of this project is to adapt, e.g., policy gradient methods from reinforcement learning to improve the training of transition based parsers.
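The scoring-and-search idea in project 6 can be made concrete with a toy sketch: a small MLP scores each (head, child) word-embedding pair, and randomized greedy inference hill-climbs over head assignments from random restarts. Everything here (embedding sizes, the network shape, the number of restarts) is a hypothetical illustration, not a prescribed design; tree constraints such as acyclicity are ignored for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

D, H = 8, 16                       # toy embedding / hidden sizes
n_words = 5                        # sentence length; word 0 plays ROOT
emb = rng.normal(size=(n_words, D))

# MLP parameters: score(head, child) = w . tanh(W [e_head; e_child] + b)
W = rng.normal(scale=0.1, size=(H, 2 * D))
b = np.zeros(H)
w = rng.normal(scale=0.1, size=H)

def arc_score(head, child):
    """Neural score for attaching `child` to `head`."""
    x = np.concatenate([emb[head], emb[child]])
    return w @ np.tanh(W @ x + b)

def tree_score(heads):
    """Sum of arc scores; heads[i] is the head of word i (1..n-1)."""
    return sum(arc_score(heads[i], i) for i in range(1, n_words))

def randomized_greedy(restarts=20):
    """Randomized greedy inference: random initialization, then
    hill-climb one head assignment at a time until no move helps."""
    best, best_s = None, -np.inf
    for _ in range(restarts):
        # random init, avoiding self-heads
        heads = [0] + [(i + int(rng.integers(1, n_words))) % n_words
                       for i in range(1, n_words)]
        improved = True
        while improved:
            improved = False
            for i in range(1, n_words):
                for h in range(n_words):
                    if h != i and arc_score(h, i) > arc_score(heads[i], i):
                        heads[i] = h
                        improved = True
        s = tree_score(heads)
        if s > best_s:
            best, best_s = heads, s
    return best, best_s

heads, score = randomized_greedy()
print(heads, round(float(score), 3))
```

With purely arc-factored scores the greedy search reduces to a per-word argmax, so it converges immediately; the restarts only start to matter once the scoring function includes higher-order terms (siblings, grandparents), which is where the project's richer neural scorers come in.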

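The delayed-reward setup in project 8 can be sketched minimally: an arc-standard transition system on one toy sentence, a softmax policy over the legal actions, and a REINFORCE (policy-gradient) update driven by the end-of-parse reward (number of correct arcs). The sentence, gold tree, features, and hyperparameters below are all hypothetical illustrations.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy sentence: words 1..4, ROOT is 0; gold_heads[i] = head of word i.
gold_heads = {1: 2, 2: 0, 3: 4, 4: 2}
N = 5
SHIFT, LEFT, RIGHT = 0, 1, 2

def features(stack, buffer):
    """Tiny state representation: one-hot of stack top and buffer front."""
    f = np.zeros(2 * N)
    if stack:
        f[stack[-1]] = 1.0
    if buffer:
        f[N + buffer[0]] = 1.0
    return f

def legal(stack, buffer):
    acts = []
    if buffer:
        acts.append(SHIFT)
    if len(stack) >= 2:
        acts.append(RIGHT)
        if stack[-2] != 0:            # ROOT never gets a head
            acts.append(LEFT)
    return acts

def run_episode(theta, sample=True):
    """Play one parse with the softmax policy; the reward (number of
    correct arcs) arrives only at the end -- i.e., it is delayed."""
    stack, buffer, arcs = [0], list(range(1, N)), {}
    trajectory = []
    while buffer or len(stack) > 1:
        acts = legal(stack, buffer)
        f = features(stack, buffer)
        logits = theta @ f
        p = np.exp(logits - logits.max())
        mask = np.zeros(3); mask[acts] = 1.0
        p = p * mask; p = p / p.sum()
        a = int(rng.choice(3, p=p)) if sample else acts[int(np.argmax(p[acts]))]
        trajectory.append((f, a, p))
        if a == SHIFT:
            stack.append(buffer.pop(0))
        elif a == LEFT:               # head = stack[-1], dep = stack[-2]
            arcs[stack[-2]] = stack[-1]; stack.pop(-2)
        else:                         # head = stack[-2], dep = stack[-1]
            arcs[stack[-1]] = stack[-2]; stack.pop()
    reward = sum(arcs.get(d) == h for d, h in gold_heads.items())
    return trajectory, reward

def reinforce(episodes=3000, lr=0.05, decay=0.9):
    """Policy gradient with a moving-average baseline."""
    theta, b = np.zeros((3, 2 * N)), 0.0
    for _ in range(episodes):
        traj, R = run_episode(theta)
        b = decay * b + (1 - decay) * R
        for f, a, p in traj:
            g = -np.outer(p, f)       # grad log softmax: onehot(a) - p
            g[a] += f
            theta += lr * (R - b) * g
    return theta

theta = reinforce()
_, R = run_episode(theta, sample=False)
print("correct arcs with greedy policy:", R)
```

The one-hot features are far too weak for real parsing; the point of the sketch is only the shape of the training loop, where the update for every action in an episode is scaled by the same delayed reward.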
Project ideas for 6.806 students (undergraduate level)

Each of the papers below contains novel applications of NLP techniques. They all provide data and evaluation metrics along with baseline systems. Your goal can be to explore alternative models (e.g., DNNs) to perform these tasks. Note that this is just a sample - you are welcome to choose other papers you like from recent ACL/NAACL/EMNLP conferences.

Joint Models of Disagreement and Stance in Online Debate (paper)

Abstract: Online debate forums present a valuable opportunity for the understanding and modeling of dialogue. To understand these debates, a key challenge is inferring the stances of the participants, all of which are interrelated and dependent. While collectively modeling users' stances has been shown to be effective (Walker et al., 2012c; Hasan and Ng, 2013), there are many modeling decisions whose ramifications are not well understood. To investigate these choices and their effects, we introduce a scalable unified probabilistic modeling framework for stance classification models that 1) are collective, 2) reason about disagreement, and 3) can model stance at either the author level or at the post level. We comprehensively evaluate the possible modeling choices on eight topics across two online debate corpora, finding accuracy improvements of up to 11.5 percentage points over a local classifier. Our results highlight the importance of making the correct modeling choices for online dialogues, and having a unified probabilistic modeling framework that makes this possible.

Compositional Vector Space Models for Knowledge Base Completion (paper)

Abstract: Knowledge base (KB) completion adds new facts to a KB by making inferences from existing facts, for example by inferring with high likelihood nationality(X,Y) from bornIn(X,Y). Most previous methods infer simple one-hop relational synonyms like this, or use as evidence a multi-hop relational path treated as an atomic feature, like bornIn(X,Z) -> containedIn(Z,Y). This paper presents an approach that reasons about conjunctions of multi-hop relations non-atomically, composing the implications of a path using a recurrent neural network (RNN) that takes as inputs vector embeddings of the binary relation in the path. Not only does this allow us to generalize to paths unseen at training time, but also, with a single high-capacity RNN, to predict new relation types not seen when the compositional model was trained (zero-shot learning). We assemble a new dataset of over 52M relational triples, and show that our method improves over a traditional classifier by 11%, and a method leveraging pre-trained embeddings by 7%.
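The compositional idea in the abstract - an RNN consuming the vector embeddings of the relations along a multi-hop path, scored against a target relation - can be sketched as follows. The relation vocabulary, dimensions, initialization, and dot-product scoring rule here are simplifying assumptions for illustration, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 16                                  # embedding / hidden dimension
relations = ["bornIn", "containedIn", "nationality"]
rel_emb = {r: rng.normal(scale=0.1, size=D) for r in relations}

# RNN parameters: h_t = tanh(W h_{t-1} + V r_t)
W = rng.normal(scale=0.1, size=(D, D))
V = rng.normal(scale=0.1, size=(D, D))

def compose_path(path):
    """Run the RNN over the relation embeddings of a multi-hop path,
    composing them non-atomically into a single vector."""
    h = np.zeros(D)
    for r in path:
        h = np.tanh(W @ h + V @ rel_emb[r])
    return h

def path_score(path, target):
    """Score how strongly the composed path implies the target relation."""
    return float(compose_path(path) @ rel_emb[target])

# E.g., does bornIn -> containedIn imply nationality?
s = path_score(["bornIn", "containedIn"], "nationality")
print(round(s, 4))
```

Because the path is composed step by step rather than treated as an atomic feature, the same parameters score paths unseen at training time, which is what enables the generalization (and the zero-shot variant) the abstract describes.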

Linguistic Harbingers of Betrayal: A Case Study on an Online Strategy Game (paper)

Abstract: Interpersonal relations are fickle, with close friendships often dissolving into enmity. In this work, we explore linguistic cues that presage such transitions by studying dyadic interactions in an online strategy game where players form alliances and break those alliances through betrayal. We characterize friendships that are unlikely to last and examine temporal patterns that foretell betrayal. We reveal that subtle signs of imminent betrayal are encoded in the conversational patterns of the dyad, even if the victim is not aware of the relationship's fate. In particular, we find that lasting friendships exhibit a form of balance that manifests itself through language. In contrast, sudden changes in the balance of certain conversational attributes - such as positive sentiment, politeness, or focus on future planning - signal impending betrayal.

Modeling Argument Strength in Student Essays (paper)

Abstract: While recent years have seen a surge of interest in automated essay grading, including work on grading essays with respect to particular dimensions such as prompt adherence, coherence, and technical quality, there has been relatively little work on grading the essay dimension of argument strength, which is arguably the most important aspect of argumentative essays. We introduce a new corpus of argumentative student essays annotated with argument strength scores and propose a supervised, feature-rich approach to automatically scoring the essays along this dimension. Our approach significantly outperforms a baseline that relies solely on heuristically applied sentence argument function labels by up to 16.1%.

Resources

Inspirational talks:

NLP literature:

NLP Tools:

ML Tools:

Corpora: