6.867 Machine Learning (Fall 2009)



Projects:

The projects are due on Thursday, December 3. Electronic submission is required, and we accept only PDF documents.

The choice of topic is up to you, so long as it clearly pertains to the course material. To ensure that you are on the right track, you should talk to one of us about the project at least a month before it is due. As with the problem sets, you are encouraged to collaborate on the project. We expect a four-page write-up for an individual project, which should clearly and succinctly describe the project goal, methods, and your results. Groups get two additional pages for each additional member (a two-person group will have 6 pages, a three-person group will have 8 pages, and so on). Each group should submit only one copy of the write-up and include the names of all group members. The projects will be graded on the basis of your understanding of the overall course material (not based on, e.g., how brilliantly your method works). The scope of the project should be roughly that of 1-2 problem sets.

The projects can be literature reviews, theoretical derivations or analyses, applications of machine learning methods to problems you are interested in, or something else (to be discussed with course staff).

Here are some examples:

  • Apply/develop a machine learning method to solve a specific problem
    • a machine learning approach to classifying your incoming mail
    • predict stock prices based on past price variation
    • predict how people would rate movies, books, etc.
    • cluster gene expression data, and consider how to modify existing methods to solve the problem better
  • Surveys/reviews
    • complexity of classifiers, different concepts, comparison
    • algorithmic stability, which methods have stability guarantees, and where could we apply these concepts
    • collaborative filtering, what methods are available to solve collaborative filtering problems, in which context have they been found effective
    • machine learning methods for genomic data, are they effective, what is missing
    • calibration, which methods are calibrated, how to modify a method so as to improve calibration
  • Theoretical problems
    • generalization guarantees for a specific algorithm (ask us)
    • learnability of specific concept classes (ask us)
    • convergence/consistency of a specific estimation method (ask us)