Data Mining
EECS-4412
Winter 2019
York University


Semester: Winter 2019
Course/Sect#: EECS-4412
Time: Mon 4:00pm-5:30pm
Wed 4:00pm-5:30pm
Location: MC 112
Instructor: Aijun An
Office: LAS 2048
Office Hours: Mon 2:45-3:45pm
Phone #: 416-736-2100 x44298
e-mail: aan@cse.yorku.ca


Welcome to the Data Mining course, EECS-4412, for Winter 2019. Materials, instructions, and notices for the course will accumulate here over the semester.


Message Board

April 30, 2019
Grades are posted. You can check your grade and mark breakdown on ePost.
April 7, 2019
Please be reminded that the final exam will take place at 7:00pm-9:30pm on Monday April 8. The location is ACW 106. Click here for some sample exam questions.
April 2, 2019
Assignment 3 marks are posted. You can check yours by using ePost.
March 18, 2019
Project is posted. Please see the link below in the Assignments and Project section.
March 8, 2019
An FAQ page for Assignment #3 has been created. Please see here.
March 8, 2019
Midterm solutions are posted. Click here to download.
March 5, 2019
For Q3 of Assignment #3, you will need to use PRISM, which does not come installed when you install Weka. Instructions on how to install PRISM in Weka can be found here.
March 4, 2019
Assignment #3 is posted. Please see the link below in the Assignments and Project section.
February 25, 2019
Solutions to Assignment #2 are posted. Please see here for solutions to Q1-Q3. Here are two sample programs for Q4 from two students in the class: myzscore.java and myzscore.py.
February 24, 2019
Solutions to Sample Midterm Questions are posted. Please see here.
February 19, 2019
Please be reminded that the midterm test will be held on Wednesday February 27 at the class time in CLH B. Note that the location is different from our regular classroom. For sample test questions, click here. The username and password are the same as the ones used for accessing the lecture notes.
February 13, 2019
Solutions to Assignment #1 are posted. Please see here. You can see your marked assignment by logging into the Web Submit System at https://webapp.eecs.yorku.ca/submit. Select 4412 for Course and A1 for Assignment. You should be able to see your marked assignment in a file named graded.pdf.
February 11, 2019
Today's office hour is moved to Wednesday February 13th at 3:00-3:50pm.
February 8, 2019
Assignment #2 is posted. Please see the link below in the Assignments and Project section.
February 6, 2019
Today's class is cancelled due to a weather emergency. Stay safe and warm.
January 19, 2019
Assignment #1 is posted. Please see the link below in the Assignments and Project section. Access to the assignment is password-protected. The username and password have been sent to your EECS account.
January 7, 2019
This web page is set up. Welcome to the course!


Description

Data mining is the process of discovering interesting and useful knowledge or patterns in large data sets. It involves techniques and methods at the intersection of AI/machine learning, statistics and database systems. This course introduces fundamental concepts and principles of data mining, and presents various data mining algorithms and their applications. Topics include association rule mining, sequential pattern mining, classification models, clustering, and text mining.


Prerequisites

  • Required: a course on data structures and an introductory course on database systems.
  • Preferred: basic concepts in probability and statistics.


Materials

  • Textbook
      Jiawei Han, Micheline Kamber and Jian Pei, Data Mining - Concepts and Techniques, Morgan Kaufmann, Third Edition, 2011.
  • Reference Books and Materials
    • Charu C. Aggarwal, Data Mining: The Textbook, Springer, 2015.
    • Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2006.
    • Ian H. Witten and Eibe Frank, Data Mining -- Practical Machine Learning Tools and Techniques (Second Edition), Morgan Kaufmann, 2005.
    • S.M. Weiss and N. Indurkhya, Predictive Data Mining, Morgan Kaufmann, 1998.
    • Margaret H. Dunham, Data Mining -- Introductory and Advanced Topics, Prentice Hall, 2003.
    • Some conference/journal papers


Grading Scheme

  • Assignments (25%)
  • Midterm (20%) (Wednesday February 27 at the class time in CLH B)
  • Project (20%)
  • Final exam (35%) (Monday April 8 at 7:00pm in ACW 106)


Lectures


Assignments and Project

  • Assignment 1 (Weight: 8%) (Due Friday February 1 by 10:00pm)
  • Assignment 2 (Weight: 6%) (Due Friday February 22 by 11:59pm). The input.txt file for Question 4 can be downloaded here.
  • Assignment 3 (Weight: 11%) (Due Sunday March 17 by 11:59pm). Instructions on how to install PRISM in Weka can be found here.
  • Project (Weight: 20%) (Due Wednesday April 3 by 11:59pm)


Teaching Assistant

  • Amin Omidvar (omidvar@cse.yorku.ca)

Useful On-line Information

Academic Honesty