Data Mining
EECS-4412
Winter 2020
York University


Semester: Winter 2020
Course/Sect#: EECS-4412
Time: Mon 4:00pm-5:30pm
Wed 4:00pm-5:30pm
Location: LSB 107
Instructor: Aijun An
Office: LAS 2048
Office Hours: Mon 2:45-3:45pm (office hours start on Monday, January 13)
Phone #: 416-736-2100 x44298
e-mail: aan@eecs.yorku.ca


Welcome to the Data Mining course, EECS-4412, for Winter 2020. Materials, instructions, and notices for the course will accumulate here over the semester.


Message Board

May 1, 2020
Grades are posted. You can check your grade and mark breakdown on ePost. Also, if you are interested in learning about data warehousing and OLAP, I have put my lecture notes on this topic on the Lecture Notes page.
April 16, 2020
Please be reminded that the final exam will be held today at 7:00pm. The duration of the exam will be 2 hours.
April 13, 2020
More information about the exam on April 16 is posted here. Please read it.
April 12, 2020
Please be reminded that the final exam will take place online at 7:00pm on Thursday April 16 on Moodle. Please log in at https://moodle.yorku.ca/moodle/course/view.php?id=162498 with your Passport York account, and you will find the link to "Final Exam" when the exam opens at 7:00pm on April 16. The duration of the exam will be announced as soon as it is determined. If you did not do the practice test on April 1, you can do so at the same link above by clicking on "Practice Test", just to get familiar with the format of the exam. You are also required to join Zoom a few minutes before the exam starts. The Zoom link will be sent to you by email.
April 6, 2020
The marked A3 papers have been uploaded to the Web Submit site. Please log in and choose 4412 as Course and A3 as Assignment. You will see your marked A3 paper with a file name ending in "-Marked.pdf".
March 24, 2020
A video of yesterday's office hour is posted on the Project page. In it, the TA and I answered some of your questions about the project, and the TA showed how to use WEKA to load text data.
March 18, 2020
Project is posted. Please see the link below in the Assignments and Project section.
March 14, 2020
Office hours and lectures will be conducted online via Zoom starting from Monday March 16. Please check your email for the links to them.
March 13, 2020
An FAQ page for Assignment #3 has been created. Please see here.
March 11, 2020
Midterm marks and solutions are posted. You can check your mark by using ePost. For solutions, click here to download.
March 9, 2020
Assignment 1 marks are posted. You can check yours by using ePost. A1 papers were handed back in class today. If you missed the class, you can come to my office hour next week to pick yours up.
March 5, 2020
Assignment #3 is posted. Please see the link below in the Assignments and Project section.
March 3, 2020
The midterm has been moved to Wednesday March 4 at 4:00-5:30pm in CLH K. Please note the new location!
March 2, 2020
Please be reminded that the midterm test will be held today at 4:00-5:20pm in DB 0010. Note that the location is different from our regular classroom.
March 1, 2020
Solutions to Assignment #2 are posted. Please see here.
February 28, 2020
Solutions to Sample Midterm Questions are posted. Please see here.
February 26, 2020
Solutions to Assignment #1 are posted. Please see here. Also, our TA will hold another office hour to answer your questions about the course on Thursday February 27 at 3:00-4:00pm in his office (LAS 3053, the BRAIN Lab).
February 21, 2020
Please be reminded that the midterm test will be held on Monday March 2 at the class time in DB 0010. Note that the location is different from our regular classroom. For sample test questions, click here. The username and password are the same as the ones used for accessing the lecture notes.
February 7, 2020
Assignment #2 is posted. Please see the link below in the Assignments and Project section.
February 4, 2020
The due time of Assignment #1 is extended to 11:00pm tonight.
January 20, 2020
Assignment #1 is posted. Please see the link below in the Assignments and Project section. Access to the assignment is password-protected. The username and password have been sent to your EECS account.
January 5, 2020
This web page is set up. Welcome to the course!


Description

Data mining is the process of discovering interesting and useful knowledge or patterns in large data sets. It involves techniques and methods at the intersection of AI/machine learning, statistics and database systems. This course introduces fundamental concepts and principles of data mining, and presents various data mining algorithms and their applications. Topics include association rule mining, sequential pattern mining, classification models, clustering, and text mining.


Prerequisites

  • Required: a course on data structures and an introductory course on database systems.
  • Preferred: basic concepts in probability and statistics.


Materials

  • Textbook
      Jiawei Han, Micheline Kamber and Jian Pei, Data Mining - Concepts and Techniques, Morgan Kaufmann, Third Edition, 2011.
  • Reference Books and Materials
    • Charu C. Aggarwal, Data Mining, The Textbook, Springer, 2015.
    • Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2006.
    • Ian H. Witten and Eibe Frank, Data Mining -- Practical Machine Learning Tools and Techniques (Second Edition), Morgan Kaufmann, 2005.
    • S.M. Weiss and N. Indurkhya, Predictive Data Mining, Morgan Kaufmann, 1998.
    • Margaret H. Dunham, Data Mining -- Introductory and Advanced Topics, Prentice Hall, 2003.
    • Some conference/journal papers


Grading Scheme

  • Assignments (25%)
  • Midterm (20%) (March 2 in DB 0010 at the class time)
  • Project (20%, revised to 30%)
  • Final exam (35%, revised to 25%) (Thursday April 16 at 7:00pm on Moodle)


Lectures


Assignments and Project

  • Assignment 1 (Weight: 8%) (Due Tuesday February 4 by 11:00pm)
  • Assignment 2 (Weight: 6%) (Due Tuesday February 25 by 11:00pm)
  • Assignment 3 (Weight: 11%) (Due Tuesday March 17 by 11:00pm). Instructions on how to install PRISM in Weka can be found here.
  • Project (Weight: 30%) (Due Monday April 6 by 11:00pm)


Teaching Assistant

  • Amin Omidvar (omidvar@cse.yorku.ca)

Useful On-line Information

Academic Honesty