Reading List:
Choose one of the papers listed below and email me your selection.
The papers in red have been chosen by a student.
Outlier Detection
Association, Frequent Pattern, Emerging Pattern, and High Utility Pattern Mining
- Scalable Techniques for Mining Causal Structures, Craig
Silverstein, Rajeev Motwani, Sergey Brin, and Jeff D.
Ullman, Proceedings of the 24th International Conference
on Very Large Data Bases (VLDB), 1998
- Xindong Wu, Chengqi Zhang and Shichao Zhang, Efficient Mining of Both Positive and
Negative Association Rules. ACM Transactions on Information Systems, 22(3): 381-405, 2004.
(SCI).
- Guozhu Dong and Jinyan Li, Efficient Mining
of Emerging Patterns: Discovering Trends and Differences, KDD 1999: 43-52.
- Hongjian Fan and Kotagiri Ramamohanarao, Efficiently Mining Interesting Emerging
Patterns, Proceedings of WAIM, 2003.
- Jiong Yang, Wei Wang, Philip S. Yu: Infominer:
mining surprising periodic patterns. KDD 2001: 395-400
- Wan, Q. and An, A. Discovering Transitional Patterns and Their Significant Milestones in Transaction Databases, IEEE Transactions on Knowledge and Data Engineering (TKDE), Vol.21, No.12, 2009.
- M. Liu and J. Qu. Mining high utility itemsets without candidate generation,
Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM '12, pages 55-64,
New York, NY, USA, 2012.
- Vincent S. Tseng, Cheng-Wei Wu, Bai-En Shie, and Philip S. Yu, UP-Growth: An Efficient
Algorithm for High Utility Itemset Mining, Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,
pages 253-262, ACM, New York, NY, USA, 2010.
Frequent Sequence Mining
- M.J. Zaki. SPADE: An
Efficient Algorithm for Mining Frequent Sequences, Machine Learning, Vol.42, No.1/2, 2001.
- CloSpan:
Mining Closed Sequential Patterns in Large Databases, Xifeng Yan, Jiawei Han, Ramin
Afshar, Proceedings of the Third SIAM International Conference on Data Mining, San Francisco,
CA, USA, May, 2003.
Decision Tree Learning
- Q. Yang, J. Yin, C. X. Ling and R. Pan, Extracting Actionable Knowledge from
Decision Trees, IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE), 19(1): 43-56, 2007.
- Johannes Gehrke, Raghu Ramakrishnan, Venkatesh Ganti.
RainForest: A framework for fast decision tree
construction of large datasets, In VLDB'98, pp. 416-427, New York, NY, 1998.
- Learning Trees and Rules with Set-valued
Features, William W. Cohen, Proceedings of the Thirteenth National
Conference on Artificial Intelligence (AAAI-96), 1996.
- Cesar Ferri, Peter Flach and Jose Hernandez-Orallo, Learning Decision Trees Using the Area Under the ROC Curve, Proceedings of the 19th International Conference on Machine Learning, Morgan Kaufmann, July 2002,
pp.139-146.
Decision Rule Learning
- Quinlan, J. R. and Cameron-Jones, R. M. FOIL: A
Midterm Report. Proc. of ECML, Vienna, Austria, 1993, pp. 3-20.
- Frederic Stahl and Max Bramer, Scaling up classification rule induction through parallel
processing, Knowledge Engineering Review, Vol.28, No.4, December 2013. (Chosen by Hao Li)
Support Vector Machines
Bayesian Network Learning
Deep Neural Networks
Clustering
- A Neighborhood-based Clustering Algorithm,
S. Zhou, Y. Zhao, J. Guan and J. Huang, Proceedings of the 9th Pacific-Asia conference on Advances
in Knowledge Discovery and Data Mining (PAKDD'05), 2005, pp.361-371. (Chosen by Po Wu).
- Kiri Wagstaff, Claire Cardie, Seth Rogers and Stefan Schroedl, Constrained K-means Clustering with Background Knowledge, Proceedings of the Eighteenth International Conference on Machine Learning, 2001, pp. 577-584. (Chosen by Nasim Razavi)
- CACTUS-Clustering Categorical Data Using Summaries, Venkatesh Ganti,
Johannes Gehrke, Raghu Ramakrishnan, Proc. 5th ACM SIGKDD
International Conference on Knowledge Discovery and Data
Mining (KDD-99), 1999 Aug, pp. 73-83.
- ROCK:
A Robust Clustering Algorithm for Categorical Attributes,
Sudipto Guha, Rajeev Rastogi, Kyuseok Shim, Proceedings
of the 15th International Conference on Data Engineering,
23-26 March 1999, Sydney, Australia, IEEE CS Press,
1999, pp. 512-521.
- BIRCH: an efficient data clustering method for very large
databases, Tian Zhang, Raghu Ramakrishnan, Miron
Livny, Proceedings of the 1996 ACM SIGMOD international
conference on Management of data, 1996, pp. 103-114.
- CURE:
An Efficient Clustering Algorithm for Large Databases,
Sudipto Guha, Rajeev Rastogi, Kyuseok Shim, Proceedings
of the ACM SIGMOD Conference, 1998.
Data Stream Mining
- Frequent item(set) Mining
- Classification
- On Demand Classification of Data Streams, Aggarwal, Han, Wang, and Yu, KDD'04.
- Mining Time-Changing Data Streams, by Geoff Hulten,
Laurie Spencer, Pedro Domingos, in the ACM International Conference on
Knowledge Discovery and Data Mining (SIGKDD) 2001.
- F. Ferrer-Troyano, J. Aguilar-Ruiz and J. Riquelme, Incremental Rule Learning and Border
Examples Selection from Numerical Data Streams, J. of Universal Computer Science, 11(8), 2005.
- M. Maloof and R. Michalski,
Incremental learning with partial instance memory, Artificial Intelligence, Vol.154, Issue 1-2,
April 2004.
- H. Wang, W. Fan, P. Yu and J. Han.
Mining Concept-drifting Data Streams Using Ensemble Classifiers, Proceedings of ACM SIGKDD Conference, 2003.
- D. Sotoudeh and A. An. Partial Drift Detection Using a Rule Induction Framework, Proceedings of the 19th ACM International Conference on Information and Knowledge Management, Toronto, Canada, October 26-30, 2010.
- Clustering
- Concept Drift Detection
- Tamraparni Dasu, Shankar Krishnan, Suresh Venkatasubramanian, and Ke Yi,
An Information-Theoretic Approach to Detecting Changes in Multi-Dimensional Data
Streams, Proceedings of the 38th Symposium on the Interface of Statistics,
Computing Science, and Applications, pages 1-24, 2006.
- P. Vorburger and A. Bernstein. Entropy-based concept shift detection.
Proceedings of the Sixth International Conference on Data Mining, pages
1113-1118, 2006.
- Evaluation
Social Network Analysis
- Manuel Gomez-Rodriguez, Jure Leskovec and Andreas Krause,
Inferring Networks of Diffusion and Influence, ACM SIGKDD International Conference on Knowledge Discovery and
Data Mining (KDD), 2010.
- Jie Tang, Jimeng Sun, Chi Wang and Zi Yang,
Social Influence Analysis in Large-scale Networks, Proceedings of the Fifteenth ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining (SIGKDD'09), 2009.
- D. Crandall, et al.,
Feedback Effects between Similarity and Social Influence in Online Communities,
Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and
Data Mining (SIGKDD'08), 2008.
- Jaewon Yang, Julian McAuley and Jure Leskovec,
Community Detection in Networks with Node Attributes, Proceedings of ICDM, 2013.
Text Mining
Topic Detection and Tracking
- D. M. Blei, A. Y. Ng, and M. I. Jordan,
Latent Dirichlet allocation, J. Mach. Learn. Res., vol. 3, pp. 993–1022, 2003. (Chosen by Shima Khoshraftar).
- D. Blei, J. McAuliffe. . Neural Information Processing Systems 21, 2007
- X. Wang and A. McCallum,
Topics over time: a non-Markov continuous-time model of topical trends, in Proceedings of the 12th
ACM SIGKDD, 2006, pp. 424–433.
- C. Wang, D. Blei, and D. Heckerman.
Continuous time dynamic topic models. In Uncertainty in Artificial Intelligence [UAI], 2008.
- L. AlSumait, D. Barbara, and C. Domeniconi,
On-line LDA: Adaptive topic models for mining text streams with applications to topic detection
and tracking, in Proceedings of the 8th IEEE ICDM, 2008, pp.3–12.
- Zhiyuan Chen and Bing Liu, Mining Topics in Documents: Standing on
the Shoulders of Big Data, Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, 2014.
Event Detection and Tracking
- Polina Rozenshtein, et al., Event Detection in Activity Networks,
Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014.
- Maximilian Walther and Michael Kaisser,
Geo-spatial Event Detection in the Twitter Stream, ECIR, 2013. (Chosen by Eunkyung Park)
- Feng Chen and Daniel B. Neill, Non-Parametric Scan Statistics for Event Detection and
Forecasting in Heterogeneous Social Media Graphs, Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, 2014.
- Chen Luo, et al., Correlating Events with Time Series for Incident
Diagnosis, Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, 2014.
- Mikalai Tsytsarau, et al., Dynamics of News Events and Social
Media Reaction, Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, 2014.
- Tim Althoff, Xin Luna Dong, Kevin Murphy, Safa Alai, Van Dang, Wei Zhang, TimeMachine:
Timeline Generation for Knowledge-Base Entities, Sydney, NSW, Australia, 2015.
- Jakub Piskorski, Hristo Tanev, Martin Atkinson, Eric van der Goot, Vanni Zavarella,
Online News Event Extraction for Global Crisis Surveillance, Lecture Notes in Computer Science, Volume 6910, 2011.
Opinion Mining (Sentiment Analysis)
- C. Tan et al.,
User-Level Sentiment Analysis Incorporating Social Networks, KDD'11, 2011. (Chosen by Dekun Wu).
- Murthy Ganapathibhotla and Bing Liu.
Mining Opinions in Comparative Sentences. Proceedings of the 22nd International Conference on Computational
Linguistics (Coling-2008), Manchester, 18-22 August, 2008.
- Xiaowen Ding, Bing Liu and Philip S. Yu.
A Holistic Lexicon-Based Approach to Opinion Mining. Proceedings of First ACM International Conference on Web
Search and Data Mining (WSDM-2008), Feb 11-12, 2008, Stanford University, Stanford, California, USA.
- Yu, X., Liu, Y., Huang, X. and An, A., Mining Online Reviews for
Predicting Sales Performance: A Case Study in the Movie Domain,
IEEE Transactions on Knowledge and Data Engineering (TKDE), 24(4): 720-734, 2012.
Learning from Imbalanced Datasets
- PNrule: A New Framework for Learning Classifier
Models in Data Mining (A Case-Study in Network Intrusion Detection), Ramesh Agarwal and Mahesh V. Joshi,
2001.
- Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P., SMOTE:
Synthetic Minority Over-sampling Technique, Journal of Artificial
Intelligence Research, 16, 2002, 321-357.
- Siong Thye Goh and Cynthia Rudin, Box Drawings for Learning with Imbalanced Data,
Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014.
Parallel and Distributed Data Mining for Big Data with MapReduce
Recommender Systems
- Gediminas Adomavicius and Alexander Tuzhilin, Toward the Next Generation of Recommender
Systems: A Survey of the State-of-the-Art and Possible Extensions, IEEE Transactions on Knowledge and Data
Engineering, Vol.17, No.6, June 2005.
- Jinoh Oh, Wook-Shin Han, Hwanjo Yu, Xiaoqian Jiang, Fast and Robust Parallel
SGD Matrix Factorization, Sydney, NSW, Australia, 2015.
- Antonino Freno, Martin Saveski, Rodolphe Jenatton, Cédric Archambeau,
One-Pass Ranking Models for Low-Latency Product Recommendations, Sydney, NSW, Australia, 2015.
- Konstantina Christakopoulou, Filip Radlinski and Katja Hofmann,
Towards Conversational Recommender Systems, Proceedings of KDD'16, 2016.
Web Mining
- Sundaresan, Neel and Yi, Jeonghee (2000). Mining the Web
for Relations, Proceedings of the 9th International World Wide Web Conference on Computer Networks: the
International Journal of Computer and Telecommunications Networking. Amsterdam, The Netherlands, pages 699-711.
Online. Accessed January 21, 2006.
- Larry Page, Sergey Brin, R. Motwani, T. Winograd,
The PageRank
Citation Ranking: Bringing Order to the Web, Technical Report, Computer
Science Department, Stanford University, 1998. (Chosen by Khadijah Alroogi)
- J. Kleinberg, Authoritative
sources in a hyperlinked environment, In Proc. Ninth Ann. ACM-SIAM Symp. Discrete Algorithms,
pages 668-677, ACM Press, New York, 1998.
- Data mining of user navigation patterns,
J. Borges and M. Levene, In Web Usage Analysis and User
Profiling, pp. 92-111. Published by Springer-Verlag as Lecture Notes in
Computer Science, Vol. 1836, 2000.