This is an advanced course in data engineering for graduate students. The course covers a wide range of topics in data engineering research, including query evaluation and optimization; indexing; data integrity and concurrency control; distributed databases; spatial databases; databases and information retrieval; XML and semi-structured databases; security and privacy; and web services and AJAX.

Time and Location:

Class: MW 12:30pm - 1:45pm in 3154 Learned
Instructor: Bo Luo (bluo<at>eecs<dot>ku<dot>edu)
Office hours: MW 2:00pm - 3:00pm, 2044 Eaton


There is no required textbook for this course. The following books may be used as references:
Database Management Systems (3rd Edition), by Raghu Ramakrishnan and Johannes Gehrke. McGraw-Hill, 2002. ISBN: 0072465638
Database Systems: The Complete Book, by Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom. Prentice Hall. 2002. ISBN: 0130319953

Tasks and grading

1. Area paper: work in groups of 2 to 3. Select an area, write a nice literature review, and present it to the class. (survey 15%, presentation 15%)
2. Research paper: each student will be assigned one research paper (different from your survey area). Work individually, read it in detail, and present it to the class. (30%)
3. Critical review: each student will be assigned two research papers (again, different from your survey area). Read them, write critical reviews, and discuss them with the presenter. (20%)
4. Peer review: each student will be assigned one area paper (written by your peers). Read it in detail and give a professional review. (10%)
5. Class participation: join discussions! (10%)

A: 85+
B: 70-84.5
C: 60-70
D/F: 59-

Schedule (subject to change)

Each entry lists the presenter(s), followed by the reviewers where assigned.
M 08/25: Introduction
W 08/27: Introduction (cont'd): a brief history
Ref: Codd, E. F. "A relational model of data for large shared data banks." CACM 13(6), 1970.

M 09/01: No Class!
W 09/03: Introduction to XML

M 09/08: XPath, XQuery (color version)
W 09/10: RDBMS-based XML Database management systems

M 09/15: Research paper: H. Lu, J. X. Yu, G. Wang, S. Zheng, H. Jiang, G. Yu, and A. Zhou: "What makes the differences: benchmarking XML database implementations", ACM Trans. Internet Technol. 5(1), 2005. (slides, color version)
Presenter: Harish. Reviewers: Namrata, Vincent.
T 09/30: Research paper: S. Börzsönyi, D. Kossmann, and K. Stocker: "The Skyline Operator", ICDE 2001. (slides, color version)
Presenter: Madhuri. Reviewers: Tsam Kai, Phuong.

M 09/22: Area presentation - Distributed DB (slides, color version)
Presenters: Sudha, Manohar.
W 09/24: Research paper: J. B. Rothnie Jr., P. A. Bernstein, S. Fox, N. Goodman, M. Hammer, T. A. Landers, C. Reeve, D. W. Shipman, and E. Wong: "Introduction to a system for distributed databases (SDD-1)", ACM TODS 5(1), 1980. (slides)
Presenter: Yaling. Reviewers: Kannan, Aravind.

M 09/29: Research paper: M. Stonebraker, P. M. Aoki, R. Devine, W. Litwin, and M. Olson: "Mariposa: a new architecture for distributed data", ICDE 1994.
Ref: M. Stonebraker, P. M. Aoki, W. Litwin, A. Pfeffer, A. Sah, J. Sidell, C. Staelin, and A. Yu: "Mariposa: A Wide-Area Distributed Database System", VLDB J., 5(1), 1996. (slides, color version)
Presenter: Peidi. Reviewers: Ankit, Zeeshan.
W 10/01: Area presentation - Web database (slides, color version)
Presenters: Lin, Vincent, Phuong.

M 10/06: Research paper: S. Brin: "Extracting Patterns and Relations from the World Wide Web", WebDB, 1998. (slides, color version)
Presenter: Jerome. Reviewers: Harish, Namrata.
W 10/08: Research paper: Jon M. Kleinberg: "Authoritative Sources in a Hyperlinked Environment", JACM 46(5), 1999. (slides)
Presenter: Martin. Reviewers: Madhuri, Tsam Kai.

M 10/13: Research paper: T. Cheng, X. Yan, and K. C.-C. Chang: "EntityRank: Searching Entities Directly and Holistically", VLDB 2007. (slides, color version)
Presenter: Sudha. Reviewers: Yaling, Kannan.
W 10/15: Area presentation - Streaming Data (slides, color version)
Presenters: Aravind, Zeeshan, Namrata.

M 10/20: Research paper: Y. Diao, S. Rizvi, and M. J. Franklin: "Towards an Internet-Scale XML Dissemination Service", VLDB 2004. (slides, color version)
Presenter: Manohar. Reviewers: Peidi, Ankit.
W 10/22: Research paper: A. Markowetz, Y. Yang, and D. Papadias: "Keyword search on relational data streams", SIGMOD 2007.
Presenter: Lin. Reviewers: Jerome, Harish.

M 10/27: Area presentation - Security and Privacy
Presenters: Tsam Kai, Kannan.
W 10/29: Research paper: P. P. Griffiths and B. W. Wade: "An Authorization Mechanism for a Relational Database System", ACM TODS, vol. 1, no. 3, pp. 242-255, 1976.
Presenter: Ashok. Reviewers: Lin, Jerome.

M 11/03: Research paper: "The Virtual Private Database in Oracle9iR2: An Oracle Technical White Paper". http://otn.oracle.com/deploy/security/oracle9ir2/pdf/vpd9ir2twp.pdf
Presenter: Vincent. Reviewers: Martin, Madhuri.
W 11/05: Research paper: M. Murata, A. Tozawa, M. Kudo, and S. Hada: "XML access control using static analysis", ACM CCS 2003.
Presenter: Phuong. Reviewers: Sudha, Yaling.

M 11/10: Research paper: R. Agrawal, A. Evfimievski, and R. Srikant: "Information Sharing Across Private Databases", SIGMOD 2003.
Presenter: Aravind. Reviewers: Manohar, Peidi.
W 11/12: Area presentation - Data Cleaning
Presenters: Harish, Madhuri.
Research paper: M. A. Hernández and S. J. Stolfo: "The merge/purge problem for large databases", ACM SIGMOD 1995.
Presenter: Namrata. Reviewers: Vincent, Martin.

M 11/17: Area presentation - Information Retrieval
Presenters: Yaling, Peidi.
Research paper: R. Goldman, N. Shivakumar, S. Venkatasubramanian, and H. Garcia-Molina: "Proximity Search in Databases", VLDB 1998.
Presenter: Tsam Kai. Reviewers: Phuong, Shantan.
W 11/19: Area presentation - Data Mining
Presenters: Ankit, Shantan, Ashok.
M 11/24: Research paper: R. Agrawal and R. Srikant: "Fast Algorithms for Mining Association Rules", VLDB 1994.
Presenter: Zeeshan. Reviewers: Sudha, Lin.
W 11/26: No Class - Happy Thanksgiving!

M 12/01: Research paper: T. Zhang, R. Ramakrishnan, and M. Livny: "BIRCH: An Efficient Data Clustering Method for Very Large Databases", SIGMOD 1996.
Presenter: Kannan. Reviewers: Aravind, Manohar.
W 12/03: Area presentation - Database Service for Sensor Networks
Presenters: Jerome, Martin.

M 12/08: Research paper: Y. Yao and J. Gehrke: "Query Processing for Sensor Networks", CIDR 2003.
Presenter: Shantan. Reviewers: Shantan, Ashok.
W 12/10: Research paper: S. Madden, M. Franklin, J. Hellerstein, and W. Hong: "The Design of an Acquisitional Query Processor for Sensor Networks", ACM SIGMOD 2003.
Presenter: Ankit. Reviewers: Zeeshan, Ashok.