I'm a fifth-year PhD student in Computer Science at the University of California, Berkeley. My advisor is Prof. Joe Hellerstein. My research focuses on distributed systems, logic programming, and data management for large data sets. Recent work includes the Berkeley Orders of Magnitude (BOOM) Project, the Bloom programming language for distributed computing, and MapReduce Online.
In the past, I've contributed to the development of the PostgreSQL database system. I was also an early employee at Truviso, a stream processing company. I received my undergraduate degree in Computer Science from Queen's University in 2007.
Conference and Workshop Papers
- Logic and Lattices for Distributed Programming. With W. R. Marczak, P. Alvaro, J. M. Hellerstein, and D. Maier. SoCC, 2012.
- Confluence Analysis for Distributed Programs: A Model-Theoretic Approach. With W. R. Marczak, P. Alvaro, J. M. Hellerstein, and D. Maier. Datalog 2.0, 2012.
- BloomUnit: Declarative Testing for Distributed Programs. With P. Alvaro, A. Hutchinson, W. R. Marczak, and J. M. Hellerstein. DBTest, 2012.
- Consistency Analysis in Bloom: a CALM and Collected Approach. With P. Alvaro, J. M. Hellerstein, and W. R. Marczak. CIDR, 2011.
- Dedalus: Datalog in Time and Space. With P. Alvaro, W. R. Marczak, J. M. Hellerstein, D. Maier, and R. Sears. Datalog 2.0, 2011.
- BOOM Analytics: Exploring Data-Centric, Declarative Programming for the Cloud. With P. Alvaro, T. Condie, K. Elmeleegy, J. M. Hellerstein, and R. Sears. EuroSys, 2010.
- MapReduce Online. With T. Condie, P. Alvaro, J. M. Hellerstein, K. Elmeleegy, and R. Sears. NSDI, 2010.
- Usher: Improving Data Quality with Dynamic Forms. With K. Chen, H. Chen, J. M. Hellerstein, and T. S. Parikh. ICDE, 2010 (Best Student Paper).
Journal Papers
- I Do Declare: Consensus in a Logic Language. With P. Alvaro, T. Condie, J. M. Hellerstein, and R. Sears. SIGOPS Operating Systems Review, Volume 43, Issue 4, January 2010.
- Usher: Improving Data Quality with Dynamic Forms. With K. Chen, H. Chen, J. M. Hellerstein, and T. S. Parikh. Transactions on Knowledge and Data Engineering, Volume 23, August 2011.
Demos, Posters, and Perspectives
- Online Aggregation and Continuous Query support in MapReduce. With T. Condie, P. Alvaro, J. M. Hellerstein, J. Gerth, J. Talbot, K. Elmeleegy, and R. Sears. SIGMOD, 2010 (Demo Track).
- Improving Data Quality with Dynamic Forms. With K. Chen, H. Chen, H. Dolan, J. M. Hellerstein, and T. S. Parikh. ICTD, 2009 (Demo Track).
- Continuous Analytics: Rethinking Query Processing in a Network-Effect World. With M. J. Franklin, S. Krishnamurthy, A. Li, A. Russakovsky, and N. Thombre. CIDR, 2009 (Perspectives Track).
Technical Reports and Theses
- Confluence Analysis for Distributed Programs: A Model-Theoretic Approach. With W. R. Marczak, P. Alvaro, J. M. Hellerstein, and D. Maier. UCB Technical Report EECS-2012-171.
- Transactions and Data Stream Processing. B. Sc. Thesis (May 2007), Queen's University.
Talks
- May 2013: Bloom: Big Systems, Small Programs (RICON East)
- March 2013: Disorderly Distributed Programming with Dedalus and Bloom (UCLA)
- February 2013: Disorderly Distributed Programming with Bloom (Université de Nantes, INRIA / LIP6)
- June 2012: Logic and Lattices for Distributed Programming (BashoChat #004) (video)
- February 2011: Consistency Analysis in Bloom (Berkeley OSQ Seminar)
- May 2010: Dedalus: Datalog in Time and Space (Berkeley OSQ Retreat)
- April 2010: BOOM Analytics: Exploring Data-Centric, Declarative Programming for the Cloud (EuroSys'10)
- March 2010: Cloud Programming: From Doom and Gloom to BOOM and Bloom (Datalog 2.0 Workshop)
- December 2009: MapReduce Online
- November 2009: Cloud Programming: From Doom and Gloom to BOOM and Bloom
- April 2009: BOOM: Data-Centric Programming For The Data Center (Stanford InfoLunch)
- October 2007: Query Execution Techniques in PostgreSQL
- June 2006, May 2007: Introduction to Hacking PostgreSQL (tutorial)
- May 2007: Introduction to Data Stream Query Processing
- September 2006: TelegraphCQ: A Data Stream Management System
- May 2005: Inside the PostgreSQL Query Optimizer
- Various notes on DBMS internals
  - Written to prepare for an exam in a DBMS internals class I was taking at the time (2003). Covers basic techniques for query evaluation, query optimization, concurrency control, and recovery.
- Introduction to Kolmogorov Complexity
  - A 45-minute talk that introduces the basic properties of Kolmogorov complexity and highlights some interesting applications. Note that I'm by no means an expert on algorithmic information theory, so take this for what it's worth.