- title:
- Querying Data: Foundations and Practice
Requêtes sur les données : fondements et pratique - manager:
- Leonid Libkin
- ects:
- 3
- period:
- 2
- hours:
- 24
- weeks:
- 8
- hours-per-week:
- 3
- language:
- English by default
- lang:
- track:
- B
- themes:
- Databases Logic/Proof
- order:
- 2.26.1
- link:
- view
- [2.26.1]
- Querying Data: Foundations and Practice
Requêtes sur les données : fondements et pratique
- Language:
- Period:
- 2.
- Duration:
- 24h (3h/week).
- ECTS:
- 3.
- Themes: Databases Logic/Proof
- Manager:
- Leonid Libkin.
Querying Data: Foundations and Practice (24H, 3 ECTS)
Coordinator : Leonid Libkin
(Under construction.)
Context
Database theory has undergone several important transformations which are rarely reflected in courses, while having already found applications in database practice. In the 1980s and 90s the backbone of database theory was finite model theory, which has now been developed to the point of a collection of off-the-shelf tools that can be applied to analyze complexity and expressiveness of query languages. Latter times brought in semi-structured data and with it techniques based on word and tree automata. The last 10-15 years have brought us three new directions:
1. At a purely theoretical level, a uniform treatment of several different areas such as relational joins and conjunctive queries and incomplete data via a rich theory of graph homomorphisms.
2. At a more practical level, the theory of worst-case optimal join algorithms, that made it from theory papers to mainstream systems in less than a decade, and led to rethinking of some principles on which SQL databases are built.
3. At even more practical level, the emergence of graph databases as a key new industry, both native ones and represented via relational, all requiring a different approach to language design.
Outline
Below is a preliminary breakdown into lectures, with 8 lectures of 3 hours each. The FIRST DAY OF CLASSES IS 20 DECEMBER. We are likely to skip the class on 31 January as well as the instructor will be away at a conference.
- Lecture 1. Foundations of relational databases: First-order logic, Relational algebra, Codd's theorem, Static analysis (Trakhtenbrot’s theorem).
- Lecture 2. Joins and conjunctive queries; graph homomorphisms; evaluation, static analysis, minimization and graph cores.
- Lecture 3. Fast conjunctive query evaluation: AGM bound and worst-case optimal joins.
- Lecture 4. Expressiveness and complexity of query languages: locality, zero-one law, basics of descriptive complexity.
- Lecture 5. SQL programming, as seen through foundational lens.
- Lecture 6. Incomplete information: theory and real life.
- Lecture 7. Graph querying: theory (RPQ and similar), practice (Cypher/GQL and SQL/PGQ).
- Lecture 8. Relational graph querying (Rel).
Documents
The main source is Database Theory: Querying Data by Marcelo Arenas, Pablo Barceló, Leonid Libkin, Wim Martens, and Andreas Pieris.
The book is quasi-finished and freely available at https://github.com/pdm-book/community.
This will be supplemented by
- Elements of Finite Model Theory by L. Libkin (freely available to students in class)
- Graphs and Homomorphisms by P. Hell and J. Nešetril
- SQL/Cypher/Rel freely available documentation and guides
Evaluation
Research-oriented take-home exams. Details to provided during the first lecture.
Prerequisites
Technically speaking, just good general CS background, although this being a Master-level database class covering new topics, it would be helpful to have some basic understanding of the classics, e.g. first-order logic, relational algebra, some knowledge of SQL.