conor-23's comments

conor-23 · on Jan 31, 2025

One of the Hydro creators here. Ballista (and the ecosystem around Arrow and Parquet) is much more focused on analytical query processing, whereas Hydro is bringing the concepts from the query processing world to the implementation of distributed systems. Our goal isn't to execute a SQL query, but rather to treat your distributed systems code (e.g. a microservice implementation) like it is a SQL query. Integration with Arrow and Parquet is definitely planned on our roadmap though!

conor-23 · on Jan 31, 2025

There is a nice talk on YouTube explaining the Hydro project (focused mostly on DFIR):

https://www.youtube.com/watch?v=YpMKUQKlak0&ab_channel=ACMSI...

conor-23 · on Jan 31, 2025

One of the creators of Hydro here. Yeah, one way to think about Hydro is bringing the dataflow/query optimization/distributed execution ideas from databases and data science to programming distributed systems. We are focused on executing latency-critical, long-running services in this way, though, rather than individual queries. The kinds of things we have implemented in Hydro include a key-value store and the Paxos protocol, but these compile down to dataflow just like a Spark or SQL query does!
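To illustrate the "service as dataflow" idea in the comment above, here is a minimal sketch in plain Python. It is not Hydro or DFIR code and does not use their APIs. The function name `kv_service` and the `(op, key, value)` message shape are assumptions made up for illustration. The point is only that a long-running key-value service can be written as an operator that consumes a stream of requests and emits a stream of responses.

```python
from typing import Any, Iterable, Iterator, Optional, Tuple

# Hypothetical message shape: ("put", key, value) or ("get", key, None).
Message = Tuple[str, str, Optional[Any]]
Response = Tuple[str, Optional[Any]]


def kv_service(messages: Iterable[Message]) -> Iterator[Response]:
    """Consume a (possibly unbounded) stream of requests, emit responses."""
    state = {}  # running state, folded from the "put" branch of the stream
    for op, key, value in messages:
        if op == "put":
            # Branch 1: fold each put into the state.
            state[key] = value
        elif op == "get":
            # Branch 2: look up each get against the state seen so far.
            yield (key, state.get(key))


if __name__ == "__main__":
    requests = [("put", "x", 1), ("get", "x", None), ("get", "y", None)]
    for response in kv_service(requests):
        print(response)
```

Because `kv_service` is a generator, it processes requests one at a time as they arrive, which is closer to a service than to a one-shot query over a fixed table.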

conor-23 · on Jan 31, 2025

There is a nice article by David Patterson (who used to direct the lab and won the Turing Award) on why Berkeley changes the name and scope of the lab every five years https://www2.eecs.berkeley.edu/Pubs/TechRpts/2013/EECS-2013-... . Unfortunately, there's no good name for the lab across each of the five-year boundaries so people just say "rise lab" or "amp lab" etc.

irq-1 · on Jan 31, 2025

Interesting.

> Good Commandment 3. Thou shalt limit the duration of a center. ...

> To hit home runs, it’s wise to have many at bats. ...

> It’s hard to predict information technology trends much longer than five years. ...

> US Graduate student lifetimes are about five years. ...

> You need a decade after a center finishes to judge if it was a home run. Just 8 of the 12 centers in Table I are old enough, and only 3 of them—RISC, RAID, and the Network of Workstations center—could be considered home runs. If slugging .375 is good, then I’m glad that I had many 5-year centers rather than fewer long ones.

(Network of Workstations > Google)

conor-23 · on July 4, 2023

This dude's blog is fire. Very nice explanations of complex database topics.

gavinray · on July 4, 2023

Justin Jaffray is a gem

conor-23 · on Feb 15, 2023

A researchy perspective: Datalog was invented to extend relational algebra with recursion. Since it started out as an academic tool, people have spent decades studying recursion-specific optimizations, so it is extremely well suited to recursive use cases, e.g. iterative graph algorithms. Using Datalog for network algorithms won the thesis award in databases almost 20 years ago https://boonloo.cis.upenn.edu/papers/boon_interview.pdf .
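As a concrete instance of a recursion-specific optimization, here is a minimal sketch of semi-naive evaluation for the classic two-rule Datalog reachability program, written in plain Python. The function name `transitive_closure` and the tuple-based edge representation are assumptions for illustration; this is not taken from any particular Datalog engine.

```python
def transitive_closure(edges):
    """Semi-naive evaluation of the Datalog program:

        path(x, y) :- edge(x, y).
        path(x, z) :- path(x, y), edge(y, z).
    """
    edges = set(edges)

    # Index edges by source node so the join on y is a dictionary lookup.
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    path = set(edges)   # all facts derived so far
    delta = set(edges)  # facts that were new in the previous round

    while delta:
        # Join only the new facts against edge, not the whole path relation.
        derived = {(x, z) for (x, y) in delta for z in successors.get(y, ())}
        delta = derived - path  # keep only facts not seen before
        path |= delta

    return path


if __name__ == "__main__":
    print(sorted(transitive_closure({("a", "b"), ("b", "c"), ("c", "d")})))
```

Naive evaluation would re-join the entire `path` relation on every round. Semi-naive evaluation joins only `delta`, so each derivation step is done once, and the loop ends when a round produces no new facts, which also makes it terminate on cyclic graphs.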

tejtm · on Feb 15, 2023

This is the answer I subscribe to. CTEs and recursive CTEs are SQL's answer to a limitation of plain relational algebra: no loops.

CTEs are a great and most welcome addition to SQL but they are a bolt-on patch as compared with Datalog where it is a core feature.
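For comparison, here is a minimal sketch of the same kind of reachability query written as a recursive CTE, run through Python's built-in `sqlite3` module. The table name `edge`, the columns `src` and `dst`, and the helper `reachable_pairs` are assumptions chosen for illustration.

```python
import sqlite3


def reachable_pairs(edges):
    """Return all (src, dst) pairs connected by a path, via a recursive CTE."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)

    rows = conn.execute(
        """
        WITH RECURSIVE path(src, dst) AS (
            SELECT src, dst FROM edge            -- base case
            UNION                                -- UNION (not UNION ALL) drops
                                                 -- duplicates, so cycles terminate
            SELECT path.src, edge.dst            -- recursive case
            FROM path JOIN edge ON path.dst = edge.src
        )
        SELECT src, dst FROM path ORDER BY src, dst
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(reachable_pairs([("a", "b"), ("b", "c")]))
```

The base case and recursive case correspond to the two rules of the Datalog version of this query, but here the recursion has to be wrapped in the `WITH RECURSIVE` construct rather than falling out of the rules themselves.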

felixyz · on Feb 16, 2023

> Datalog was invented to extend relational algebra with recursion.

I'm not sure that is exactly right. Do you have a reference? (Not trying to put you on the spot, I'm just curious to learn the history!)

refset · on Feb 16, 2023

Agreed, my understanding is that Datalog has a distinct (though related) lineage that directly emerged from Prolog (i.e. logic programming, not relational algebra / database theory) - skimming the introduction of "Horn Clauses and the Fixpoint Query Hierarchy (1982)" seems to confirm this: https://dl.acm.org/doi/pdf/10.1145/588111.588137

Edit: this presentation describes things differently but it doesn't sound quite right to me "Chandra and Harel - 1982 Studied the expressive power of logic programs without function symbols on relational databases" https://www.dbai.tuwien.ac.at/datalog2.0/slides/Kolaitis.pdf

conor-23 · on Feb 20, 2023

Yeah I'm not up on the Prolog history side of things. My info is based on the Wikipedia article for Fixed Point Logic: "Least fixed-point logic was first studied systematically by Yiannis N. Moschovakis in 1974,[1] and it was introduced to computer scientists in 1979, when Alfred Aho and Jeffrey Ullman suggested fixed-point logic as an expressive database query language.[2]" [2] = Universality of Data Retrieval Languages : https://dl.acm.org/doi/10.1145/567752.567763

felixyz · on Feb 17, 2023

Thanks!

YeGoblynQueenne · on Feb 17, 2023

Yeah, if the OP can give a reference I'd be very interested, too. I've searched for the "original" reference to datalog because I wanted to cite it, but I couldn't find anything like that.

I have a sneaking suspicion that "function-free Prolog" is as old as ordinary Prolog, and "datalog", as an idea separate to Prolog and used as a database language, was born in the database community, but like the OP I have no reference to this.

conor-23 · on Jan 24, 2023

Frank is a legend. Should put him in the Guinness Book.

conor-23 · on Jan 13, 2023

Author here. Happy to answer any questions. - Conor