Hacker News | conor-23's comments

One of the Hydro creators here. Ballista (and the ecosystem around Arrow and Parquet) is much more focused on analytical query processing, whereas Hydro is bringing the concepts from the query processing world to the implementation of distributed systems. Our goal isn't to execute a SQL query, but rather to treat your distributed systems code (e.g., a microservice implementation) as if it were a SQL query. Integration with Arrow and Parquet is definitely planned in our roadmap though!


There is a nice talk on YouTube explaining the Hydro project (focused mostly on DFIR):

https://www.youtube.com/watch?v=YpMKUQKlak0&ab_channel=ACMSI...


One of the creators of Hydro here. Yeah, one way to think about Hydro is bringing the dataflow/query optimization/distributed execution ideas from databases and data science to programming distributed systems. We are focused on executing latency-critical, long-running services in this way, though, rather than individual queries. The kinds of things we have implemented in Hydro include a key-value store and the Paxos protocol, but these compile down to dataflow just like a Spark or SQL query does!


There is a nice article by David Patterson (who used to direct the lab and won the Turing Award) on why Berkeley changes the name and scope of the lab every five years https://www2.eecs.berkeley.edu/Pubs/TechRpts/2013/EECS-2013-... . Unfortunately, there's no good name for the lab across each of the five-year boundaries so people just say "rise lab" or "amp lab" etc.


Interesting.

> Good Commandment 3. Thou shalt limit the duration of a center. ...

> To hit home runs, it’s wise to have many at bats. ...

> It’s hard to predict information technology trends much longer than five years. ...

> US Graduate student lifetimes are about five years. ...

> You need a decade after a center finishes to judge if it was a home run. Just 8 of the 12 centers in Table I are old enough, and only 3 of them—RISC, RAID, and the Network of Workstations center—could be considered home runs. If slugging .375 is good, then I’m glad that I had many 5-year centers rather than fewer long ones.

(Network of Workstations > Google)


This dude's blog is fire. Very nice explanations of complex database topics.


Justin Jaffray is a gem


A researchy perspective: Datalog was invented to extend relational algebra with recursion. Since it started out as an academic tool, people have spent decades studying recursion-specific optimizations, so it is extremely well suited to recursive use cases, e.g., iterative graph algorithms. Using Datalog for network algorithms won the thesis award in databases almost 20 years ago https://boonloo.cis.upenn.edu/papers/boon_interview.pdf .
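
To make the recursion point concrete, here is a minimal sketch (not taken from any particular Datalog engine) of how a two-rule reachability program can be evaluated to a fixpoint in Python. It uses semi-naive evaluation, one of the classic recursion-specific optimizations: each round joins only the newly derived facts against the edges instead of rejoining everything. The function name `transitive_closure` and the tuple encoding of facts are assumptions made for illustration.

```python
def transitive_closure(edges):
    """Evaluate the Datalog program
         path(x, y) :- edge(x, y).
         path(x, z) :- path(x, y), edge(y, z).
    to a fixpoint using semi-naive evaluation."""
    edges = set(edges)
    path = set(edges)    # all path facts derived so far
    delta = set(edges)   # facts that were new in the previous round
    while delta:
        # Join only the newest facts with edge; older facts were already joined.
        derived = {(x, z) for (x, y) in delta for (y2, z) in edges if y == y2}
        delta = derived - path   # keep only facts not seen before
        path |= delta
    return path


if __name__ == "__main__":
    print(sorted(transitive_closure({("a", "b"), ("b", "c"), ("c", "d")})))
```

The loop stops once a round derives nothing new, which is why it also terminates on cyclic graphs.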


This is the answer I subscribe to. CTEs and recursive CTEs are SQL's answer to a limitation of plain relational algebra: no loops.

CTEs are a great and most welcome addition to SQL but they are a bolt-on patch as compared with Datalog where it is a core feature.
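
For comparison, a small sketch of the same kind of reachability query written as a recursive CTE, run through Python's built-in `sqlite3` module. The `edge` table, its `src`/`dst` columns, and the helper name `reachable_from` are hypothetical, chosen only for this example. Note the use of `UNION` rather than `UNION ALL`, so duplicate rows are dropped and the recursion stops even when the graph has a cycle.

```python
import sqlite3


def reachable_from(edges, start):
    """Return the sorted list of nodes reachable from `start` via a recursive CTE."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT dst FROM edge WHERE src = ?          -- base case
            UNION                                       -- dedupe, so cycles terminate
            SELECT edge.dst FROM edge
            JOIN reach ON edge.src = reach.node         -- recursive step
        )
        SELECT node FROM reach ORDER BY node
        """,
        (start,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(reachable_from([("a", "b"), ("b", "c"), ("c", "a")], "a"))
```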


> Datalog was invented to extend relational algebra with recursion.

I'm not sure that is exactly right. Do you have a reference? (Not trying to put you on the spot, I'm just curious to learn the history!)


Agreed, my understanding is that Datalog has a distinct (though related) lineage that directly emerged from Prolog (i.e. logic programming, not relational algebra / database theory) - skimming the introduction of "Horn Clauses and the Fixpoint Query Hierarchy (1982)" seems to confirm this: https://dl.acm.org/doi/pdf/10.1145/588111.588137

Edit: this presentation describes things differently but it doesn't sound quite right to me "Chandra and Harel - 1982 Studied the expressive power of logic programs without function symbols on relational databases" https://www.dbai.tuwien.ac.at/datalog2.0/slides/Kolaitis.pdf


Yeah I'm not up on the Prolog history side of things. My info is based on the Wikipedia article for Fixed Point Logic: "Least fixed-point logic was first studied systematically by Yiannis N. Moschovakis in 1974,[1] and it was introduced to computer scientists in 1979, when Alfred Aho and Jeffrey Ullman suggested fixed-point logic as an expressive database query language.[2]" [2] = Universality of Data Retrieval Languages : https://dl.acm.org/doi/10.1145/567752.567763


Thanks!


Yeah, if the OP can give a reference I'd be very interested, too. I've searched for the "original" reference to datalog because I wanted to cite it, but I couldn't find anything like that.

I have a sneaking suspicion that "function-free Prolog" is as old as ordinary Prolog, and "datalog", as an idea separate to Prolog and used as a database language, was born in the database community, but like the OP I have no reference to this.


Frank is a legend. Should put him in the Guinness Book.


Author here. Happy to answer any questions. - Conor

