Subinterpreters existed from the very early days of the C API and were key to the implementation of mod_python (which I wrote). So if you used mod_python, you used subinterpreters without realizing it.
DISTINCT generally requires the results to be sorted, which can have O(n^2) worst-case performance depending on the sort, so it can impose a big performance hit on a query. It is best to structure your database so that queries only return distinct data, e.g. by disallowing duplicates.
If your sorting algorithm degrades anywhere near O(n^2) in pathological cases, you're doing something wrong, even if the fix is just a timeout/operations limit that detects the pathological case and falls back to an in-place mergesort. Tail latency and containing pathological data matter a lot if there's any interactivity.
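To make the fallback idea concrete, here's a minimal introsort-style sketch in Python (the function names and depth bound are mine, just for illustration): quicksort until the recursion gets suspiciously deep, then switch to heapsort, which is guaranteed O(n log n).

```python
import heapq

def introsort(a, depth_limit=None):
    # Quicksort, but track recursion depth; past ~2*log2(n) levels we
    # assume a pathological input and fall back to heapsort.
    if depth_limit is None:
        depth_limit = 2 * max(1, len(a)).bit_length()
    if len(a) <= 1:
        return list(a)
    if depth_limit == 0:
        # Heapsort fallback: O(n log n) regardless of input order.
        heap = list(a)
        heapq.heapify(heap)
        return [heapq.heappop(heap) for _ in range(len(heap))]
    pivot = a[len(a) // 2]
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return (introsort(less, depth_limit - 1)
            + equal
            + introsort(greater, depth_limit - 1))
```

Real standard libraries (C++'s std::sort, .NET's Array.Sort) use exactly this pattern, so the worst case stays O(n log n) even on adversarial input.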
Nearly every time, it's a symptom of bad data normalization.
But every time, it interferes badly with any kind of locking (that's DBMS dependent, of course), and imposes a high performance penalty (on every DBMS).
In order to determine the distinct items, the items need to be deduplicated. Generally that's done in one of two ways: a hash table that skips items already seen, or a sort followed by a scan that skips over duplicates. The hash table is O(1) per item, but the sort is easier to parallelize without sharing mutable state and has more established algorithms for spilling to disk.
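Both approaches can be sketched in a few lines of Python (function names are mine, for illustration only):

```python
def distinct_hash(items):
    # Hash-based dedup: O(1) expected per item, preserves first-seen order,
    # but the set is shared mutable state that's awkward to parallelize.
    seen = set()
    out = []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out

def distinct_sort(items):
    # Sort-then-scan dedup: O(n log n), loses the original order, but
    # sorting parallelizes well and spills to disk with standard
    # external-merge techniques.
    out = []
    for x in sorted(items):
        if not out or x != out[-1]:
            out.append(x)
    return out
```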
Bingo. I used to work with a guy who would see duplicate results and just throw a DISTINCT on his query. I had to keep on him to fix his queries, or explain why DISTINCT actually was correct in a given case. My default is that DISTINCT is almost always not the solution.
In your example you must also have another table, 'sample' with all the samples. So yes, you would use an exists or in subquery with the table you suggested.
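As a concrete sketch of that pattern (the schema here is hypothetical; table and column names are mine), using sqlite3 so it's self-contained: instead of SELECT DISTINCT over a join, ask the question directly with EXISTS.

```python
import sqlite3

# Hypothetical schema: 'sample' holds each sample once, 'measurement'
# holds many rows per sample.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE sample (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE measurement (sample_id INTEGER, value REAL);
    INSERT INTO sample VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO measurement VALUES (1, 0.5), (1, 0.7), (3, 0.9);
""")

# "Which samples have at least one measurement?" -- no DISTINCT needed,
# because the outer query only ever produces each sample row once.
rows = con.execute("""
    SELECT s.name FROM sample s
    WHERE EXISTS (SELECT 1 FROM measurement m WHERE m.sample_id = s.id)
    ORDER BY s.name
""").fetchall()
print([r[0] for r in rows])  # → ['a', 'c']
```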
GitHub was cool when git was new years back, but these days, and especially given that git is inherently decentralized, it is not very clear to me why we all cling to GitHub. With a little work, all that it offers can be done without any help from a centralized server/corporation.
Pretty much, yes. It's kind of spelled out in the Nakamoto paper. From the introduction:
" In this paper, we propose a solution to the double-spending problem using a peer-to-peer distributed timestamp server to generate computational proof of the chronological order of transactions."
Everything else in Bitcoin is just turning that timestamp server into a practical(ish) system.
Yes, that's why non-PoW distributed ledgers like the XRPL work essentially the same way: they sort transactions only by time, then use a federated Byzantine agreement to "filter out" transactions that did not propagate through the network within a specific time and thus can't be put in the correct order. Those transactions are added to the next ledger (block) instead, which isn't a problem if block times are just seconds.
Well, it relies on a synchronized clock, so it can't provide a clock. PoW adjusts the difficulty based on the time to try to meet some difficulty target.
In fact, if the nodes' clocks are not synchronized, it can cause significant problems and vulnerabilities. If clocks run too fast, the difficulty adjustment algorithm will think too few blocks were mined in the window and decrease the difficulty.
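The mechanism being discussed is easy to sketch. Bitcoin retargets every 2016 blocks toward a two-week window and clamps the adjustment to 4x in either direction; this toy function (names are mine) shows why fast timestamps pull difficulty down:

```python
def retarget(old_difficulty, actual_timespan, target_timespan=14 * 24 * 3600):
    # Bitcoin-style retarget: if the last window of blocks appeared to take
    # longer than the two-week target (e.g. because timestamps ran fast),
    # difficulty drops proportionally. The ratio is clamped to 4x either
    # way, so a single window can't swing difficulty arbitrarily far.
    actual = min(max(actual_timespan, target_timespan // 4),
                 target_timespan * 4)
    return old_difficulty * target_timespan / actual
```

So a window whose timestamps span twice the target time halves the difficulty, and the clamp caps the "velocity" mentioned downthread.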
Those aren’t really “significant problems and vulnerabilities”: any given node can lie about what time it is, but you’re not trusting a particular node for more than the outcome of a single contiguous block—and block difficulty “velocity” is capped—so you’d need a Sybil attack to actually walk the difficulty down. Otherwise, even at 49% malicious nodes, consensus is just going to bounce between nodes that say the time was really short, and nodes that give “regular” timestamps, keeping the difficulty roughly constant within the network’s margin of error.
Really, the timestamp field in most PoW systems’ “block” structs (Bitcoin’s, Ethereum’s, etc.) is just defined as “a number higher than recent ancestors’ timestamps, and not so high that, interpreted as a POSIX timestamp, it would land too far in the future relative to the local node’s clock.” (The exact bounds vary: Bitcoin requires the timestamp to exceed the median of the previous 11 blocks’ timestamps and to be no more than two hours ahead of the node’s network-adjusted time.) So you just need >50% of the nodes to have clocks synced within that window in order to agree on which blocks are valid for consideration; and even if you don’t have that level of sync, those blocks will still become valid eventually, once they’re old enough that all the nodes do consider them to be in the past. (And most PoW systems keep around near-“future” blocks until they’re valid, for just such a case.)
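That validity window can be sketched as a single predicate (constants per Bitcoin Core's consensus rules; the function name is mine):

```python
def timestamp_valid(ts, median_time_past, local_time, max_future=2 * 3600):
    # Bitcoin's two timestamp checks:
    #  1. ts must exceed the median of the previous 11 blocks' timestamps
    #     (so timestamps are roughly monotonic without trusting any one node);
    #  2. ts must be at most two hours ahead of this node's clock.
    # A block failing only check 2 isn't permanently invalid -- it becomes
    # acceptable once local clocks catch up to it.
    return ts > median_time_past and ts <= local_time + max_future
```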
The timing aspect is an important part of PoW, but it's not the entire purpose. The Bitcoin whitepaper itself goes on to say, "The proof-of-work also solves the problem of determining representation in majority decision making." That is, PoW also solves the problem of deciding which consensus rules to enforce, not just when.
I bet a bunch of PRs built up over the weekend without being deployed, and some guy who came in at 9 decided to deploy them and broke things. Always scary to be the one to deploy if no one has deployed in a while.
I once submitted a blog post [1] and later received an email from someone at HN saying it was a great article but hadn't done well, and that if I re-submitted it they would make sure it did better. So I did, and it went to the top.
They do it frequently. I've had it happen several times. The specific text isn't particularly exciting. They just give you a link to resubmit if you're interested. You also get an extra upvote upon submission (and I imagine there is more of a bump behind the scenes).
What does HN feel about this? Is it curation from the staff or is it selective manipulation? On first blush, I'm for it... but I'd be interested to see what others think.
At first thought I'd be against it, because they essentially bump what they think is good, and that does not (necessarily) reflect the community. But sometimes things do slip through on HN, so giving a nudge to resubmit sounds like a good idea.
Tbh, I don't really mind either way, I've enjoyed most content on here.
We bump what we think the community might like and is aligned with the site guidelines. And only to the lower half of the front page, whence it soon falls away if we guessed wrong. The posts aren't necessarily what we ourselves like or think is good; mostly we don't have time to decide on that.
It makes me feel worse. But I'd like to hear why you say that HN somewhat is an echo chamber, and what you think would make it less of one, or where you might look for examples of non-echo-chambers on the internet. You're welcome to reply here, or email hn@ycombinator.com.
It's possible that you've observed something that's not on our radar, and such information is important for us to be aware of, even if we can't solve it by moderation.
I guess a good community needs constant upkeep, like anything else in life. The community was created over what a few people thought was good content, and if left alone will probably get dissolved in some "large reddit" effect.
I've had that happen a couple of times. The email is fairly bland - it looked like the sort of copy you'd write for an automated system. It includes a link to this comment by way of explanation:
The way it works is described in this HN post: https://news.ycombinator.com/item?id=11662380 . I've had about a dozen of them, I think probably because I'm in a time zone that gets buried more than others?
On edit: maybe it's not exactly the same thing, because I don't have to resubmit when I get these; they are just put in the second chance pool.
The same happened to me just a couple of days ago. I thought it was a very nice thing for the quality of the content that goes to the top (not just because it was my submission, of course).
And Alibaba did it a couple years ago with Pedis! https://github.com/fastio/pedis
(looks like the project is still active, but I don't really know of anyone else who uses it.)
http://modpython.org/live/current/doc-html/pythonapi.html#mu...
EDIT: And it looks like I had subinterpreters in the first released version in May 2000, so the initial git (formerly SVN) commit already had them https://github.com/grisha/mod_python/blob/9b211b7e8a65f1af4b...
EDIT2: Just noticed this comment: