nuntius · on Oct 3, 2019

Very cool. Seems to be another tool in the same vein as halide-lang.org, now extended for sparse data.

nuntius · on Nov 17, 2018

Dependency graphs can also be annotated with the resources required for each node or edge. Resources are estimated by summation; time is estimated by finding the critical path.
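
A minimal sketch of that estimate, assuming each task carries a duration and a single resource cost. The task names, the numbers, and the `summarize_plan` helper are made up for illustration; `graphlib` needs Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def summarize_plan(tasks, deps):
    """tasks: name -> (duration, cost); deps: name -> set of prerequisite names.

    Returns (total cost, critical-path length, critical path as a list).
    """
    # Resources: a plain sum over all nodes.
    total_cost = sum(cost for _, cost in tasks.values())

    # Time: longest path through the DAG, visiting prerequisites first.
    finish = {}      # earliest finish time of each task
    best_pred = {}   # prerequisite that determines each task's start
    order = TopologicalSorter({name: deps.get(name, ()) for name in tasks})
    for name in order.static_order():
        start = 0
        best_pred[name] = None
        for pred in deps.get(name, ()):
            if finish[pred] > start:
                start = finish[pred]
                best_pred[name] = pred
        finish[name] = start + tasks[name][0]

    # Walk back from the task that finishes last to recover the path.
    node = max(finish, key=finish.get)
    path = []
    while node is not None:
        path.append(node)
        node = best_pred[node]
    path.reverse()
    return total_cost, finish[path[-1]], path


# Hypothetical plan: (duration in days, cost in person-days).
example_tasks = {"design": (3, 2), "build": (5, 4), "docs": (2, 1), "ship": (1, 1)}
example_deps = {"build": {"design"}, "docs": {"design"}, "ship": {"build", "docs"}}
print(summarize_plan(example_tasks, example_deps))
```

In this toy plan "docs" has slack, so only design, build, and ship sit on the critical path.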

Gantt charts are a partial linearization of a dependency graph. People tend to fixate on the dates shown, let schedule pressure "correct" previous estimates, etc.

According to some early users of Gantt charts, they are a great tool for summarizing a plan, a terrible tool for developing and maintaining one.

nuntius · on Nov 11, 2018

Of course H1Bs get and accept the same offers. That's exactly the issue.

Don't tell me you never knew the H1B who knowingly accepted a low salary because that was their chance to move to the US, get their foot in the door, and hopefully move up. I've known many, from both Asia and Europe, who told me exactly this.

Companies knowingly use this to keep wages lower than the US-based supply would otherwise demand. At one level, I support this -- it maintains a competitive edge.

So what's wrong? Unlike a citizen or permanent resident, H1Bs feel indentured. They are even more risk averse. If you believe in free trade, then the freedom of both parties must be protected. Both H1Bs and undocumented immigrants feel far from protected. There are both moral and economic problems with the arrangement.

nuntius · on Dec 23, 2017

If you emulate a dozen machines on one physical machine, then a single exploit can traverse them all. If you pack a dozen "single board computers" in a case and give each a single function, then entire classes of attack are ruled out.

nuntius · on May 10, 2017

Some of the visual SLAM techniques are also getting fairly good results for cheap. In particular, see DSO (same research group as the previous LSD-SLAM).

http://vision.in.tum.de/research/vslam/dso

eerikkivistik · on May 10, 2017

Nice paper, hadn't seen this one before, seems pretty recent.

nuntius · on April 28, 2017

Instantly, as in the next line of code? Yeah, probably human error (or auto-generated code).

Without being used? Not surprising. It is surprisingly common to allocate a variable, pass it to a function, and have the call never actually use the variable. Unwind, and the variable is destructed without use.

I've seen other nasty framework errors exposed by such (non-)usage patterns.

nuntius · on March 23, 2017

Very much a part of the standard, i.e. not specific to SBCL.

Think of defgeneric as the function signature and defmethod as the template specialization. Not sure why you say this is an error in CLOS. Looks fine to me.

That said, most implementations try to auto-infer the generic function metaobject when you use defmethod without defgeneric. SBCL raises a warning.

Good reads on the topic:

http://www.gigamonkeys.com/book/object-reorientation-generic...

https://mitpress.mit.edu/books/art-metaobject-protocol

http://mop.lisp.se/

juki · on March 23, 2017

The DEFGENERIC line is wrong. It can't have a body like that. It should be something like

    (defgeneric xplusone (x)
      (:method (x) (+ 1 x)))

Also, generic functions are slower than regular functions (due to dynamic dispatch), so using them for type optimization would be rather counterproductive.

rurban · on March 23, 2017

Yes, thanks. I only wanted to point out similarities to abstract classes, generic slow methods.

nuntius · on March 19, 2017

Backwards compat requires that both old and new hashes work at the same time. A simple typedef is unlikely to handle all the semantics and space needed for such a change...

It is often hard to generalize when N=1. Now that the N=1 use case is established and we are moving towards N=2, it is painfully obvious to all that a better abstraction is needed.

Typedef or no, we would still need a full audit of the code to find spots where people "inlined" the expansion.

IMO, Linus should have done better here -- no crypto hash lasts forever, but this code is far cleaner than useless layers of abstraction.

jlgaddis · on March 19, 2017

Perhaps you haven't read Linus' comments where he stated (more than a decade ago) that the usage of SHA1 here isn't for "security"?

(Hint: that's why GPG signing commits is an option.)

nuntius · on March 19, 2017

I read those comments more than a decade ago. They seemed weak but tolerable then. They seem broken now. Git is supposed to guarantee that the code I see is the code the author saw, in a distributed and decentralized environment. This is Git's entire reason for existing.

A secure design is essential for trusting this functionality. My trust in Git has always been tempered by the weakness of SHA1.

A GPG signature is no stronger than its object ref.

Have you seen how many frameworks believe "auto-pull and compile deps by hash from github" is reasonable? They are assuming this isn't a massive attack vector. They are trying to build on a core feature that Git claims to have.

Recent events moved this from probably foolish to provably so.

scrollaway · on March 19, 2017

When you GPG sign a commit, you just GPG sign its hash, you're not signing its diff alongside it.

cormacrelf · on March 19, 2017

That's what comes to mind every time someone brings up Linus' comments from way back when. If SHA-1 is insecure, then there is no way to have security. Forge an object, and GPG sign its commit, and you have broken the apparent security GPG signing was meant to bring. If SHA-1 was not meant for security, then security must have been a non-goal of Git.

The comments are brought up usually to explain why Linus didn't think much of it at the time, whereas they actually demonstrate the shift of thinking around what Git is meant to provide. Security is definitely a goal now, and the hash function is the critical piece of security infrastructure.

glandium · on March 19, 2017

GPG signatures actually sign the hash digest of the text they're given. Fun fact, which I think (hope) changed in recent versions of GPG: the hash, by default, is (was?) SHA-1.

One can check what is used with e.g.

  $ git cat-file -p $some_tag | gpg --list-packets | grep "digest algo"

The output is of the form

  digest algo n, begin of digest xx yy

Where n can be:

  1: MD5
  2: SHA1
  8: SHA256
  10: SHA512

(See RFC 4880, 9.4 for all values)
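
If you want to script that check, here is a rough sketch that maps the number on a "digest algo" line to a name. The table holds the hash algorithm IDs from RFC 4880 section 9.4; the `digest_algo_name` helper and the sample line are illustrative, and the line format is assumed to match the output quoted above.

```python
import re

# Hash algorithm IDs from RFC 4880, section 9.4.
HASH_ALGOS = {
    1: "MD5",
    2: "SHA1",
    3: "RIPEMD160",
    8: "SHA256",
    9: "SHA384",
    10: "SHA512",
    11: "SHA224",
}


def digest_algo_name(line):
    """Return the hash name for a 'digest algo n, ...' line, or None if absent."""
    match = re.search(r"digest algo (\d+)", line)
    if match is None:
        return None
    number = int(match.group(1))
    return HASH_ALGOS.get(number, f"unknown ({number})")


# Sample line in the format shown above (the digest bytes are placeholders).
print(digest_algo_name("digest algo 2, begin of digest xx yy"))
```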

scrollaway · on March 20, 2017

Interesting, I didn't know! Although it makes a lot of sense now that you bring it up.

I don't think it changes anything though, because of git's integrity. Stop me if I'm getting this wrong but, if you wanted to attack a signed git commit through the gpg signature's hash, you would have to modify the commit object itself... which yields a different commit hash in order to be valid. You'd have to get absurdly lucky to have a signature collision that contains a (valid) commit hash collision.

glandium · on March 20, 2017

The text that GPG signs on a git tag is:

  object $sha1
  type commit
  tag $name
  tagger $user $timestamp $tz

  $text

If you wanted to attack a signed git commit through the gpg signature's hash, you would have to do a second preimage attack on that text with a different commit sha1.

OTOH, if you wanted to attack a signed git commit through the git commit sha1, you would have to do a second preimage attack on that commit text, which is of the form:

  commit $length\0
  tree $sha1
  parent $parent_sha1
  author $author $author_timestamp $author_tz
  committer $committer $committer_timestamp $committer_tz

  $text

See where I'm going? It's the same kind of attack.

Another way to attack it would be to do a second pre-image attack on the pointed tree, which is harder because there is not really free-form text available in a tree object.

Yet another way to attack it would be to do a second pre-image attack on one of the blobs pointed to by a tree, where the format is of the form:

  blob $length\0$content

I don't think this is significantly easier than any of the second pre-image attacks mentioned above.

So, in fact, in any case, to attack a gpg signed git tag, you need a second pre-image attack on the hash. If git uses something better than SHA-1, but GPG still uses SHA-1, the weakest link becomes, ironically, GPG.

That being said, second pre-image attacks are pretty much impractical for most hashes at the moment, even older ones that have been deemed broken for many years (like MD5 or even MD4 (TTBOMK)).

That is, even if git were using MD4, you couldn't replace an existing commit, tree or blob with something that has the same MD4.

Edit:

In fact, here's a challenge:

Let's assume that git can use any kind of hash instead of SHA1. Let's assume I have a repository with a single commit with a single tree that contains a single source file.

The source file is:

  $ cat hackme.c
  #include <stdio.h>

  int main() {
    printf("Hack me, world!\n");
    return 0;
  }

So that we all talk about the same thing, here is the raw sha1 for this source:

  $ sha1sum hackme.c
  cffc02c09faf2e9a83ecbb976e1304759868cf1c  hackme.c

And its git SHA1:

  $ git hash-object hackme.c
  36134c8c8e9fdf705441dcc1f71736064afc7c44

Here is how you can create this SHA1 without git:

  $ (echo -e -n blob $(stat -c %s hackme.c)\\x0; cat hackme.c) | sha1sum
  36134c8c8e9fdf705441dcc1f71736064afc7c44  -

or

  $ (echo -e -n blob $(stat -c %s hackme.c)\\x0; cat hackme.c) | openssl sha1
  (stdin)= 36134c8c8e9fdf705441dcc1f71736064afc7c44

And for git variants that would be using MD5:

  $ (echo -e -n blob $(stat -c %s hackme.c)\\x0; cat hackme.c) | openssl md5
  (stdin)= 1b56dbc6613ff340b324ca973aec67f9

Or MD4:

  $ (echo -e -n blob $(stat -c %s hackme.c)\\x0; cat hackme.c) | openssl md4
  (stdin)= 0eaabfc1a32629dce98c476f591c3f60

The challenge is this: attack the hypothetical repository using the hash of your choosing[1]; replace that source with something that is valid C because people using the content of the repository will be compiling the source. Obviously, you'll need the hash to match for "blob $length\0$content" where $length is the length of $content, in bytes, and $content is your replacement C source code.

1. let's say, pick any from the list on http://valerieaurora.org/hash.html

I posit that, except for Snefru, you'll spend a lot of time and resources (and money) on the problem, exponentially more so than Google did with SHAttered.

scrollaway · on March 20, 2017

I was only talking about git commits though. For tags we agree, as the tag is only a pointer (https://twitter.com/Adys/status/835595116110823425).

But for the commit it's different, because the $text in your example affects the hash of the commit itself. And my understanding is that if you sign the commit, you're signing both the contents and the hash of the content. Am I incorrect?

glandium · on March 20, 2017

If you sign the commit, you sign the exact text I quoted, where $text is what you pass to the `-m` argument to `git tag`.

IncRnd · on March 19, 2017

Yes, Linus wrote that SHA1 isn't here for security, but that was a glaring misunderstanding of security on his part. Integrity protection of source code is a security function.

palunon · on March 20, 2017

I think it's mainly due to a different threat model. Linus only pulls from his trusted lieutenants, who are unlikely to try to attack the source in that way (it's way easier to simply hide a bad commit in the lot, no need to fiddle with SHA1). They do the same.

The rest of the code is sent through mailing lists as patches, so the hash is irrelevant.

SHA1 here protects against "random" corruption (which is more than some VCS do), but not against an attacker. At no point is one able to send trusted contributors bad commit objects.

Now, the use people have of git is very different from the kernel (or git) style, so their threat model is different, and SHA1 may become a security function.

IncRnd · on March 27, 2017

I understand your point. However, that doesn't take into account defense in depth which says that more than a single control should be in place.

tedunangst · on March 19, 2017

Well, if SHA1 isn't for security, there's no reason to switch away from it today.

nuntius · on Feb 9, 2017

Fully agree regarding your points about micro payments. Patronage, monopoly-priced copyright, and advertising have all proven to have bad side-effects.

IMO, the market may be acting rationally here. I have tried Octave several times, and always moved on with little regret. Genuine Matlab is tolerable for some prototyping tasks. A slow clone of Matlab has little value outside of academic environments where Matlab compatibility is required. Programmers use their favorite language, and others use a spreadsheet.

nuntius · on Feb 9, 2017

Matlab has always been a quirky language. It has good libraries for common math operations, good visualization tools, good documentation, and toolboxes available for many special tasks. There are a number of common pitfalls, with well-known workarounds, that cannot be fixed without breaking backwards compatibility. The performance of a "for loop" is usually dog slow (some special cases were optimized recently). You often spend hours figuring out how to vectorize your code so it runs decently fast. OOP and other techniques have been bolted on in a workable but unusual way, and they are not widely used. Licensing costs add up.

People looking for a free Matlab replacement often gravitate to a full replacement, rather than a clone. In the past, this meant learning something like C++, a significant hurdle to migration.

Python is generally seen as a much better general-purpose language than Matlab. It is building a much larger developer community. It is no harder to learn than Matlab, and it supports similar interactive development. Tools like numpy start adding easy support for Matlab's core strengths. Python is free, even in commercial deployments.

Matlab's primary competitor used to be Excel. Python is emerging as a very real threat. Outside of the core Matlab user base, preference for Python (or Go or ...) is building fast.

SandB0x · on Feb 9, 2017

> The performance of a "for loop" is usually dog slow (some special cases were optimized recently). You often spend hours figuring out how to vectorize your code so it runs decently fast.

A similar story for Python and NumPy on this point, mitigated by NumPy's elegant broadcasting rules. Agree with everything else!
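
To make the broadcasting point concrete, here is a small sketch that computes pairwise squared distances twice: once with explicit loops and once with broadcasting. The function names and the test arrays are made up for illustration, and no timing claim is implied.

```python
import numpy as np


def pairwise_sq_dists_loop(a, b):
    """Explicit double loop over the rows of a and b."""
    out = np.empty((len(a), len(b)))
    for i in range(len(a)):
        for j in range(len(b)):
            d = a[i] - b[j]
            out[i, j] = d @ d
    return out


def pairwise_sq_dists_broadcast(a, b):
    """Same result via broadcasting: (n, 1, k) - (1, m, k) gives (n, m, k)."""
    diff = a[:, None, :] - b[None, :, :]
    return (diff ** 2).sum(axis=-1)


a = np.arange(6.0).reshape(3, 2)
b = np.ones((4, 2))
print(np.allclose(pairwise_sq_dists_loop(a, b), pairwise_sq_dists_broadcast(a, b)))
```

The broadcast version trades memory for speed, since it materializes the full (n, m, k) difference array.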

BeetleB · on Feb 9, 2017

>A similar story for Python and NumPy on this point

Not as similar. When I was in grad school and encountered NumPy, I ported over my Matlab code to NumPy. It was hard to vectorize and so I had an explicit for loop (in both code bases).

NumPy ran 7x faster.

Did not really drill down to whether it was the loop itself or the operation in it, but I suspect it was the loop itself.

Matlab really has a slow for loop.

sedachv · on Feb 9, 2017

Matlab/Octave is an array programming language. You are supposed to use array and matrix operations. If you are using a for loop you are writing Matlab code wrong.

BeetleB · on Feb 9, 2017

I think you're missing the point of the thread.

Obviously, you should vectorize your code and avoid explicit loops, in both Octave and NumPy.

However, what the thread is pointing out is that if you do have explicit for loops, Octave is much slower than NumPy.

For the code I was writing, I had trouble vectorizing a particular piece of code. I'm sure if I spent many extra hours, I'd figure out some way. But why should I when it's 7x faster in NumPy without vectorizing?

And whatever way I would discover for vectorizing it in Octave would likely work for NumPy as well. It'll be a rare exception that NumPy is slower than Octave.

For me, the only reason to use Octave is if someone hands me Matlab legacy code. For my scientific computation, NumPy/SciPy has always been superior.

Ironically, Octave usage is inherently tied to Matlab's existence. Were Matlab to disappear, Octave would have little use when it comes to developing new code. I cannot think of a good reason for someone new to the field to develop in Octave, other than the need to interface with others who use Matlab.

I know this comment sounds critical, but I'm not anti-Octave. When I first started using it, I was very impressed, and I learned far more about Matlab by using Octave and reading Octave's docs than any docs from Mathworks. As a Matlab clone, it is superb.

It's just that now the alternatives to Octave are superior. Personally, I would like to see development continue for Octave, just because I appreciate it so much as an open source project. But the reality is when I give new young scientists advice, it is almost always "Use NumPy/SciPy".