How Swift could potentially be faster than Objective-C

mpweiher · on July 4, 2014

Very potentially.

The things that can be "faster" are typically not the bottlenecks in Objective-C programs, as the fine article rightly mentions in passing.

On the other hand, a lot of the really slow parts of recent Objective-C "enhancements" are now mandatory in Swift, for example the Automatic Reference Counting (ARC) support, which is actually a big part of the 100x and more slowdowns.

Another issue is that you have copying semantics by default. The claim is that this can make things faster, but for common objects like arrays and dictionaries it first makes things a lot slower (once they fix the completely borked array semantics).

In all of these cases, you have very slow defaults that you must trust the compiler to make fast again. The claim is that the semantics make it easier for the compiler to optimize. This is true, but so far practical experience appears to be that these two don't balance; that is, the compilers don't get enough extra information for optimization to compensate for the performance lost to semantics that are expensive by default.

Except in well-chosen benchmarks.

In fact, even in FP languages that have much tighter semantics and therefore much better opportunities for the "sufficiently smart" compiler to do its magic, you tend to see people dropping down to nasty imperative code (effectively C in all but name) to get decent performance.

Swift's semantics are less tight, so I think it's going to depend even more on the programmer writing fast, imperative code, "C in all but name". Again, there are a few things that help the compiler here, but again I doubt they will generally be able to compensate.

I think the biggest issue with Swift performance is that it is focused on helping the compiler, rather than the programmer. The performance model is, at least so far, highly non-linear and non-intuitive. This will hopefully get better, but the semantics I've touched on make it difficult, they are inherently highly non-linear and inherently difficult to predict.

josephlord · on July 4, 2014

> Another issue is that you have copying semantics by default. The claim is that this can make things faster, but for common objects like arrays and dictionaries it first makes things a lot slower (once they fix the completely borked array semantics).

But when they say "copy semantics" (actually I think it was "value semantics") they mean copy on write semantics. There are tricks that they can do to ensure the copy is pretty rare. If the array passed to a function is not used later in the calling function no copy should be needed. If it is not changed in caller or callee then no copy is needed.
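A minimal C sketch of the copy-on-write idea described above. It is not how Swift implements its arrays; the `CowArray` type and the `cow_*` functions are hypothetical names made up for illustration. Copying a handle only bumps a reference count, and the buffer is cloned only when a write hits a buffer that is still shared.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared buffer: a reference count plus the elements. */
typedef struct {
  int refcount;
  size_t len;
  int data[];
} CowBuf;

/* The "value" that gets passed around is just a handle to the buffer. */
typedef struct { CowBuf *buf; } CowArray;

CowArray cow_new(size_t len) {
  CowBuf *b = calloc(1, sizeof *b + len * sizeof(int));
  assert(b != NULL);
  b->refcount = 1;
  b->len = len;
  return (CowArray){ b };
}

/* "Copying" the array only increments the reference count. */
CowArray cow_copy(CowArray a) {
  a.buf->refcount++;
  return a;
}

int cow_get(CowArray a, size_t i) { return a.buf->data[i]; }

/* A write clones the buffer only if another handle still shares it. */
void cow_set(CowArray *a, size_t i, int v) {
  if (a->buf->refcount > 1) {
    size_t bytes = sizeof *a->buf + a->buf->len * sizeof(int);
    CowBuf *fresh = malloc(bytes);
    assert(fresh != NULL);
    memcpy(fresh, a->buf, bytes);
    fresh->refcount = 1;
    a->buf->refcount--;
    a->buf = fresh;
  }
  a->buf->data[i] = v;
}

void cow_release(CowArray a) {
  if (--a.buf->refcount == 0) free(a.buf);
}
```

In this sketch a caller that copies an array and never writes to either handle pays for one increment, which is the "copy is pretty rare" case.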

Yes there may be a hit at times but it should be fairly rare.

I'm very glad that they are fixing the borked array semantics, I complained pretty bitterly about that.

mikeash · on July 5, 2014

When you say 100x slowdowns with ARC, are you talking about Swift or Objective-C? If the latter, do you have a concrete example of that happening? My experience has been that ARC is typically negligible, worst case a relatively small factor, and can be quite a bit faster due to things like autorelease elision if you hit the right situation.

As for "very slow defaults that you must trust the compiler to make fast again", that applies to just about every language that wants to be fast at all. C compilers generate awful code when the optimizer is turned off.

mpweiher · on July 5, 2014

For the 100x slowdowns, I am talking about Swift, though if you want to see a 20x slowdown in Objective-C, look here: http://blog.metaobject.com/2014/06/compiler-writers-gone-wil...

Yes, the optimizer can take care of it some times, most of the time, ..., but phew!

While C compilers also generate pretty bad code when optimizations are off, it's nothing near this staggeringly awful. Unless you manage to write a longish loop that it can figure out, which is unlikely outside of benchmarks, my experience is you get around 2x or so for straight-line code that's pretty rare these days. This is 20x to 100x and more.

The 2x roughly matches Proebsting's Law[1], which says that advances in compiler optimizer tech. double performance every 18 years. That's not a lot, and does not justify trusting the compiler this much. In short, making the semantics slow and trusting the sufficiently smart compiler [2] to fix it has never worked in the past, and I don't see any evidence why this time should be different.

[1] http://www.cs.virginia.edu/~techrep/CS-2001-12.pdf

[2] http://c2.com/cgi/wiki?SufficientlySmartCompiler

mikeash · on July 5, 2014

Are you measuring a 20x slowdown with optimizations turned off? That's the only way I can reproduce your results. That's a completely meaningless comparison if so.

mpweiher · on July 8, 2014

Well, no, it's not meaningless in the context of TFA:

http://blog.metaobject.com/2014/06/compiler-writers-gone-wil...

Wevah · on July 5, 2014

In my experience, ARC is only much slower under -O0, where the ARC optimizer isn't run.

thegeomaster · on July 4, 2014

This is basically the speed advantages of C++ applied to Swift, a more powerful language. C++ also uses vtables and compilers often inline calls to methods if the method/type can be determined at compile time.
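A rough C sketch of the vtable point, under the assumption that a vtable is modelled as a struct of function pointers. The `Shape` type and all function names are hypothetical. `area_dynamic` goes through the table on every call; `area_known_rect` is what the call can reduce to once the concrete type is known at compile time, at which point the target can also be inlined.

```c
#include <assert.h>

/* Hypothetical "class" with one virtual method. */
typedef struct Shape Shape;
typedef struct { int (*area)(const Shape *self); } ShapeVTable;
struct Shape { const ShapeVTable *vtable; int w, h; };

int rect_area(const Shape *s) { return s->w * s->h; }
int tri_area(const Shape *s) { return s->w * s->h / 2; }

const ShapeVTable RECT_VTABLE = { rect_area };
const ShapeVTable TRI_VTABLE = { tri_area };

/* Dynamic dispatch: load the vtable, load the function pointer, call indirectly. */
int area_dynamic(const Shape *s) { return s->vtable->area(s); }

/* With the type known statically, the call is direct and open to inlining. */
int area_known_rect(const Shape *s) { return rect_area(s); }
```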

The argument as to why it can be faster than plain C is very weak, though. It says that if C programmers don't use the restrict keyword, equivalent Swift code can be faster. This is right, but ignores the fact that careful and competent C programmers will put restrict in all the places where performance matters. It's very hard to be faster than plain C, because it has a long, long history, and as a result, the compilers have been steadily improving for a long time. It is also less powerful, which lets you stay closer to the actual assembly code that is to be generated.

But with increasingly better hardware, the speed gains of plain C are diminishing. Yes, you can be very fast, but is it worth the trouble of expressing yourself in such a low-level language?

Also, since you can't type-pun (alias) in Swift, you can't do some things that can be done using plain C, so it is unclear how it can be faster.

stephencanon · on July 4, 2014

> Also, since you can't type-pun (alias) in Swift, you can't do some things that can be done using plain C, so it is unclear how it can be faster.

Type-punning is never necessary in C; it can always be replaced with one or more calls to memcpy( ), which the compiler can elide (clang, at least, is quite good at this optimization). I don't see any reason why this approach wouldn't work with Swift.
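For anyone who hasn't seen the idiom, a small sketch of it in C. It assumes `float` is a 32-bit IEEE 754 value, and the function names are made up for illustration. Whether the copy is actually elided depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read the bit pattern of a float without casting between pointer types. */
uint32_t float_bits(float f) {
  uint32_t bits;
  assert(sizeof bits == sizeof f);  /* assumption: float is 32 bits wide */
  memcpy(&bits, &f, sizeof bits);   /* defined behavior, unlike *(uint32_t *)&f */
  return bits;
}

/* The reverse direction: build a float from a bit pattern. */
float bits_to_float(uint32_t bits) {
  float f;
  assert(sizeof f == sizeof bits);
  memcpy(&f, &bits, sizeof f);
  return f;
}
```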

pcwalton · on July 4, 2014

LLVM isn't as good at memcpy optimization as you might think. The MemCpyOptimizer pass isn't ordered very well in the pipeline and can't make use of the SSA infrastructure that the rest of LLVM uses.

Source: I'm fighting this right now in rustc and plan to fix it by just making the frontend emit fewer memcpys.

stephencanon · on July 4, 2014

It's been a few years since I've seen clang/llvm fail to elide a memcpy used in place of type punning (it's an idiom that I use very frequently as a library writer, and care a lot about). If you have examples of cases where llvm fails to do this optimization, please file a bug report and I will be delighted to lean on the right folks to get it fixed. You shouldn't need to work around it.

PeterGriffin · on July 4, 2014

If you read carefully, it's not just the advantages of C++ applied to Swift, because Swift avoids certain weaknesses of C which C++ can't (like said aliased pointers etc.).

Swift is much more strict in terms of typing than either C or C++. Pointers are safe, there's no null, etc. It's the strict typing which translates to the compiler knowing more about your program before it runs, and therefore it can do more transformations without changing intent.

joliv · on July 4, 2014

A very relevant article: http://c2.com/cgi/wiki?AsFastAsCee

clayallsopp · on July 4, 2014

Interestingly, Swift (as of now) can be orders of magnitude slower than C/Objective-C: http://stackoverflow.com/questions/24101718/swift-performanc...

I hope/expect the out-of-box performance will improve over time, but I guess it shouldn't be taken as fact that Swift will always be faster than its predecessor.

tedchs · on July 4, 2014

Nice reinforcement of "performance tricks" almost always equaling "get the computer to avoid doing extra unneeded work that it used to be programmed to do".

moomin · on July 4, 2014

To be honest, right now Haskell's looking like a higher performance language than Swift.

xooyoozoo · on July 4, 2014

Because of actual semantic analysis or some microbenchmark you saw somewhere running on a beta, unstable compiler?

personZ · on July 4, 2014

> and even claimed to be faster than C for certain cases

-a promise made for pretty much every language ever. Just need a couple of compiler tune ups. Not trying to be critical, that just gave me a chuckle.

> On most architectures (including x86-64, ARM, and ARM64), the first few parameters to a function are passed in registers

To add to this, the Microsoft x64 calling convention passes the first four parameters in registers: integer and pointer values go in RCX, RDX, R8, and R9, and floating point values go in XMM0 through XMM3, with the register chosen by parameter position. The System V AMD64 ABI (used by Linux and others) passes the first 6 integer and the first 8 floating point parameters in registers.

ARMv8 convention is that the first 8 integer/pointer values, and the first 8 floating point values (so up to 16 values) are passed via registers.

This isn't all win, of course -- if your code was using those registers, to hold local variables for instance, it needs to save and restore them, or simply avoid them. In most cases it would be the latter given the abundance of registers.

The whole aliasing bit was just weird and detracts from the piece. Anyone who is doing pointer-based-operations in base C is using restrict as appropriate.

pjmlp · on July 4, 2014

>and even claimed to be faster than C for certain cases -a promise made for pretty much every language ever. Just need a couple of compiler tune ups. Not trying to be critical, that just gave me a chuckle.

I remember the days when C was just yet another systems programming language deemed too slow for any serious work.

EDIT: Always nice to be downvoted by hipsters that lack coding experience from the days C was UNIX only.

chton · on July 4, 2014

While true, that doesn't make it a bad comparison. Languages like Swift are one step up on the food chain; they provide more abstraction. Comparing to a popular language one step down is a good way of showing the cost (or lack thereof). I'm sure 10 years from now we'll all be going "Language X is faster than Swift in certain cases!"

csl · on July 6, 2014

> I remember the days when C was just yet another systems programming language deemed too slow for any serious work.

I've heard that remark several times. Which languages were faster than C back then, except assembly? Fortran? Algol? BCPL?

pjmlp · on July 6, 2014

Assembly of course. All higher level languages were still 2nd place to Assembly and had similar execution times across the board.

cwzwarich · on July 4, 2014

Most C/C++ compilers don't propagate `restrict` after inlining, so you lose a lot of optimization potential even if you use `restrict`.

aktau · on July 4, 2014

I had no idea about this, I had mostly assumed that a decent compiler would do this. There is no reason this can't be implemented in the future, right?

cwzwarich · on July 4, 2014

It is difficult to implement correctly because the rules for `restrict` in the spec are quite nonintuitive; it doesn't simply mean that the two pointers don't alias each other.

The guarantee provided by owned pointers in Rust is stronger than that of `restrict` in C, so we should be able to propagate the must-not-alias constraints better upon inlining.

kibwen · on July 4, 2014

In the interest of full disclosure, I should point out that the Rust compiler currently isn't conveying aliasing information to LLVM at all (or if it is, it's very minimal). If anyone's interested in implementing it, it would be both an interesting project and immediately catapult you into Rust godhood. :)

bluecalm · on July 4, 2014

This is very interesting! Do you have some code examples or deeper explanation of this by any chance?

cwzwarich · on July 4, 2014

Well, I'm not a C language lawyer myself, so I'm merely going off of what I have heard from people who are, but consider the following C function:

  #include <stdbool.h>

  bool f(int *restrict a, int *b, bool should_modify) {
    if (should_modify) {
      *a = *b;
    }
    return a == b;
  }

Does this function have defined behavior? It turns out that the C spec says that the aliasing restrictions imposed by `restrict` only apply if the memory object pointed to by the restricted pointer has an intervening modification in the scope of the `restrict`; otherwise, the pointers are allowed to alias. Therefore, this function does not have defined behavior for all inputs. If a and b point to valid memory objects and a != b, then the behavior of f is defined regardless of the value of should_modify. If a == b, then f only has defined behavior if should_modify is false.

Yes, this is arguably insane, but it's also the spec.

personZ · on July 4, 2014

Can you describe in more details what you mean? An inlined function with restrict parameters is of course inlined as if the variables are restricted. Indeed, a test of an inline of an inline of an inline was still optimized (or rather, not guarded) per the restrict.

cwzwarich · on July 4, 2014

Consider the simple loop

  void f(int *a, const int *b, unsigned size) {
    for (unsigned i = 0; i < size; ++i) {
      a[i] = b[i] * 2;
    }
  }

This is the LLVM IR generated by `clang -Ofast -c -o - -emit-llvm -S`:

https://gist.github.com/zwarich/21cf3a0b2387302ea48b

Notice the `vector.memcheck` block, which is doing a dynamic aliasing check to decide whether to use the vectorized code. If I add `restrict`, e.g.

  void f(int *restrict a, const int *b, unsigned size) {
    for (unsigned i = 0; i < size; ++i) {
      a[i] = b[i] * 2;
    }
  }

then that dynamic check goes away:

https://gist.github.com/zwarich/48fd374becf5f891cb5a

However, if I put the vectorized loop in a function that gets inlined, e.g.

  static void f(int *restrict a, const int *b, unsigned size) {
    for (unsigned i = 0; i < size; ++i) {
      a[i] = b[i] * 2;
    }
  }

  void g(int *a, int *b, unsigned size) {
    f(a, b, size);
  }

the dynamic aliasing check reappears:

https://gist.github.com/zwarich/3e75f17de835dd9064ee

This is because LLVM's loop vectorizer runs after inlining and the standard optimization passes (which are run in a bottom-up traversal of the SCCs of the callgraph, interleaved with inlining). The `noalias` attribute on function parameters, which is what represents `restrict` in LLVM IR, disappears upon inlining. Many other compilers do something similar, although some like IBM's XLC can preserve `restrict` upon inlining.

mikeash · on July 4, 2014

Did you look at the aliasing examples beyond the memcpy reimplementation? It can affect perfectly normal code where you'd never think to use restrict.
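A small illustration of the kind of ordinary code this applies to (a sketch with made-up function names, not an example taken from the article). In `add_to_all`, `dst` and `delta` may point into the same array, so the compiler generally has to assume that a store to `dst[i]` can change `*delta` and reload it on every iteration. Copying the value into a local first removes that possibility, and the two versions really do give different results when the pointers alias.

```c
#include <assert.h>
#include <stddef.h>

/* Adds *delta to every element; *delta is re-read each time around the loop
   because a store to dst[i] might have modified it. */
void add_to_all(int *dst, size_t n, const int *delta) {
  for (size_t i = 0; i < n; ++i) {
    dst[i] += *delta;
  }
}

/* Reads *delta once up front, so the loop body has no load that can alias dst. */
void add_to_all_hoisted(int *dst, size_t n, const int *delta) {
  const int d = *delta;
  for (size_t i = 0; i < n; ++i) {
    dst[i] += d;
  }
}
```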

corysama · on July 4, 2014

Exactly. Yes you can use restrict. And, you probably will when writing benchmarks. But, this isn't about speeding up benchmarks. This is about speeding up real world production code with real world production deadlines. Are you really going to intelligently and correctly apply restrict to every function you write where it could help in your next commercial product? Do you really expect everyone on your team to do the same and never screw up?

Then there's the teeming masses of app developers who aren't quite as awesome as you. You don't care about them, but Apple does. Apple knows most of them couldn't restrict their way into a box ;) But, they still churned out a billion apps that are used by billions of users. Doing whatever we can to trick them into speeding up their code will literally save untold lifetimes of cumulative lag for the users.

mikeash · on July 5, 2014

I just want to say how much I enjoy the phrase "trick them into speeding up their code".

DrJokepu · on July 4, 2014

> a promise made for pretty much every language ever. Just need a couple of compiler tune ups. Not trying to be critical, that just gave me a chuckle.

I was under the impression that C's qsort is often beaten by languages with support for templates / generics? I'm pretty sure they were talking about qsort.

bluecalm · on July 4, 2014

Yeah, but you can easily get C implementations of qsort which beat the standard library one on most inputs by 2x/3x. For example here: http://www.ucw.cz/libucw/. Speed of sorting on modern machines depends mainly on what is in cache, so even if you have some very clever implementation (like the 3-way split in qsort in recent Java) it won't help you if elements are not close to each other in memory. If you are sorting structs vs objects which are somewhere on the heap, you already lost the performance battle, so the compiler needs to be very smart to make up for this.

erichocean · on July 5, 2014

> so the compiler needs to be very smart to make up for this

Since compilers don't control malloc, it's pretty hard for them to do any kind of optimization that relies on the physical locations of distinct objects in memory.

Not impossible, but even a very smart compiler would find it difficult to do things that an above average programmer could easily do up front, by using a better architecture/approach to begin with.

artificialidiot · on July 4, 2014

If you compare standard library approaches between languages, then yes, the ones that let you inline calls will beat a callback through a pointer in tight loops, which is what the comparison function in qsort is. Most of the time it is inconsequential. You could have used the C preprocessor if you really wanted.
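A sketch of the preprocessor route, with hypothetical names (`DEFINE_SORT`, `INT_LESS`, `sort_ints`). The macro stamps out a typed sort whose comparison is expanded in place, so there is no call through a function pointer. It uses insertion sort only to keep the example short; it shows the inlining mechanism and is not a competitive sort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort-style comparator: invoked through a function pointer per comparison. */
int cmp_int(const void *a, const void *b) {
  int x = *(const int *)a, y = *(const int *)b;
  return (x > y) - (x < y);
}

/* Generates a typed insertion sort; LESS is expanded inline at each use. */
#define DEFINE_SORT(NAME, TYPE, LESS) \
  void NAME(TYPE *v, size_t n) { \
    for (size_t i = 1; i < n; ++i) { \
      TYPE key = v[i]; \
      size_t j = i; \
      while (j > 0 && LESS(key, v[j - 1])) { \
        v[j] = v[j - 1]; \
        --j; \
      } \
      v[j] = key; \
    } \
  }

#define INT_LESS(a, b) ((a) < (b))
DEFINE_SORT(sort_ints, int, INT_LESS)
```

Both `qsort(a, n, sizeof a[0], cmp_int)` and `sort_ints(a, n)` leave the array in ascending order; the difference is only in how the comparison is reached.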

josephlord · on July 4, 2014

> and even claimed to be faster than C for certain cases

> -a promise made for pretty much every language ever. Just need a couple of compiler tune ups. Not trying to be critical, that just gave me a chuckle.

It can be true although the "certain cases" may be quite a small set and there are likely to be many cases where C is faster than language X (where X includes Swift).

> The whole aliasing bit was just weird and detracts from the piece. Anyone who is doing pointer-based-operations in base C is using restrict as appropriate.

Is that really true? Sure in highly optimised low level code it can be true. But in general code when can the compiler know that there is no aliasing. Isn't this the reason that Fortran can be faster than C in many cases?

chton · on July 4, 2014

These aren't exactly secrets, are they? Vtables and similar optimizations have been used in Java and .Net compilers for a long time. For a good explanation in C# you can read http://netmatze.wordpress.com/2013/04/06/dynamic-dispatch-in...

mikeash · on July 4, 2014

Well no, they're certainly not actual secrets. I used the word for two reasons: first, to convey that which mechanisms are actually used in Swift aren't well known yet, and secondly and much more importantly, to make the title alliterative.

chton · on July 4, 2014

Alliteration is always a good reason :) I shouldn't have been so short with you. I've just seen one too many articles where Swift is made to be seen as Apple's godly gift to software development, without any references to similar languages. Not that yours is that of course, it just triggered that frustration a bit :)

mikeash · on July 4, 2014

I understand and share your frustration. This article is my attempt to counter the "Swift is super fast because rainbows and unicorns" attitude floating around.

chton · on July 4, 2014

A knight for the good cause, then :)

mikeash · on July 4, 2014

And for the record, I didn't see your comment as "short". You explained some good context, and I just thought I'd explain my word choice in turn, especially since I could joke about it.

chton · on July 4, 2014

Record noted. Now I find it unfortunate that the title here was changed, we went through all this text for nothing :)

mikeash · on July 5, 2014

Hey, the original title is still on my blog, still counts!

dang · on July 4, 2014

I'm sorry to spoil your alliteration, but in addition to this not being secrets, it isn't speed either (yet), as the second paragraph makes clear. So (as the HN guidelines ask) we changed the title to a phrase from the article that seems more accurate and neutral.

mikeash · on July 4, 2014

I disagree with the characterization as "baity", as I never write with any intent to attract links. However, your site, your rules, and I don't care what title you use here.

dang · on July 4, 2014

Fair enough. I'll edit that out.

mikeash · on July 4, 2014

Thanks. I'm generally in favor of facts over cleverness, I just couldn't resist here.

CountHackulus · on July 4, 2014

So it's not really the secrets of Swift's speed, it's more like not doing what made Objective-C slow.

gress · on July 4, 2014

Objective-C isn't slow. It's a strict superset of C, so anything that C can do fast, Objective-C can do just as fast. If you say Objective-C is slow, you are also saying C is slow.

Objective-C provides a message passing mechanism that trades speed for convenience. It turns out that programmers often choose convenience over speed because of time pressure.
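To make the trade-off concrete, here is a toy C sketch of dispatch by selector. It is not the Objective-C runtime and does not use its API; `ToyClass`, `send_message`, and the method table are all hypothetical. The implementation is found by name at run time, which is what buys the flexibility and what costs more than a direct call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*Method)(int receiver_value, int arg);
typedef struct { const char *selector; Method imp; } MethodEntry;
typedef struct { const MethodEntry *methods; size_t count; } ToyClass;

int counter_add(int value, int arg) { return value + arg; }
int counter_scale(int value, int arg) { return value * arg; }

const MethodEntry COUNTER_METHODS[] = {
  { "add:", counter_add },
  { "scale:", counter_scale },
};
const ToyClass COUNTER_CLASS = { COUNTER_METHODS, 2 };

/* Looks up the implementation by selector at run time, then calls it.
   Sets *found to 0 and returns 0 if the class has no such method. */
int send_message(const ToyClass *cls, int value, const char *selector,
                 int arg, int *found) {
  for (size_t i = 0; i < cls->count; ++i) {
    if (strcmp(cls->methods[i].selector, selector) == 0) {
      *found = 1;
      return cls->methods[i].imp(value, arg);
    }
  }
  *found = 0;
  return 0;
}
```

The linear string search here is deliberately naive; it only shows where the extra work comes from compared with calling `counter_add` directly.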

erichocean · on July 5, 2014

Going further, writing just C code to replicate what the Objective-C runtime actually does (dynamic dispatch) is also slower than Objective-C and its hand-tuned, written-in-assembler runtime at those same tasks.

So, Objective-C (at least on Apple's operating systems) is as fast as C AND faster than C at a few highly specific things.