
Comes with a single PCIe 3.0 x16 link on die, good for 16 GB/s. Kaveri was AMD's first PCIe 3.0-capable chip; good to see that in server-land too.

There are a couple of people out there for whom 8 GB/s just wasn't enough: dual-port IB FDR and storage controllers are right at that threshold.
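Both figures fall straight out of the per-lane line rate and encoding overhead; a quick sketch (my arithmetic, not from the thread):

```python
# Back-of-envelope PCIe bandwidth per direction: lanes times the raw
# transfer rate (GT/s), scaled by encoding efficiency, over 8 bits/byte.
def pcie_gbytes_per_s(lanes, gt_per_s, enc_num, enc_den):
    return lanes * gt_per_s * (enc_num / enc_den) / 8

gen3_x16 = pcie_gbytes_per_s(16, 8.0, 128, 130)  # PCIe 3.0: 128b/130b
gen2_x16 = pcie_gbytes_per_s(16, 5.0, 8, 10)     # PCIe 2.0: 8b/10b
print(round(gen3_x16, 2))  # 15.75 -- the "16 GB/s" link above
print(round(gen2_x16, 2))  # 8.0   -- the old 8 GB/s ceiling
```

PCIe 2.0's 8b/10b encoding burns 20% of the wire rate, which is why a Gen2 x16 slot tops out right at that 8 GB/s threshold.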

Apparently the on-chip PCIe competes with one of the HyperTransport channels? I'm under the impression you're limited to 2P if you use the on-chip PCIe, but I'm not 100% sure on that.



> Comes with a single PCIe 3.0 x16 link on die, good for 16 GB/s.

This is the area where Intel is just killing it with their E5 chips, along with being able to write directly to the L3 from I/O. (I have no idea if AMD does this.)

The E5 is so good that it lets you do entirely different architectures from what came before it. Total game changer.


> The E5 is so good that it lets you do entirely different architectures from what came before it.

As an example: Luke Gorrie is one such person, and he's doing exactly that by talking directly to Ethernet controllers via DMA from user space. Here he is in a 30-minute talk about exploiting 512 Gbit/s of PCIe in his project, Snabb Switch. He's even written a 10 Gbit/s Intel Ethernet driver in Lua. The idea, as far as I can tell, is that you can turn a commodity Xeon server into a very low-latency, zero-copy, multi-gigabit, software-defined, layer-2 network appliance.

https://cast.switch.ch/vod/clips/26uo9i576i/

https://github.com/SnabbCo/snabbswitch/wiki
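The zero-copy part can be sketched in miniature with a Python analogy (my own illustration; Snabb itself is LuaJIT driving real NIC descriptor rings over DMA):

```python
# Illustrative only: packets live in one preallocated buffer (standing
# in for DMA-mapped memory), and "forwarding" hands around memoryview
# slices of it instead of copying the bytes anywhere.
buf = bytearray(4096)          # stand-in for a DMA-mapped packet buffer
pkt = memoryview(buf)[0:64]    # a "received packet": a view, not a copy

def rewrite_dst_mac(packet, mac):
    packet[0:6] = mac          # mutates the shared buffer in place

rewrite_dst_mac(pkt, b'\xaa\xbb\xcc\xdd\xee\xff')
print(bytes(buf[0:6]).hex())   # aabbccddeeff -- edit visible in buf itself
```

The NIC DMAs into the buffer, the app edits headers in place, and the NIC DMAs back out; the payload never transits a copy.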


Intel seems stuck at 2P, and HT still has massively lower latency, but 80 GB/s worth of PCIe lanes is huge: as big as main-memory throughput. Hence DDIO, which you reference: it lets I/O write straight into the cache and skip the historic data path through main memory. AFAIK AMD doesn't have anything equivalent. And AMD only has 16 lanes on chip; the rest come out of I/O hubs.
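The 80 GB/s is easy to sanity-check (my arithmetic, assuming 40 PCIe 3.0 lanes per E5 socket in a 2P box):

```python
# Aggregate one-direction PCIe bandwidth for a 2P E5 box,
# assuming 40 Gen3 lanes per socket (my assumption).
per_lane_gbytes = 8.0 * (128 / 130) / 8   # GB/s per Gen3 lane, one way
total = 2 * 40 * per_lane_gbytes          # two sockets, 40 lanes each
print(round(total, 1))  # 78.8 -- roughly the "80 GB/s" figure
```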

I'd love to see someone actually try to use all of that Intel PCIe I/O and report on how well utilized those pipes can get. Perhaps someone wants to send the PacketShader people a box loaded with GPUs? That'd be great, thanks!

http://shader.kaist.edu/packetshader/


Cool project! I wonder if you'd get similar perf from CPUs if you used Intel's ISPC compiler[0] with the same GPU algorithms. I've found that GPU algorithms often perform substantially better on plain old CPUs too, IMO because they use memory bandwidth more effectively.

I too would like to see how far those PCI Express buses can be pushed. :)

BTW, we're adopting Intel's DPDK[1] approach to get massive packet-processing performance on a single machine. So far we're liking it, but we'll see; it's not in production yet.

[0] http://ispc.github.io/ [1] http://www.intel.com/content/www/us/en/intelligent-systems/i...
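For a flavor of the programming model: DPDK apps busy-poll the NIC and pull packets in bursts rather than taking interrupts. A toy Python rendition of that pattern (the real API is C, e.g. rte_eth_rx_burst(); the fake queue here is mine):

```python
# Toy rendition of DPDK's poll-mode pattern: busy-poll the RX queue
# and pull packets in bursts, never blocking on interrupts. Real DPDK
# does this in C against hardware RX descriptor rings.
from collections import deque

rx_queue = deque(bytes([i % 256]) * 60 for i in range(100))  # fake NIC RX ring

def rx_burst(queue, max_pkts=32):
    # Grab up to one burst of packets; return immediately if the queue is empty.
    burst = []
    while queue and len(burst) < max_pkts:
        burst.append(queue.popleft())
    return burst

processed = 0
while True:
    burst = rx_burst(rx_queue)
    if not burst:
        break              # a real forwarding app would keep spinning
    processed += len(burst)
print(processed)  # 100
```

Amortizing per-packet overhead across a burst (and pinning the polling loop to a core) is a big part of where the single-machine throughput comes from.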


I don't follow hardware too closely, but I'm under the impression that the new processors have ridiculously complicated architectures now: integrated graphics on die, PCI bridges, write-through caches... I remember back in the day it was Processor / Northbridge / Southbridge. Is that still the case? In which direction are they heading? System-on-a-chip?



