What's wrong with using a repeated float to represent a vector/matrix, perhaps with a repeated int for indices if you are using a sparse representation?
Yes, I could take some time and write it myself. Take a look at hdf5[1] and the features they have around meta information, data chunking, page level compression for efficient reading of sub areas of data spaces. It is a non-trivial amount of functionality but allows some pretty amazing performance, for example see what pytables [2] has built on top of it. Now if they had just made it easier to attach it to a wire as well as a file.