At the price and size of Linode instances, it's cheaper and I feel better knowing I'm in control when I run Redmine+Gitosis. When I had my GitHub account and something went wrong, I couldn't just ssh in and fix it. All I could do was wait.
Now, for some people, I can see how waiting is an easier solution. I personally just don't like it.
(Of course, this is referring to private repos. Public repos are still king on GitHub.)
It's amazing to me that you feel more confident in your own abilities to keep a server running than GitHub... you must be a really great sysadmin! The GitHub folks are pretty hardcore!
I, for one, am a positively mediocre sysadmin. And the amount of time I have to spend waiting for GitHub to fix bugs is almost certainly far, far, FAR less than the amount of time I'd spend scrambling (but not waiting!) to fix my own.
This is also how I feel about Heroku.
For whatever reason, I prefer the helplessness I feel while waiting for them to fix stuff to the helplessness I feel when I'm in "OH SHIT MY SERVER IS DOWN AND I DIDN'T EVEN DO ANYTHING AND WHY IS THE DATABASE NOT RESPONDING IT RESPONDS WHEN I QUERY IT MANUALLY AAAUUGGHHH!!!" mode.
Although I do try to keep myself sharp by doing things by hand in other areas of my engineering activities.
I don't feel like I could keep GitHub running as well as most, but for the entire time I've had my personal server running, it only went down once, due to a hardware failure on Linode's end. For simple stuff, a simple sysadmin will suffice. GitHub has thousands of users and cares about performance; I have maybe 5 and I couldn't care less if my server used 2000% more CPU than it's supposed to. Different levels of complexity, you see.
+1 to you for that. We tried GitHub for some of our private projects, but soon switched to Linode+Redmine+git/hg. Now, we are no Rails champs; we are all Python guys, but setting up Redmine (with some plugins) was a cakewalk, and the same goes for git/hg. We never needed to hire hardcore sysadmins for this stuff.
PS: We love GitHub very much. It's made it so easy for everyone to collaborate.
1) You need to keep current on your security patches.
2) You need to upgrade your OS when it reaches end-of-life.
3) You need to make backups, and more importantly, verify that those backups will actually restore. (For git, this is most critical if you have a large team and dozens of repos.)
You can ignore a server for years, but eventually you'll get compromised or lose a hard drive. (If you use RAID, you'll eventually lose an entire RAID array. Not fun at all, nor cheap.)
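For point 3, one possible way to check that a git backup actually restores is to bundle the repo and clone from the bundle. This is only a sketch: the temp directory and the throwaway `project` repo are stand-ins I made up, and a real setup would also copy the bundle off the box.

```shell
set -eu

# Hypothetical stand-in for a real repository.
work=$(mktemp -d)
git init -q "$work/project"
cd "$work/project"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# Back up every ref into a single file.
git bundle create "$work/project.bundle" --all

# Sanity-check the bundle itself.
git bundle verify "$work/project.bundle"

# The real test: restore into a fresh clone and compare tip commits.
git clone -q "$work/project.bundle" "$work/restored"
orig_head=$(git rev-parse HEAD)
restored_head=$(git -C "$work/restored" rev-parse HEAD)
if [ "$orig_head" = "$restored_head" ]; then
    echo "restore OK"
fi
```

The clone step is the part that matters: a bundle that verifies but is never restored is still an untested backup.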
I'd say "most developers" already have a box running some kind of website where they are doing this anyway, but even if you disagree I think you have to be willing to concede "many", especially given how many people who use GitHub are currently doing Ruby on Rails or node.js work. I'd even go so far as to say that those who currently aren't /should/, as the experience of being a sysadmin is important for understanding how other sysadmins will react when they see how your software is deployed, which I guess is another topic of discussion that comes up often here (oft filed under the "Debian vs. Ruby" banner).
(Part of me is wondering if many of the more controversial discussions on this site are between people who have sysadmin experience (and considered it valuable) and people who don't.)
> Part of me is wondering if many of the more controversial discussions on this site are between people who have sysadmin experience (and considered it valuable) and people who don't.
I've done a fair bit of sysadmin work over the years, mostly in self-defense, because I want fewer crises when a critical development server eats itself.
Lots of people can figure out how to install Ubuntu, or rent a Linode. But if they don't master upgrades, patches and backups, they'll eventually end up paying a real sysadmin a lot of money at the worst possible moment.
If all you're doing on your server is git, then the first two points you make are covered by basic apt-get usage. It's super easy.
Point 3 isn't that necessary if all you're doing is git over ssh. Everyone who has a checked-out copy has a backup of your repo. Also, if you're only doing git over ssh then it's not hard to make the box relatively secure. ssh will be the only open port, and it won't be on the standard port.
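A rough illustration of the "every checkout is a backup" claim, with local bare repos standing in for the server (all paths are made up for the demo). Note the caveat: a clone can only restore the branches it actually has, and it knows nothing about server-side hooks or config.

```shell
set -eu

work=$(mktemp -d)

# Hypothetical "server" repo and one contributor's clone of it.
git init -q --bare "$work/server.git"
git clone -q "$work/server.git" "$work/alice"
cd "$work/alice"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "some work"
git push -q origin HEAD
branch=$(git symbolic-ref --short HEAD)

# Simulate losing the server, then recreate it empty.
rm -rf "$work/server.git"
git init -q --bare "$work/server.git"

# Restore from the contributor's clone: push all local branches back.
git push -q origin --all

local_tip=$(git rev-parse "$branch")
server_tip=$(git --git-dir="$work/server.git" rev-parse "$branch")
echo "local:  $local_tip"
echo "server: $server_tip"
```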
Maybe for a few months. When you run a server for 2 or 3 years the distro goes out of support, software upgrades are getting behind, before you know it you are compiling patches from source, ...
I've done this several times over the last 5 years and am now finally moving everything to specialized hosting services. Just hosting a web site, photo repo, mail and svn on a machine (VPS) is enough to make it a serious hassle. Moving all those to specialized services costs the same, is less work and more reliable.
> Maybe for a few months. When you run a server for 2 or 3 years the distro goes out of support, software upgrades are getting behind, before you know it you are compiling patches from source, ...
You can get 5 years of security updates with Ubuntu Server LTS, and you can set up safe unattended updates[1] with email notification if anything ever requires your attention. If you set your backups up right as well, you can get away even with catastrophic hardware failures.
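The footnote isn't reproduced here, so this is only a guess at what such a setup might look like on Ubuntu with the `unattended-upgrades` package installed. The sketch writes the two apt config fragments into a scratch directory; on a real box they would live in /etc/apt/apt.conf.d (as root). The `51unattended-upgrades-local` file name and the email address are my own placeholders, and mail delivery assumes a working local mailer.

```shell
set -eu

# Scratch directory by default; point CONF_DIR at /etc/apt/apt.conf.d on a real box.
conf_dir=${CONF_DIR:-$(mktemp -d)}

# Refresh package lists and run unattended-upgrade once a day.
cat > "$conf_dir/20auto-upgrades" <<'EOF'
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
EOF

# Local override (hypothetical file name): where to send notification mail.
cat > "$conf_dir/51unattended-upgrades-local" <<'EOF'
Unattended-Upgrade::Mail "admin@example.com";
EOF

echo "wrote config to $conf_dir"
```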
Remember, this is not some gigantic scale, it shouldn't be that difficult.
I'm not saying it can't be done, I'm just saying that my experience has been different. Another example: I used to have my own installation for our websites and bug tracking and time sheet management software, on a locally-hosted machine with Apache and MySQL and a bunch of other software. The number of hours I've spent migrating data between versions, tweaking mod_rewrite rules, setting up backups for various databases and other data, making usage statistics etc. - I don't even want to think about it.
Recently I moved to a shared hosting server. The yearly cost of that is covered by one hour at my hourly rate (and I'm not even expensive). I can set up 50 or so different applications with a few clicks in a web UI, and upgrades are handled by it, too. Backups are taken care of, and I have a web UI for DNS, SSL, everything. It was like a breath of fresh air.
To each his own and each situation is different, of course. But there's something to be said for division of labor.
> The number of hours I've spend migrating data between versions,
It seems (from your previous post as well) like you were trying to stay on the cutting edge functionality-wise. OTOH, I was arguing for running a stable and maintained Linux distribution.
I'd certainly agree that the initial investment is pretty big (not only does it take some time, but you need non-trivial, specific knowledge about Linux administration), and for that reason alone I'd advocate using maintained hosting. My only beef with your comment was that, in theory (and in my experience), a properly configured VPS running stable software should require nowhere near the level of maintenance you seem to be suggesting.
Sure, but I figured we were talking about small organizations. One-man shop to maybe 10 people or so, or even more, anything too small to have a dedicated sysadmin.
Sorry, but there is no such thing as "safe unattended updates" when you have anything more complex than your home desktop box. There's way too much that can and will go wrong to be that naive.
Safe unattended means no configuration files are altered and those updates are not adding new functionality in the first place anyway. It's only Package X.Y with security patch applied.
Actually your home desktop box should be much more difficult to upgrade than a simple generic server box with generic virtualized hardware drivers.
There are valid reasons to use service providers or maintained hosting, but a properly configured (and backed up) VPS running stable (& maintained) software can get you a long way.
2) When it's out of date, buy a new $20/month VPS, install git and ssh. Move your repo and turn off the old server. This might take 2 hours, every 5 years. Not a big deal.
Yes, I can confirm that it's surprisingly easy to run git with gitosis over ssh on your own server for a few dollars. It takes a couple of minutes to set up. And of course you can make it public, and you can share it with other people with keys.
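The "move your repo" step could look roughly like this. Two local bare repos stand in for the old and new servers (made-up paths); with real boxes the URLs would be ssh ones. This copies refs and history only, not hooks or gitosis config.

```shell
set -eu

work=$(mktemp -d)

# Hypothetical stand-ins for the old and new servers.
git init -q --bare "$work/old.git"
git init -q --bare "$work/new.git"

# Seed the old server with some history.
git clone -q "$work/old.git" "$work/seed"
git -C "$work/seed" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "history"
git -C "$work/seed" push -q origin HEAD

# The migration itself: mirror-clone the old repo, mirror-push to the new one.
git clone -q --mirror "$work/old.git" "$work/migrate.git"
git -C "$work/migrate.git" push -q --mirror "$work/new.git"

# Both servers should now list identical refs.
old_refs=$(git --git-dir="$work/old.git" show-ref)
new_refs=$(git --git-dir="$work/new.git" show-ref)
echo "$new_refs"
```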
What gives you that idea? I run several cronjobs that deal with their service, and it's down a lot compared to other similar services. My jobs deal with that gracefully, but if I wanted to I could easily set up a more reliable service than GitHub if my only goal was to host *.git repositories, not have any of the other nice things they provide.
And the fact that they have somewhere north of 5 terabytes of active data, and their job queue is handling many hundreds of thousands of jobs a day, and it pretty much all happens extremely fast with nary a hiccup. I know there are outages, but I'm pushing up changes to them something like 10 hours a day every day, and I've never noticed them.
They've struggled and dealt with a number of hard engineering problems, which they've blogged extensively about. In particular, their architectural (both hardware and software) changes since leaving Engine Yard demonstrate more than a simple "working knowledge" of how to build large-scale systems.
A major selling point of Git is the distributed part. You are allowed (even encouraged) to have more than one "source of truth". For ~$7/mo (including admin, and cheaper than Linode), GitHub is just another place to have a hosted version of your repo, with a nice UI and social features.
Shit happens, servers go down, that's why you also have a remote repo hosted on Linode, and X, and Y too.
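Keeping the same repo on several hosts is mostly a matter of adding remotes and pushing to each. A sketch, with local bare repos playing the part of GitHub and Linode (the remote names and paths are just for illustration):

```shell
set -eu

work=$(mktemp -d)

# Hypothetical stand-ins for the two hosting locations.
git init -q --bare "$work/github.git"
git init -q --bare "$work/linode.git"

git init -q "$work/project"
cd "$work/project"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# One remote per host.
git remote add github "$work/github.git"
git remote add linode "$work/linode.git"

# Push the current branch to every host.
for remote in github linode; do
    git push -q "$remote" HEAD
done

git remote -v
```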
This is an interesting theory, but it isn't how git's client actually works. If the repository exported a list of "mirrors" to the client that it then stored and was willing to use, that would be awesome, but otherwise you have a million people out there who are now just getting error messages when they do "git pull" and the only fix is for them to go back to your web page and try to get information on what is happening and where else they can switch their origin. Meanwhile, if you do your own hosting you can just update your DNS to point to another box and no one is the wiser.
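For reference, the manual fix each of those users would have to apply is repointing `origin` at the fallback. A sketch with local stand-in repos (the paths and the `fallback` name are invented for the demo):

```shell
set -eu

work=$(mktemp -d)

# Hypothetical primary host and fallback mirror.
git init -q --bare "$work/github.git"
git init -q --bare "$work/fallback.git"

# A user's existing clone, with origin pointing at the primary.
git clone -q "$work/github.git" "$work/clone"
cd "$work/clone"
git remote -v

# The primary is down: repoint origin at the fallback by hand.
git remote set-url origin "$work/fallback.git"
new_origin=$(git remote get-url origin)
echo "origin is now $new_origin"
```

Nothing in the client does this automatically, which is the point being made above: every user has to learn the new URL out of band first.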
The key problem, frankly, is that GitHub conflates two entirely unrelated things: a nice UI and social features, and a hosted version of your repo. I love the idea of outsourcing a nice UI and having cool social features, and /maybe/ to make those features work they need to have a mirror of my repository (I'm not convinced), but when people go to pull it the URL listed should be the actual upstream "I own the DNS on this and feel I can make this stable in the long term", not the GitHub mirror.
This is a straw man argument. The question isn't whether Git could be more intuitive/user-friendly (Hint: It should be, in fact I bet my company on it [see my profile]), or whether it is more secure/cheap to host your own repo.
If you have a million people `pull`ing from your repo, of course you should be hosting your own public access point. But, in 80% of cases, people can't be bothered to figure out how to set up Gitosis, pay for slices, mess with DNS, etc. just to host a repo.
This comment is totally unrelated and is itself a strawman. Yes, it is easier to use GitHub: I will not argue that fact. However, using GitHub will cause people to be pulling from GitHub, and GitHub may go down. This is a tradeoff, and is one people use a lot: you use a shared platform and give up control of the URL to get easier outsourced hosting. But to argue that git's decentralization solves that problem is disingenuous: it means that people could theoretically still pull your repository, but only after finding out what that fallback URL is and manually resetting their origin, which 90% of git users don't even know how to do. Meanwhile, many people are willing to spend the five minutes it takes to learn how to run their own server and want to avoid this tradeoff by hosting their own stuff on their own hostnames so they can publish stable URLs, but /can't/, because they like GitHub's social features, none of which (due to the aforementioned distributed features, humorously) actually require GitHub to be the canonical repository URL: if you want to use GitHub, you are going to have people cloning and pulling your GitHub mirror (or even worse: adding your GitHub mirror as a submodule) and when it goes down they are going to get errors, and you will have no control over it. That sucks.
And sometimes web services go down for good. Again, this is an understood tradeoff, and I'm not arguing that. What I do argue with is that "git is distributed" does not cancel out this particular tradeoff, which is the statement that was made by the person I am responding to. An actual solution used by many other services is "let me use my own hostnames with this service", which GitHub does not support for your repository, as, while their fundamental value comes from the social features and nifty git UI, they seem to be mentally stuck in a "we are the git hosting company" mindset.
You can't really fault Github for individual teams not opting to host their code in more than one spot online, even if Github doesn't offer the capability for users to use their own domain name for seamless switching of git hosts.
Does Github encourage keeping everything centered at Github? Perhaps implicitly. But they certainly don't lock anyone's data in, so blaming them for their customers opting to NOT put their code anywhere besides Github seems unfair.
You are conflating "hosting in multiple locations" with "claiming to be a canonical URL". If I choose to host my repository at git.saurik.com, but want to be able to use GitHub's repository browsing features, social timelines, etc., I may choose to /also/ host a copy at GitHub.
However, people are now going to copy/paste the GitHub repository URL and use that to clone my repository, and that URL is going to end up as a large number of peoples' origins. Even worse, that URL may end up in third party projects as a submodule (which is much more difficult to retroactively change).
Again: the problem here is not that GitHub is somehow encouraging people to keep things at GitHub "centrally": it isn't, and the goal is not to have your data in multiple places.
In fact, that's what you need to /avoid/: there should be a single URL for "this is the git repository that we consider to be the official, canonical source for our (distributed) contributions to this project".
That URL should be one that you feel comfortable you can maintain for a long time, as that URL can end up baked into a lot of things. Some of them are theoretically easy to change (the million users who are pulling from that URL, assuming they know how to do that without just re-cloning), and some of them aren't (usages of your project as submodules in other peoples' projects).
To quickly put this in another, maybe simpler manner: the problem isn't that people aren't choosing to /also/ put their code in places other than GitHub, it is that putting your code /also/ in GitHub undermines your git repository URL.
But git-over-http is just a normal http client, and http supports redirects. So while nobody does this, it's possible to load-balance http clients to "valid" servers just like you would with any other http-based app.
ssh:// and git:// are more difficult, but project contributors with commit access can just ping you on IRC to see what's up with the repo and where to push to today.
"GitHub is just another place to have a hosted version of your repo, with a nice UI and social features."
The main point of Github is the social aspect; git just makes that easier.
There were other public git hosts before, but they didn't get the traction of GitHub because they didn't offer the same magic as The Place to share code.
I'm a student, and I make a new git repo for each assignment or class. GitHub would become extremely expensive if I were to do this.
And my model for working on git projects is not distributed. We use a centralized model (which is totally viable and one of the many uses) instead of pull requests and branches. On small projects, I find this faster and easier for people to use.
Yes, I have backups, but when my centralized repo goes down, it's annoying to tell people to start pushing and pulling to a USB drive. Workflow is the issue, not losing work.