Ask HN: How many HN members are there?
81 points by d0ne on July 27, 2011 | 65 comments


Why are there so many usernames that are prescription drugs? On the first page I see Zithromax, Zoloft, Zyprexa and Abilify.


These pages rank at the top of the site: search because they have a lot of backlinks. Why? Because someone created the profile, then went out and created links to their profile page (e.g. http://www.opensiteexplorer.org/links?site=http%3A%2F%2Fnews...). Why? Because they were probably planning to drop a link on their HN profile page to a page that sells Zithromax.

This is a common strategy on YouTube or really any social site that gives you a profile page and the ability to drop a followed link. Even if you're building junky links to your HN profile page, the overall domain authority and trust of HN is relatively high, so it's a way to launder link juice and create a quality link to whatever it is you're selling.

Eric Ward wrote about it here: http://searchengineland.com/social-link-manipulation-11429

There's a fairly good sized contingent of the affiliate marketing world that depends on this link building strategy.


Here's a graph I found explaining the "link wheel" strategy: http://lemonarian.com/images/majorwheel.jpg


The sad thing is that it's not just spammers who need to do this. Even people who are genuinely creating great content in an attempt to help others still need to create backlinks from social sites if they want to show up in Google. I've made all sorts of resources that are far better than anything else that currently exists, but without actively going out and building a few backlinks these pages would get literally zero hits.


While the core of what you say is true, I think the overall message, in the context of the comments above you, is slightly misleading. Yes, you cannot just create a great website and magically get hits, you do have to tell people it exists! This isn't news, and "SEO" while much maligned these days is still a valid, useful and in many cases necessary skill. However, it is not necessary to engage in blackhat SEO practices and it is far from true to say that you must build spammy backlinks from social networking sites in order to succeed. I don't think you mean that it is, but it came across a little like that.


Amusingly, your HN alias sounds like some sort of pharmaceutical product. ;)


Oh wow. Thank you for that explanation. It's impressive how many of these profiles were made. This means the total number of real HN accounts is probably significantly less than 28k.


I'd count "real" profiles by requiring that they either post a link, have at least one non-dead comment, or upvote a non-dead comment.
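
A minimal sketch of that rule in Python. The Profile fields below are hypothetical names chosen for illustration; HN does not expose vote data publicly, so this only shows the shape of the filter, not something that can be run against the live site.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    # All field names are hypothetical; they are not part of any HN API.
    username: str
    submissions: int = 0               # links posted
    live_comments: int = 0             # comments that are not dead
    upvotes_on_live_comments: int = 0  # upvotes cast on non-dead comments


def is_real(profile: Profile) -> bool:
    """True if the account meets at least one of the three activity criteria."""
    return (
        profile.submissions > 0
        or profile.live_comments > 0
        or profile.upvotes_on_live_comments > 0
    )


def count_real(profiles: list[Profile]) -> int:
    """Count the accounts that pass the is_real filter."""
    return sum(is_real(p) for p in profiles)
```

An account that was registered and never used fails all three checks, so it drops out of the count.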


We've seen people who post a link to HN, then have a dozen or so of their fake user accounts upvote that link. Given the economic incentive of being/lingering on the front page there will be folks who try to game it.


a lot of spammers post their website to HN, so that doesn't work either!


And all created 955 days ago.


It appears that most/all were created 1070 days ago and have 1 karma. I suspect that it was probably an attempt to rig the voting of a post.


Probably spam-bots auto-filling forms.


I'd be interested to see some analytics, such as how many users are there with 100+ karma because that's likely a better measure of the number of active users than the number of registered users (many of which are likely inactive). However, that doesn't account for lurkers.

Or the number of users that have logged in within the past week, though this is probably something pg would have to give us, because I can't think of any way to determine that without the backend data.


100+ karma won't exactly be a good parameter. I have been a regular at Hacker News for around 3 months now, mostly just to read the content and upvote the items I really like. But my karma has been at 1 forever, maybe because I don't comment much; not sure. Still, karma does not say much.


Yeah, I agree. Karma shouldn't be an indicator of how often one visits HN, it's merely an indicator of how much one participates in the HN community.

For me, getting karma is not desirable at all. What's the point of it? I only comment when I think I have something useful to say, which is not often the case. I have never felt the urge to post just so that my karma increases. I think this really reflects the good design of this whole system by pg.


Arguably, silent spectators aren't part of the community they spectate. A bit like watching TV doesn't make you famous.


It's certainly not a direct correlation, but it does help to sort the spam bots and people who created an account but never use it from the rest.

However, I did mention that the karma approach would not account for lurkers which was why I proposed the logged in method. There are many other interesting data points I'd love, but without pg releasing a lot of the info, it's tough (or in some cases impossible) to gather from public data.


I've been following HN for 3+ years now. I would rather read content and upvote than post meaningless responses for a karma score or +5 Funny.


Rough Estimate of users with Karma > 100:

http://www.google.com/search?sclient=psy&hl=en&biw=1...


I have to say that I did not know about the inurl google search parameter.

However, this is indeed a very rough estimate because the parameter 100...100000 also captures the number that indicates how long ago a user account was created, so it would also count accounts that were created a while ago (over 100 days) but were inactive.


I knew about the inurl parameter; it's the range operator that's new to me.


".." also works; this is the first I've seen of "..." for a numerical range find, which is VERY useful for date searches. A couple years ago, I gave a lesson to a class of mostly senior citizens for how to use Google to do genealogy work. We covered the range find, the inurl: and a couple other obscure but useful search enhancers. Pretty interesting stuff (here's the ODP, if interested) http://www.zentu.net/fmt/searchpresentation.odp


A slightly better search estimate & 10% lower:

http://google.com/search?hl=en&q=site%3Anews.ycombinator...

(search phrase with wildcard and range in it, I didn't even know this combination was possible)


Slightly more accurate:

http://www.google.com/search?sclient=psy&hl=en&biw=1...

Apparently there are a bunch of pages in the index that just go to "No such user" pages (looks like killed spam accounts).

Unfortunately this search also includes people with accounts created between 100 and 100000 days ago, regardless of karma level. Not sure how to filter that out; you can't use the range syntax in between double quotes, it seems.


I'm working on a side project related to this, but in the meantime,

# of users on HN: http://api.thriftdb.com/api.hnsearch.com/users/_search

Filtering on karma is possible then, and more.


You could just go crawl all the user profiles to get karma stats...


Take a look at page 10 of the search results: http://www.google.com/#q=site:news.ycombinator.com+inurl:%22...


For those too lazy to click the link:

all search results show "no such user"


It would be really nice to have a dataset available for researchers with usernames, their posts and their comments. As an example, we could use the "SPEAR Algorithm"[1] to find expertise in specific domains per HN user.

[1] http://www.michael-noll.com/projects/spear-algorithm/


If pg wouldn't mind me scraping the site, in a manner that would not impact the site's performance, I will write an app to collect this data and publish it to the public domain.


"... If pg wouldn't mind me scraping the site, in a manner that would not impact the sites performance ..."

You don't have to scrape the site, RTFM. The official 'hnsearch' API will most likely yield the info you need ~ http://www.hnsearch.com/api


Thanks for pointing that out. I did not know there was an API for querying data on HN.


Can't speak for pg, but you should probably follow the site's robots.txt file's rules: http://news.ycombinator.com/robots.txt

It sets a crawl-delay of 30 seconds.
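
For anyone writing their own crawler, here is a rough sketch of honoring that delay with Python's standard urllib.robotparser. Only the 30-second Crawl-delay comes from the comment above; the rest of the sample robots.txt text and the helper names (crawl_delay_seconds, polite_fetch_all) are assumptions made for illustration.

```python
import time
import urllib.robotparser

# Sample robots.txt text. Only the 30-second Crawl-delay is taken from the
# thread; the User-agent line is an assumption for illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 30
"""


def crawl_delay_seconds(robots_txt: str, user_agent: str = "*") -> float:
    """Return the Crawl-delay declared for user_agent, or 0.0 if there is none."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else 0.0


def polite_fetch_all(urls, fetch, delay, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing `delay` seconds between requests."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay)  # wait before every request except the first
        results.append(fetch(url))
    return results
```

The fetch and sleep arguments are passed in so the loop can be tried without touching the network. At 30 seconds per request, crawling tens of thousands of profile pages would take days, which is worth knowing before starting.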


wget does this automatically, in case grandparent doesn't know. I'm not going to try it (because I'm not going to analyze the results myself, so it would be wasteful), but I think "wget -r news.ycombinator.com" would take care of this... It will not visit external links without the --span-hosts parameter (I believe).


How come that comes up with a Content-Type: text/html header?


My account page exists, but I rarely post (4 comments); adding my username to the Google search finds zero results, so there are at least 28,701, and likely many more than that.

My intuition is that there are many people that don't post at all or only post infrequently.


I also mess up the numbers. HN has so many ridiculous ways to log in, that I forgot a couple of times how I logged in (native, google, open auth, etc.) so I've created four or so accounts now. If I add them up I'm > 100, but no single user has 100 karma.


Wouldn't it just be easier to ask pg to post the data...

pg?


This doesn't account for users who have never submitted or commented, because there would be no way for their profile URLs to end up in Google's index. Or am I missing something? I know that there's the possibility for things to get into the index through the Google Toolbar, but I suspect that the fraction of HN users who use the Google Toolbar is quite low.


Or rarely submit or comment. I am logged in and on HN every day, but do not show up. Maybe with this comment I will.


Many of the results are 'No such user'. Add 'karma' to the end of the query to get a more accurate guess: around 11,600


If Googlebot is doing its job then there are around 28,700 registered users on Hacker News.


If someone creates an account but never posts, will there ever be a public-facing link to the userid page? If not, then Googlebot could be doing its job perfectly but have a totally inaccurate number of registered users using this method.

I'm curious what the actual number is now...



All of those people have posted comments. That's how. If you have never posted a comment or submitted a link, yet have an account, Google won't find you.


An accurate count is a documented non-feature of Google.


The result counts shown on Google are not accurate, however.


Interesting. So, only 522 with over 2000 karma?


The last big poll (the one about hiding comment scores) had about 4500 votes, IIRC. Everybody who voted is probably somewhat active on here; and I'd guess that most active users voted.


For comparison, pg posted figures on Feb 9 showing 90,000 unique visitors (timeframe unclear to me, but I'm guessing daily) - http://ycombinator.com/images/hntraffic-9feb11.png

If (as the results of my Google search revealed) there are 28,600 members, then 90,000 uniques is roughly a 2:1 lurker ratio (almost certainly much higher, given member!=active member, and not all members visit every day)
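
The arithmetic behind that ratio, as a quick check. It assumes the 90,000 figure is daily and that every member is among the visitors, neither of which is confirmed, so the result is a floor rather than an estimate.

```python
unique_visitors = 90_000  # from pg's Feb 9 traffic graph; daily timeframe is a guess
members = 28_600          # Google result count quoted above

# Assumption: every member visited, so the remainder are all non-members.
non_members = unique_visitors - members
lurkers_per_member = non_members / members

print(f"{lurkers_per_member:.2f} lurkers per member")  # prints "2.15 lurkers per member"
```

If only a fraction of members visit on a given day, the non-member share grows accordingly.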


True, but don't forget that number of uniques is not the same as number of users, especially if you aren't logged in.

For example: In any given month I visit HN from a laptop, desktop, and work computer using both Chrome and Firefox, in and out of incognito mode, as well as from two different browsers on my iPad and often from other people's computers. Obviously I am not always logged in to all of these browsers so depending on how pg counted uniques (my understanding is that cookies are the common practice though I could be totally off base on this) I could be counted as anywhere from 3 to 15 uniques a month.


I would say at best this gives a rough estimate of the number of users.

I tried searching my own username and it does not appear in the results.

Perhaps I just failed at using Google?


On a related note, how much of a slashdotting can you expect for posting a link here? There aren't as many registered users as I expected, but it doesn't account for lurkers.

When my niche website is finished, I'd like to share it with the hacker news crowd, but it's not built to scale to a bajillion users.


Google was showing 27,200 results, so if they are all valid user IDs without repetitions, I would assume around 27,200 registered users, but that seems pretty low, doesn't it?


Based on the referrals the site sends (often 4,000+) I would have thought it was much higher than that. My guess is that there are a lot of lurkers & only something like 2% of site visitors even create an account.


That's only users crawled, which is the number of users that have made at least 1 comment or 1 post -- as there are no user list pages here.


As of 31st May, 2011 there were 67248 HN Users.

Source: http://hnarchive.in


a small improvement: site:news.ycombinator.com inurl:"user?id=" -"No such user"
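
In case anyone wants to paste that query into a link, here is a small sketch that URL-encodes it with Python's standard library. The https://www.google.com/search?q= form and the google_search_url helper name are my assumptions; the query string itself is copied from the comment above.

```python
from urllib.parse import urlencode

# Query copied verbatim from the comment above.
QUERY = 'site:news.ycombinator.com inurl:"user?id=" -"No such user"'


def google_search_url(query: str) -> str:
    """Build a search URL, percent-encoding quotes, '?', '=' and spaces."""
    return "https://www.google.com/search?" + urlencode({"q": query})


print(google_search_url(QUERY))
```

Encoding matters here because the query contains its own "?" and "=", which would otherwise be read as part of the URL's query string.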


I like the guy whose name is user-id (:


we are not related


Go back to Reddit.


Let's start a count.

+1


It was just a joke :(


Show of hands would be quicker.



