
It's not just China. Enormous effort is put into achieving high H-index scores everywhere.

I'd love to see a review of all researchers with high H-indexes. I bet you would see a disproportionately high incidence of self-citation, citation rings, journal bias, outright corruption, and much more.

But the bigger problem is that many of these people are highly intelligent and capable. When they are told that their careers depend on a gameable metric, they can figure out clever ways to game it.

Mountains of paper don't improve the human condition, we need better success metrics.



If you work on one of the bigger LHC experiments (ATLAS or CMS), you publish about 100 papers a year. I'm on the "author list" for many hundreds of papers that I contributed nothing to and never read. It's actually very difficult to get yourself taken off the author list.

I'm sure my H-index is great, but it's completely bogus. Some organizations have stopped counting papers with more than some number of authors (e.g. 1000), which is progress, but by that metric I'm the author of zero papers a year.


> I'm on the "author list" for many hundreds of papers that I contributed nothing to and never read.

Are you asked for permission to be added as an author?

What if the research was 'bogus', or at least parts of it were incompetently/poorly done and the paper is discredited?


We earn the right to be on every paper after working on the experiment for around a year. After that it's automatic: there's theoretically a way to remove yourself from specific papers but no one ever does. It's a convoluted process that has to be approved by the highest ranking person in the collaboration.

The question about bogus science is an interesting one: in theory, by putting 3000 authors on every paper, the collaboration is ensuring more scrutiny for every result. And indeed, our internal review is far more rigorous than the peer review that we get from the journal. As far as I know, no journal has ever rejected a paper from ATLAS or CMS, which is a pretty good track record for O(thousands) of papers.

There is a flip side, of course: this system also hinders innovation. When 3000 people are "authors" on your result, any one of them can hold it back from publication. We tend to choose more conservative techniques in the interest of getting anything at all past internal review.

Personally, I don't think aiming for a 100% success rate in publication is a healthy way to do fundamental research. I'd rather see some slightly questionable papers submitted to journals now and then, since lowering the bar to get to that stage would mean making more interesting ideas public.


I understand that advances at the LHC rely on a huge number of people, and it would be cumbersome if everyone were fighting to get on papers rather than contributing technically. But once you get above a few hundred authors, outside the team that might understand and care what the paper is about, I'm not sure I'd value everyone's (or anyone's) contribution if I were hiring that person into a new research position.

Perhaps it's lucky I don't work in physics funding or recruitment.


We have an internal database that keeps track of who contributed where. So in practice when someone is making a hiring decision, they find someone who works on our experiment, and that person asks around or accesses the internal database to see if the candidate really did everything they claimed.


> Some organizations have stopped counting papers with more than some number of authors (e.g. 1000), which is progress

You reminded me of this classic paper: https://improbable.com/airchives/classical/articles/peanut_b...


Gordon? You're late.


Been playing Black Mesa recently, what an awesome remake!


Yeah, it is stunning, isn't it?


Wait, so all the papers you were part of had more than 1000 authors?


Almost every paper I've been a part of. Some of us will do a few independently, but if we want to use LHC data, the author list has to include everyone in the collaboration.


Did a quick search and found the LHC papers; all the ones I looked at have around 1k authors:

https://lpcc.web.cern.ch/lhc-data-publications


ALICE and LHCb are around 1k. ATLAS and CMS are closer to 3k. There are some very cool experiments at CERN that have fewer than 100 members, which typically take advantage of either existing LHC interaction points[1] or the accelerator chain that feeds the LHC[2].

[1]: https://faser.web.cern.ch/the-collaboration

[2]: https://base.web.cern.ch/content/people


There are professors at top institutions who publish more than a paper per week. Obviously they are not contributing much more than stamping their names on as last authors. Needless to say, this is completely ridiculous.


Having a citation's impact on one's H-index decay with author position would cause an interesting stir in the "academic game". I wonder if we would start seeing supervisors bully for first authorship.


As the authors are typically listed alphabetically, that would unwittingly have an inhumane result.

(My name moved from the end of the alphabet to the middle when I married. I was amused that it actually makes a difference).


Authors aren't usually listed alphabetically in academic papers, though; typically the order reflects relative contribution (not that alphabetical ordering never happens).


It differs by research area. For example, the mathematics authorship convention is alphabetical. Computer science is by contribution.


Computer science theory papers are alphabetical as well.


In large collaborations, this is common. See e.g. https://inspirehep.net/authors/1222902?ui-citation-summary=t...


Also depends on the journal; some have the most impactful author stated last.


The H-index should go away; it cannot be fixed. Traditionally, academics have been encouraged to publish in prestigious venues. Metrics like the H-index do not take this into account.


Coming full circle much? The h-index was touted as a better metric than judging someone's paper by the merit of the journal, since high-profile researchers would get their papers into NSC easily even if they never served a purpose. The h-index is definitely less gameable than you think: the only effective way to game it is to become completely illegitimate in your publishing, using farms like the ones above. That would be easily identified if anyone even vaguely related to the field tried to read the titles and journal names.


The evaluation of scientists and academic staff needs to be done by independent evaluation panels based on a (subjective) assessment of the scientific merits of their publications. In every reasonable place, it's also done that way. In addition to this, funding for time-limited contracts has to be treated similarly to investment funding: evaluate very carefully at the beginning and do an internal evaluation of the performance afterwards (mostly to evaluate the initial evaluation), but avoid continuous evaluation and mostly only advise/help during the contract period.

The worst thing to have is indicator counting of any kind. The system will be swamped with mediocre, sometimes almost fraudulent scientists who game it. (It's just too easy to game the system: Just find a bunch of friends who put their names on your papers, and you do the same with their papers, and you've multiplied your "results".)

The H-index is also flawed. In my area of the humanities, papers and books are often cited everywhere because they are so bad. I know scholars who have made a career by publishing outrageous and needlessly polemic books and articles. Everybody jumps on the low-hanging fruit and rightly criticizes this work, the original authors get plenty of opportunities to publish defences, and then they get tenure. Publishers like Oxford UP know what sells and are actively looking for crap like that.


There are moderate tools like multi-round double-blind reviews to resolve such issues.

A tool which was used to resolve issues among chemists and biologists is now running rampant over fields which do not have a high volume of citations. Mathematics is suffering, for example.

An Annals of Math paper might have fewer citations than a paper in Journal of Applied Statistics. But the prestige is incomparable.


So what's the problem there? The person going for a job in a maths department with an Annals of Maths paper isn't going to be in competition with someone with a big H-index from applied stats papers... the committee won't look twice at the statistician! On the other hand, if the Annals of Maths person wants a job in stats, then presumably they will also have stats papers (and the stats people will be keen to know: "what about your amazing career in pure math?!").


This would be the inevitable outcome. Currently the first author did the work, and the last author supported in some way (such as by supervising).

This is purely by convention.


Different fields have different conventions regarding what a given authorship position implies in terms of work contributed to the paper. Some place very high weight on the last-named author, others on the first-named, among many other permutations and subtleties. There's no single rubric for deriving relative effort from author name position.

Part of this is contingent upon the citation formats used in different kinds of publications (and thus different fields), where long lists of authors are condensed to one, two, or three names at most.

This is not even getting into more locally scoped, second-order inputs such as any given department's traditional handling of advisor vs. grad student power dynamics.


Look at Didier Raoult in France. He is a perfect example.



Those remind me of the entrepreneurs who founded "dozens of companies."

Yeah... maybe if you were on a meth IV drip and lived for 200 years...


Coming up with a good problem to work on is the most difficult part in science.


Quite the contrary, at least in computer science. There are tons of good problems; actually developing a solution (and the theory and experiments to back it) is usually orders of magnitude more difficult.

Another point is that to get to this volume, the "researcher" probably has many papers that he/she had no part in, not even in formulating the problem. In the case of physicians, I have encountered senior doctors who made using the department's medical statistics conditional on adding them as an author. That is, every paper that uses this specific public medical data adds this author regardless of their contribution... (To be clear, they don't necessarily have anything to do with the data collection beyond what they are already obligated to do at the hospital; they just happen to be responsible for managing access to it.)


A good problem also means it fits the reachable skill level of the person doing the work. Do we agree that stuff like P=NP, for example, is not a good problem?


This varies a lot according to the area.

Quite common in Biology and Chemistry.

Less common in theoretical physics.


Also luck (says someone with a pathetic H-index!). But I don't care about it, so I can make the observation that, of my peers in my 20s, the ones that went on to get a massive H-index lucked out with an early paper that became widely cited for a random reason. They introduced a definition, repurposed something they had done into a fashionable domain, or "got adopted" into a paper by a famous professor and got a lot of profile for it. Once they got profile, subsequent papers attracted many citations and a virtuous cycle kicked in.

Of course all these people are mega smart and do fab research - but they got lucky, and there were lots of others who were as smart and got pushed out.


> Mountains of paper don't improve the human condition, we need better success metrics.

I'd argue the problem is the concept of "metric" in itself. You can't measure complex human endeavors with a simple objective number. Whenever the concept pops up (lines of code, anyone?), it's always a disaster.


One of these guys is (well, probably) Didier Raoult, the person who started the chloroquine craze for the treatment of COVID. He has an h-index of 148 and over 2300 papers.


What does 148 over 2300 papers mean, exactly?


148 is the h-index, which is defined as the maximum value of h such that the author has published h papers that have each been cited at least h times. (from Wikipedia).

My h-index is 13.
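
In code, the definition boils down to a few lines. Here's a rough Python sketch (the function name and citation counts are made up for illustration):

    def h_index(citations):
        # Rank citation counts from highest to lowest; h is the largest rank i
        # such that the i-th ranked paper still has at least i citations.
        ranked = sorted(citations, reverse=True)
        h = 0
        for i, count in enumerate(ranked, start=1):
            if count >= i:
                h = i
            else:
                break
        return h

    # Hypothetical example: five papers cited 10, 8, 5, 4 and 3 times.
    print(h_index([10, 8, 5, 4, 3]))  # -> 4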


Is 148 high? Low? What does it intrinsically measure (besides being a competition of who can piss further)?


Yes, it is high. It means you have published at least 148 papers, each of them cited at least 148 times. It is supposed to measure how much interesting/useful work you do.


So if you publish a single paper that only gets cited a few times, it'd completely drag you down. That doesn't seem so great honestly.


No, that wouldn't affect your h-index at all.

If you have sixty papers that have each been cited more than sixty times, you'll still have those sixty papers even if you publish a new paper that gets cited only once.
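
A quick sanity check of that claim, as a self-contained Python sketch with hypothetical citation counts:

    def h_index(citations):
        # Largest rank i such that the i-th most-cited paper has >= i citations.
        ranked = sorted(citations, reverse=True)
        return max([i for i, c in enumerate(ranked, start=1) if c >= i], default=0)

    print(h_index([61] * 60))        # sixty papers, 61 citations each -> 60
    print(h_index([61] * 60 + [1]))  # add one paper cited only once -> still 60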


It's very high compared to comp sci. 50+ in comp sci is amazing.


It's quite high. It's hard to compare different fields, since in a more active field papers will get cited more often. George Whitesides, one of the most cited chemists, is at almost 270.


Why do we need metrics at all?

If I’m sat in Einstein’s office doing a performance review with him, I’d probably say “Yes, Alf, you’ve done pretty well these last few years. Keep it up.” Don’t think I’d need a metric. And if he came to me for funding for a postdoc I’d say - yes, sure.

I don’t know when we decided that _management_ had to be replaced with _metrics_ but it doesn’t seem like a good idea.

Management involves doing things that don’t scale, that’s why we have lots of managers.


This leads to another problem that we (the science community) are already facing: how should we allocate funds between older and more established researchers vs new researchers? Or, who will be the next Einsteins?

If you only fund the more established researchers, then new researchers are starved out and more likely to leave the field. However, my belief is that new researchers are more likely to develop big new innovations than (say) the really old professors that are well past their prime.

You also have another related problem: given a large pool of new researchers, who are the ones that will be really good? Plus, there are other possible goals, like spreading the money around to broaden the base of researchers.


> how should we allocate funds between older and more established researchers vs new researchers?

Reasonable question with no precise answer, but I imagine a manager would seek a balance between the two as with any company or team. Some big hitters but you've got to see where your next Einsteins are coming from.

There's nothing about this that is solved by metrics. Metrics just help you make shallow decisions quickly, and provide ways for academics to game the system by manipulating those metrics.


Einstein solved this problem by doing his best work before joining academia, and winning lifetime funding as a reward afterward.


Einstein was not peer reviewed either. Peer review became standard only later.

That is to say, whatever worked for him or for that era won't work for a contemporary person today.


Probably because it's much easier judging whether to fund Einstein than whether to fund someone much further down the chain.


I don't see how that is relevant, but let's consider someone further down the chain.

"Ah yes, Jimmy Postdoc, I see you have published 3 papers with an average impact factor of 3.1. Have a promotion."

vs

"Ah yes, Jimmy Postdoc, I see that you're making progress in improving quantum error correction, as evidenced by the fact that we can now use 80% of the previously required qubits to complete Shor's factoring assuming a surface code - great work, you should get that paper out at some point but keep focused on the work for now.

I'm also really pleased with your contribution to the academic community in the dept, particularly helping out Polly PhD student with her SAT formulation of decoding. The constructive questions in her talk really opened up a new line of enquiry. Great job.

Given all the above, have a promotion."

Metrics are stupid, people are smart, stop using stupid.


The subjective valuation led to quite a lot of nepotism and unfairness. There is techno-babble you can use to pick pretty much anyone if you are good enough with words.


I agree that's a problem of traditional management. The way to counter it is good management methods, in particular ensuring that reviews and promotions are cross-validated using other personnel, both horizontally and vertically. I'm sure you're right though, some bias will slip through.

The question you have to ask yourself is whether you are prepared to tolerate occasional suboptimal decisions, or would rather have a metrics-based evaluation that _corrupts the entire system_.


> I’d probably say “Yes, Alf, you’ve done pretty well these last few years. Keep it up.”

That might work in a company but not in the academic world. Many countries limit the amount of time a PhD student or Postdoc researcher can stay at a university. After that time the person has to find a permanent contract (professor) if they want to stay. Because there are many many more candidates than available positions, the hiring committees try to justify their decisions by objective (haha) criteria.


I am working on a website that will do precisely this! I want to create factors that will show citation rings and self-citation ratios! Trying to figure out what to call the metric; my current favourite is CJ factor!
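
Roughly, by self-citation ratio I mean something like the share of citations to an author's papers that come from papers that also list that author. A toy Python sketch of that idea (names and data are made up, nothing final):

    def self_citation_ratio(author, papers, citations):
        # papers: paper id -> set of author names
        # citations: list of (citing paper id, cited paper id) pairs
        own = {pid for pid, names in papers.items() if author in names}
        incoming = [(src, dst) for src, dst in citations if dst in own]
        if not incoming:
            return 0.0
        self_cites = sum(1 for src, _ in incoming if author in papers.get(src, set()))
        return self_cites / len(incoming)

    # Toy example: "p3" shares an author with "p1", "p2" does not.
    papers = {"p1": {"Alice", "Bob"}, "p2": {"Carol"}, "p3": {"Alice"}}
    citations = [("p3", "p1"), ("p2", "p1")]
    print(self_citation_ratio("Alice", papers, citations))  # -> 0.5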


Cool of you to do this, though I wonder if this will be more of a measure of community size. Smaller niche topic communities probably behave more like citation rings (you'd cite everyone else in the community frequently). It also seems like it'd penalize people working in a new area (like if you're the first person to use a method, you'd probably cite yourself a bunch).


I will try my best to control for field-specific practices for sure; the RCR metric published by the NIH is already a good starting point in this regard. Valid point about not penalising people in new fields; I will keep that in mind!


To some extent this already exists

http://www.vanityindex.com/


While I agree that we need better metrics (and, more generally, a better academic system), I think that publishing fake research (committing fraud) is still way more severe than H-index hacking. The latter mostly translates into publishing lots of papers that present small advancements, and it encourages scientists to work on topics where there is very little risk of failure (rather than chasing challenging topics). These have long-term negative implications for the system and for our society as a whole, but, at the end of the day, it's still real research.


Actually, the problem is that funding agencies and universities started to implement success metrics. I mean, you are trying to measure something which is inherently difficult, if not impossible, to measure (what good science is).

Moreover, you essentially make the careers of some very intelligent people depend on this arbitrary metric that you have created. What do you expect to happen? Obviously they will work to that metric, and then people complain that they are gaming the system. No, they are simply working toward the metric that you (not you personally, obviously) have created.


Perhaps we should make their actual science the metric...



