Around 2008, I read some public filings by banks. I made only two back-of-the-napkin adjustments:
1) I combined off-balance sheet assets and liabilities into the balance sheet, and
2) I changed the expected % losses to approximately that of Wells Fargo.
With those two simple adjustments, I saw that some big banks were in the hole by (combined) tens of billions of dollars.
The market prices for these banks made it clear that major investors either didn't read or didn't understand the data in the public filings.
Fun fact 1: My bank, WaMu, was itself maybe $0-$10B in the red, so I immediately withdrew $2K in a panic. I found that money hidden in my filing cabinet about 5 years ago.
Fun fact 2: What was even crazier is that a famous private equity firm (I think TPG) had just put billions of dollars into WaMu, but that still didn't fill the hole (and why would such a big firm be less sophisticated than a nobody like me?).
Fun fact 3: FDIC had ~$50B at the time, which maybe wouldn't cover the losses at the pessimistic end of my estimates.
Fun fact 4: In accounting classes, I identified 3 very shady areas of accounting: options, off-balance sheet entities, and pensions. I believe the first two have since been fixed but I think pension accounting is still very shady so... beware.
Disclaimer: I studied finance and accounting in school and read 10-Ks in my spare time in the early 2000s so I have some knowledge of accounting shenanigans. I also had some experience in real estate. But that's about all the expertise it took.
> Around 2008, I read some public filings by banks. I made only two back-of-the-napkin adjustments: 1) I combined off-balance sheet assets and liabilities into the balance sheet, and 2) I changed the expected % losses to approximately that of Wells Fargo.
> With those two simple adjustments, I saw that some big banks were in the hole by (combined) tens of billions of dollars.
> The market prices for these banks made it clear that major investors either didn't read or didn't understand the data in the public filings.
That's impressive work, but I interpret that a little differently. There was definitely irrational exuberance going on, but plenty of major investors understood what was happening well before 2008. The reason the valuations were still out of whack is that having the correct data and the correct analysis isn't enough; you also have to correctly forecast how the market will react. So there was an unfortunate feedback loop: many investors savvy enough to see the problem were not betting against it because there are far easier and more consistent ways of trading profitably.
This is why the most successful funds don't really try to replicate the process you're talking about. Their process is data and hypothesis agnostic. A lot of their work happens to align with the sort of analysis you've described here, but they don't start from the same place.
Eh. It works if you're a bear. You get in early, you sell in 2006 while shaking your head. Maybe you leave a bit in with extra exposure to volatility so you win either way, maybe not. Then when 2009 hits you pile in again. It's pretty easy to make way above average returns on the stock market. Just look for the classic signs that it is peaking (low unemployment, high P/E, high leverage, bad demographic trends, etc).
1) If you are a fund you just can't sit out 3 years. Your fund will shut down as everyone will yank their money. If it's your own money then people will leave as you won't pay bonuses.
2) If you went short in 2006 then you wouldn't have survived until the crash of late 2008.
> It's pretty easy to make way above average returns on the stock market.
This is just an absurd statement, along the lines of "it's easy to build a billion dollar company, just mimic what all the other billion dollar companies do."
Sorry for being so negative to your comment, but come on. That statement can't be seriously defended in any reasonable manner.
If you are Berkshire or Amazon, you can sit on cash or potential earnings for years, and investors can trust you with it. The keys are finding the right investors and demonstrably earning their trust.
The limitations of fund structuring aren’t my concern. I’ve been investing for decades and it’s trivial to beat the market if you’re patient and knowledgeable about tech.
The fact that funds do worse than indexed ETFs says more about fund incentives and investor savviness than it does about the effectiveness of this strategy.
If there were less political intervention into financial markets then solid financial analysis would win almost every time. Maybe that's the way the world should be.
Since the GFC a lot of macro bets in both US and EU have been bets on political will and central bank actions. In late 2010 Bank of America was technically insolvent (based on analysis similar to yours), but the Fed went to work and backstopped the market in a multitude of ways and the valuation soared. In 2012 I would have bet the farm that Greece was headed into bankruptcy, but the Greek people didn't revolt (as much as anticipated) and so the ECB effectively nationalized most Greek debt across the rest of the Eurozone. I'm no expert but I try to apply these lessons to every new crisis I hear about. Underfunded pensions will destroy certain states, or social security will be broke and unable to pay out benefits? Sure; we'll see.
My understanding is that if it weren't for TARP several more publicly traded investment banks would have collapsed not to mention wider collateral damage in the market.
Most agree that TARP was necessary, but QE2, QE3, and very low interest rates appear to have done little more than inflate asset markets and increase debt. They are trying to unwind that now, but falling back to the mean with this much excess is not likely to go well. We left the cast on way, way too long.
What's done is done. It'll prove out either way in the next few years.
Pension accounting is very hard to do properly because pricing the liabilities is, well, entirely dependent on your perspective. In the UK, at least, we might consider several different measures of what that liability number is (and what the deficit is under those measures).
Buyout => What would it cost you to pay an insurer to take the liability off your hands (discount rate will basically be the replicating portfolio of government bonds).
Accounting => Depending on standard, the discount rate might just be the expected return on your portfolio
On a 30 year duration liability that could be a difference of over 200% between the different measures. Which is realistically right though?
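To put rough numbers on that gap, here is a minimal sketch. It assumes the liability is a single payment due in 30 years, and the two discount rates (2% for a gilt-like buyout basis, 6% for an expected-return accounting basis) are my own illustrative picks, not figures from any real scheme.

```python
def present_value(cash_flow: float, rate: float, years: int) -> float:
    """Discount a single future cash flow back to today."""
    return cash_flow / (1.0 + rate) ** years


payment = 100.0          # hypothetical pension payment due in 30 years
years = 30
buyout_rate = 0.02       # assumption: government-bond-like discount rate
accounting_rate = 0.06   # assumption: expected portfolio return

pv_buyout = present_value(payment, buyout_rate, years)
pv_accounting = present_value(payment, accounting_rate, years)

# How much larger the liability looks on the buyout basis, in percent.
gap_pct = (pv_buyout / pv_accounting - 1.0) * 100.0

print(f"buyout PV:     {pv_buyout:.2f}")
print(f"accounting PV: {pv_accounting:.2f}")
print(f"gap:           {gap_pct:.0f}%")
```

With these assumed rates the same promise is worth more than three times as much on the buyout basis, which is how a gap of over 200% can show up without anyone changing the underlying cash flows.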
I'm starting a career in accounting and would appreciate a recommendation or two on what outside-of-school resources you used to better your understanding. Recommendations?
Sorry, but I didn't actually make a career out of accounting (I only wanted the knowledge to understand investments and maybe as a backup career plan). I don't think I can give meaningful advice on this.
Yeah, I assumed that some large investors (e.g. hedge funds) would have someone skilled in accounting shenanigans and would make big bets to move the market to reflect the information that was publicly available.
Every single public company is well versed in accounting shenanigans, often to the tune that you almost can't figure out anymore whether someone is making a profit or not.
Also, an accounting loss isn't always a "real" loss. For example, with the recent tax overhaul, some companies booked unexpected losses because they had significant carry forward losses. Those were reduced in value because the corporate tax rate went down. But essentially nothing changed -- yes the tax base and book value of these losses were adjusted, but no cash was lost.
They can also push revenue or profits back or pull it forward. It really is a jungle out there.
Don't feel bad man, I remember there being a run on food staples. Things like 50lb bags of rice and beans were sold out at Costco. People were stocking up on guns and ammo preparing for fucking Mad Max. I am not ashamed to admit I participated in some of that myself by doubling up the emergency rations I keep for hurricane season. You withdrawing $2k was not irrational at all given the scale of what was going on.
Are there any possible holes in the making you see now? I was curious for a while if crypto was going to pose a systemic risk, but the total market cap[0] was never really high enough.
I do know that
1) a payment system with transactions that take more than a few seconds and cost more than a few pennies is not a good payment system for most things.
2) I've never heard anyone talking about using crypto for anything but speculation or paying for ransomware. Even on the internet, I've mostly heard of speculation, ransomware, prostitution and buying drugs.
3) Regulators don't like payment systems for ransomware, prostitution and buying drugs.
4) Assets whose prices are based on speculation eventually fall.
I could be missing something. As I said, I don't know much about crypto-currencies. Please nobody bet on my comment alone. Also, timing collapses is difficult. Bubbles can last for more than a decade.
Before the price can drop like that, it needs to become a systemic risk first. Meaning I think it will first still go up significantly before coming down.
I have no economics background (well I guess I took a macroeconomics class once), but Martin Shkreli (yes that Martin Shkreli) has a pretty great playlist on YouTube [1] that goes through exactly how to do fundamentals analysis. Warning: the videos are long and dense.
Sorry, I'm not sure I can give good advice on this because I got a bachelor's degree which is probably way more than needed. There's probably a more efficient way.
Not OP but Tesla's stock is certainly overvalued (Musk said so himself). Their future depends on whether they can scale without QA issues - so far, even producing a few hundred Model 3s a week, they seem to be struggling with defects and problems in the delivered cars.
The one distinguishing factor is that they have terrific brand value because of Musk, and this could mean that consumers would want their cars despite all the troubles they have. I think this unusual level of brand value is the confounding factor that muddies predictions and is unquantifiable. Brand value makes consumers make non-rational decisions. If it was any other car company with Tesla's numbers you'd be crazy to bet on them succeeding.
People who hear that, say, Lexus is doing badly and has severe problems with its cars are going to go buy an Audi. People who want a Tesla so far are just waiting for a Tesla.
>People who hear that, say, Lexus is doing badly and has severe problems with its cars are going to go buy an Audi.
If you're buying something where a large part of the value is in the image/brand it's usually not fashionable to complain about it.
When an S10 breaks it's GM's fault. When a Tacoma breaks the owner didn't treat it right.
When a Kia breaks it's unreliable. When a BMW breaks that's just part of owning a BMW.
Tesla has cultivated an image (or rabid bunch of fanboys, depending on your perspective) where you're expected to put up with shoddy build quality and less than prompt service as part of the ownership experience so Tesla gets a pass on those things.
Even if they succeed on the Model 3 they are overvalued. I have bought and sold it several times. I think they'll succeed, but it's hard to see how they will justify their stock price because other car companies can slowly build reasonable EVs. Look at the Bolt: it's a credible car. It apparently loses a lot of money, which is why they don't make enough to meet market demand (unlike other GM cars).
They're spending a LOT of cash but the big thing that stands out is that most of the cash outlays are for PPE (property/plant/equipment). At the rate they're spending, they have about a year's runway and need to keep raising money or stop spending (or have a HUGE increase in revenue).
I'm not sure what they're investing in and how much more they have to invest so I don't have an opinion.
A friend and I used to run a website that tracked activist short sellers and their campaigns, and published all that information as a nice centralized database basically.
Hedge funds were, by far, the most interested in this - which was surprising to us, our original target audiences were auditors and legal firms. My impression from this experience is that the more successful hedge funds are just mass-ingesting data from everywhere they can, and somehow trying to make sense of it.
It's totally an arms race too: Once one of them is ingesting data from a new source, everyone else has to start doing it as well lest they fall behind. It was significantly easier to convince new clients once we had a few clients already (very much like the FOMO people speak about when fundraising).
Did you keep up with that project or move on to something else?
It's humorous to me that most people who end up selling data to hedge funds more or less "fall" into that industry by accident. Your story is exactly how it typically happens: you develop a new data-enabled product for one market, then a bunch of hedge funds find it and realize it's useful. Most companies pivot from their original market once they realize how lucrative selling data can be.
We moved on - my friend was in business school at the time, and thought that continuing on with the project would be detrimental to his job search (he was looking at joining a hedge fund).
Yeah, it was a very sudden shift. We always knew they were potential customers, we just didn't realize how quickly they would take to the product (as opposed to legal firms which you can imagine are significantly more bureaucratic about such things).
Anything I've ever wanted (within a certain cost) has always been signed off immediately/practically did not require approval. It's very easy to sell a relatively cheap (compared to some of the other subscriptions sold to hedge funds) service to an organization that's extremely flat.
I remember your website. I was at a hedge fund that used your service. I had a few conversations by email and by phone with your partner too (nice guy, and went to a good MBA school if I remember). I understand why there were maybe bigger and better opportunities versus the return on the time needed to keep your activist short site going, as it is quite niche. Are you involved in the space at all any more?
For others reading here, in case you're curious: the hedge fund I was working at wasn't interested in paying for activist short campaign tracking for the short trade ideas themselves. There is little interest in copying or jumping onto another's trade like that. The real reason the hedge fund wanted alerts and tracking for these campaigns was to keep an eye on what other hedge funds were doing - it was more about competitor and industry colleague research than securities research.
Automated data ingestion doesn't come for free. It's an ongoing effort to keep on top of new sources and schema changes, and the amount of effort scales with the number of sources.
I'd guess that's the value you were providing for them.
Oh sure, no doubt. It makes sense to spend $10 a month on something if it saves you 10 hours of time and you can potentially make $10 million from it.
I was mostly just surprised at how quick the process was. Hedge funds are much more afraid of missing out on something than they are of paying a monthly fee to someone.
That might sound a bit silly (what business isn't afraid of missing out on something?), but that's definitely NOT the case in many other enterprise sales discussions.
Did they scrape indiscriminately, or did they pay for access? I'm hoping it was the latter so you were able to keep the site self-sustaining for a while longer.
The info was gated behind a paid login, so thankfully we didn't have to worry too much about scrapers. We also set up a notification system when new campaigns came out, so they didn't have to long poll for new info.
We actually did have quite a few people scraping our homepage for some reason (which was rarely, if ever, updated) but they were mostly other aggregators as well, not potential clients.
How did you do pricing, if I might ask? Did you calculate the expected benefit of the data to funds, look at prices of similar data offerings, or something else?
Basically, anyone who launched a public campaign qualifies. If you're just short selling without any related posts about why you're short, etc. you're not an activist short seller.
It's ambiguous, no doubt, but actually pretty easy to discern in practice. There are a few big names that dominate the landscape, as well (e.g. Citron).
To be fair, that's really only surprising to people who believe the strong EMH proposition is true. EMH is demonstrably false if you casually look at the returns of various funds over timespans measured in decades. The only way you can argue it isn't false is if you bend over backwards to redefine these firms as havens for insider trading or try to claim they're plausible as mere statistical outliers.
Otherwise, the weaker postulate of EMH is sort of redundant. That's not to discredit Fama's work: we know far more now than we did back then, and it's legitimately useful as a theory of the relationship between market state and information asymmetry. But its utility has nothing to do with whether or not the market actually is efficient.
So they have an advantage against other hedge funds, but . . . most hedge funds underperform the market. So all you're doing by doing research is closing the deficit.
I personally know and have worked with a few people who manage hedge funds (with a combined hundreds of millions in AUM). So fun fact: one of the tight-lipped secrets about the hedge fund industry (which could arguably also be said for a lot of the financial industry) - is that none of them actually know what they're doing.
Characters like Ackman, Shkreli, etc all sell the idea that they know what they're doing, when the reality is it's simply gambling with other people's money with minimal risk. Luck will favor some, and others...not so much. The entire industry is essentially a collective delusion/sham.
This is certainly true of most stock-picker type hedge funds, but there are legitimate strategies that aren't a shitty replication of an index fund. Unfortunately, stock-pickers somehow seem to have become most of the industry (probably because there are no real talent or capital constraints on them).
That is what he says he does pretty much all day. He believes in an extremely hands-off approach so that he does not have to waste time with the day-to-day operations of running a conglomerate.
That's basically what any analyst or researcher does, right? Then, I guess, an analyst writes a simulation to test his theories and a report. These hot shots likely just have other people test their theories, but they still likely write some amount of reports.
Well couldn't it have been the opposite? Hypothesis: maybe this kind of research has the least marginal return, therefore only the most profitable funds are willing (or able) to invest in it (Bridgewater would invest $10M for 0.1% of improvement while a small fund can't afford that luxury given that 0.1% is fund size independent).
I don't know, I would think that the cost of finding good investment opportunities would increase as a fund gets bigger assuming all the funds have enough money to operate well. An opportunity to invest 20 million dollars with an expected profit of 50% would be a lot better for a fund managing $300 million, but not as exciting for a fund managing $3 billion.
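The arithmetic behind that, as a quick sketch (the function name is just something I made up for illustration; the dollar figures are the ones from the comment above):

```python
def contribution_to_fund_return(invested: float,
                                expected_profit_rate: float,
                                fund_size: float) -> float:
    """Expected profit on one position, as a fraction of total fund assets."""
    return invested * expected_profit_rate / fund_size


# Same $20M opportunity with a 50% expected profit, two fund sizes.
small_fund = contribution_to_fund_return(20e6, 0.50, 300e6)
large_fund = contribution_to_fund_return(20e6, 0.50, 3e9)

print(f"$300M fund: {small_fund:.2%} added to fund return")
print(f"$3B fund:   {large_fund:.2%} added to fund return")
```

The same $10M of expected profit moves the smaller fund's return ten times as much, so the bigger fund needs either bigger opportunities or many more of them to get the same effect.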
The only time I hosted a booth at a trade show, I got a few people who sounded interested in our product at first but who quickly tangented off into more general questions.
As one of my coworkers explained, there are investors of varying caliber wandering around and they often aren’t interested in your company. They’re getting the gestalt from the show floor by picking any brain that will give them the time of day.
> "The fact that public information acquisition relates to performance is surprising. SEC filings are the very definition of 'public' information, and therefore, usage of such information should not be profitable,"
Honest question: isn't that a contradiction in terms? Public information stands for "stuff that everyone is presumed to know", right? Like, economists using an assumption of "perfect knowledge of the market regarding public information" (or something like that, I'm obviously not an economist) as an approximation in their knowledge. But the fact that some managers do look this up, and some don't, means that this approximation does not hold. The premise of what is implied with "public knowledge" is undermined by the very thing being measured.
Assuming I understand this correctly, it looks to me like they essentially correct for the error caused by this assumption of perfect knowledge, which is the opposite of unexpected (but still a very neat finding!)
It's not a contradiction. A better, more formal definition of "public" would be, "information which is sufficiently accessible that its price impact has diffused through the market." In the context of capital markets, "public" doesn't mean that everyone knows it, it means that it's essentially accessible to anyone (in a reasonably similar slice of time). As a corollary, it also means that public data has more or less exhausted its (individual) impact on the market.
Most "public" information is actually woefully underutilized. The only real requirement for data to be public is that everyone could know it, not that everyone does know it.
> "information which is sufficiently accessible that its price impact has diffused through the market."
Thank you (and Retric) for explaining, but this phrasing does not quite convince me: it does not explain how this diffusion is supposed to take place, and I for one cannot imagine a way for this knowledge to be distributed among the market as a whole without individuals knowing this information and freely sharing it together. Which again seems to be fundamentally undermined by what was measured.
I mean, yes, there are situations in which you can distribute knowledge among people where no individual can see the whole picture, but together they manage (for example, the knowledge required to turn crude oil into plastic requires knowledge among every step of production, diffused among individuals - nobody truly "knows" how to turn crude oil into plastic if you define that as knowing everything involved in the process). I do not see that kind of context apply here though.
The price of a security is, formally speaking, a (weighted) sum of many disparate data points related to the security. The more data is available about the security, the more efficiently it's priced. Individuals do not need to explicitly share information about any given security to each other for it to spread through the market, because there exists a feedback cycle between any security and the publicly available data related to it. The price is literally a function of that information.
In other words, every time a security is traded, its price is updated with a very small amount of new information. It's not important that any single individual has all the information, it's important that the market has all the information. That is the essential function of the market - price discovery. No single investor (or even a team) could hope to accurately encapsulate everything there is to know about a given security. The market "prices" new information into a security, which is precisely what it means for data to be diffused.
Stated another way, novel information about a security represents a potential price inefficiency. But that price inefficiency is essentially lost once the data has become public.
> it's important that the market has all the information.
I'll be honest with you: when I read this my first response was "what does this even mean? What is "the market" and how does it "have information"?" I guess the sentences preceding it are supposed to explain the structure that the market has built up to automagically capture this information, but I don't quite follow.
I'm not criticising your comments, I'm very grateful that you are trying to explain how this works and the logic behind it, but so far it's black-box abstractions all the way down.
Picture a current price (i.e. the price at the last trade) of $10, with nobody willing to sell for less than $10 and nobody willing to buy for more than $9.99.
A new seller shows up. If they are willing to wait they might get more than $9.99, or they can accept $9.99.
Or a new buyer shows up. They can buy for $10 or wait for a lower price.
Now, without effort, people can just look at the history of trades and see the price rise or fall. This is independent of whatever reason causes more people to suddenly want to buy or sell; just the fact that people are buying or selling in itself moves the price.
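If it helps, the bid/ask example above can be written as a toy model. This is a deliberate simplification of my own: it tracks only the best bid, best ask and last trade, and ignores order sizes and depth, so it is nowhere near how a real exchange matches orders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    best_bid: float    # highest price any buyer will currently pay
    best_ask: float    # lowest price any seller will currently accept
    last_trade: float  # what everyone sees as "the price"


def sell_now(q: Quote) -> Quote:
    """An impatient seller accepts the best bid; that becomes the last trade."""
    return Quote(q.best_bid, q.best_ask, last_trade=q.best_bid)


def buy_now(q: Quote) -> Quote:
    """An impatient buyer pays the best ask; that becomes the last trade."""
    return Quote(q.best_bid, q.best_ask, last_trade=q.best_ask)


start = Quote(best_bid=9.99, best_ask=10.00, last_trade=10.00)
after_sell = sell_now(start)
after_buy = buy_now(after_sell)

print(after_sell.last_trade)
print(after_buy.last_trade)
```

Nobody in this model announces why they traded. The only thing that reaches everyone else is the change in `last_trade`, which is the sense in which the trade itself carries the information.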
Ok, so half-guessing at what you meant, wouldn't this be the crux of the problem then:
> In other words, every time a security is traded, its price is updated with a very small amount of new information.
The update happens after trading, so before that the price is by definition outdated, meaning it has that bit of "inefficiency" you spoke about. Hence "doing your homework" as a hedge-fund manager would in theory allow you to spot this inefficiency and make better trades.
Yes, that's mostly correct. But I'm not sure what your question is. It's not enough to have the data, you also have to have reasonable confidence in how the price will change once the information does impact the market. There's practically an eternity of information available, but you can't use all of it. You have to pick and choose. Moreover, you have to choose how much to weight any particular set of data with respect to another.
This is not to say that most hedge funds are right to ignore any given set of meaningful data. It just means that they have to make decisions about which data to use and how much they'll prioritize it. Then they have to see if they have the requisite infrastructure to trade on that information before anyone else.
The assumption is less that every individual has perfect information, rather that the market as a whole has that information. For example, arbitrage is assumed to 'work'; however, it's also assumed the returns will be driven down over time.
Wouldn't some sophisticated shops potentially use crawlers, etc. hosted from cloud providers (i.e., third-party IPs) that might significantly skew this data?
Yes, Renaissance runs everything on its own infrastructure (as do most firms of that caliber, and even a tier below).
That said, many of these firms also ingest data from aggregated sources, so their research activity would be very difficult to accurately track via IP presence.
Why? The reasons for having their own datacenters are either legacy or needing some sort of specialized hardware that is not available at a cloud provider. Mana, which is a new pure quant fund, uses AWS extensively.
If it is discovered that Amazon is stealing data from AWS customers then AWS will go out of business fairly quickly. Also even though Mana is new they do have ~$1B AUM.
Snapchat runs on Google. Dropbox ran on aws for a long time. Instagram was on aws before Facebook bought them. Netflix has basically everything but the actual video content on AWS.
In my mind, if technology is a big part of your business then running on other people's computers only has one upside: might be cheaper. Everything else is downsides.
Therefore, A) companies that operate at a scale where the only upside disappears, and B) companies where the expected liability cost outweighs the cloud savings, will both have their own datacenters. Facebook is clearly in category A, RenTech is clearly in category B. This is my understanding anyway.
Public cloud is not cheap. Public cloud is really expensive. The value of public cloud is that it's flexible (my current employer, which is massive, uses AWS in part because it has poor governance of leased data centres) and has hosted offerings that save time and operational overhead - it's quicker and easier to use SQS and DynamoDB than to host Kafka and Cassandra. Cheap it is not, however.
Take this with a grain of salt as I don't work at any of the funds mentioned (but still in the industry), but why not?
Filings are subject to change due to errors, updates, etc. It's cheaper, easier, and less error prone to just scrape and overwrite data than search for any updates.
I may have been heavy-handed in saying that the firms are rescraping the entire universe of filings on a daily basis, but I guess my point here is that these numbers can very easily be skewed and are generally a pretty poor indicator of actual data access.
SEC EDGAR is terrible for a variety of reasons. It breaks, few things are truly standardized/normalized, etc. If you have the time and a vested interest there's no reason not to.
Filings are subject to change but they are restated, not modified in place. That said, anything goes on the site.
The only real reason not to scrape everything on a continuous loop is that there is a rate limiter in place, so you may end up impeding your intake of new data by loading yourself down with old jobs to handle.
I tried to get bulk data from EDGAR a while back. Turns out bulk data acquisition has been turfed to a third party, which charges for downloads. This is supposed to be federal free data, I was so pissed off. I am still pissed off.
Huh? It takes time to download all the filings by parsing the index for free [1] but it can be done. You also have to add back off logic when you get a "too many requests" error, but it isn't that hard. I did it and I am still updating as new filings are posted.
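For anyone wondering what the back-off logic looks like, here is a minimal sketch. It is not the code I actually run: `TooManyRequests` and the `fetch` callable are hypothetical stand-ins for whatever your HTTP client raises and does when the server rate-limits you, and the delays are arbitrary.

```python
import time
from typing import Callable


class TooManyRequests(Exception):
    """Hypothetical signal that the server answered 'too many requests'."""


def fetch_with_backoff(fetch: Callable[[], str],
                       max_attempts: int = 5,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Call fetch(), doubling the wait after each rate-limit error."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except TooManyRequests:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` in as a parameter is only there so the waiting can be stubbed out when testing; in a real crawler you would probably also add some random jitter so that parallel workers don't all retry at the same moment.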
>"Perhaps most surprising, the researchers found the median fund-month download amount is only four filings while the mean was 672, suggesting that relatively few funds are accessing vastly more information."
It's not good enough to be the best hedge fund manager. You need to make more profit than investing in a boring index fund, including the additional overhead of moving money around. And the research seems to show that most fund managers do not.
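On the median of four versus mean of 672 in that quote: a toy illustration of how a single heavy user can produce exactly that gap. The 200 fund-months below are made up by me purely to match the two quoted statistics; they are not the paper's data.

```python
from statistics import mean, median

# Hypothetical sample: 199 fund-months with 4 downloads each,
# plus one fund-month doing bulk automated retrieval.
downloads = [4] * 199 + [133_604]

print("median:", median(downloads))
print("mean:  ", mean(downloads))
```

One outlier out of 200 is enough to drag the mean from 4 up to 672 while leaving the median untouched, which is why the quoted numbers point to a handful of funds doing almost all of the downloading.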
Two things I can gather from this if it happened to hold true across the hedge fund universe:
1. Large shops employ a lot of fundamental analysts and larger funds only get large through outperformance.
2. Less on the fundamental side, funds regardless of size that have automated retrieval (and likely processing of public filings) probably have a higher than average disciplined investment process which leads to outperformance.
You can try my site Last10K.com which uses sentiment analysis to find & highlight positive & negative remarks in lengthy 10K/Q filings. Here's an example from Burlington Stores' 10K this week where we found 60 positive and 15 negative remarks by their management team:
The more interesting story is that if you can get this data (MITM or some other means), you could front run a fund by figuring out who they’re researching.
No, this is surprising because the only research they measured is accessing SEC filings from the government’s website. Since this is the definition of public information, in an efficient market there should be no comparative advantage to having it because every manager should already be using it.
The market isn't efficient. But more importantly, SEC filings are not the only data these firms use. It's just one source for which these researchers were able to capture access information.
Comparatively speaking, the SEC filings are just a blip next to the rest of the data gathered by quantitative funds. Any conclusion that could be drawn from what's contained in this report would be hopelessly misleading.
"As discussed in the introduction, public information may be profitable if sophisticated investors like hedge funds are skilled information processors. Alternatively, private information may be more valuable when used in conjunction with public information." (pg. 19)
The paper goes on to show evidence suggesting that the predominant channel is the latter complementary private information mechanism.
It does seem counterintuitive that publicly-available information does indeed generate alpha, but the real valuable insight here is that hedge funds are able to generate alpha from using public data synergistically with private data.
> It does seem counterintuitive that publicly-available information does indeed generate alpha, but the real valuable insight here is that hedge funds are able to generate alpha from using public data synergistically with private data.
Even more than that: the best funds can generate alpha by simply combining different sources of public data without necessarily using any nonpublic data.
There is a lot of debate between efficient and inefficient market theories. Most hedge funds would subscribe to the inefficient market theory and so it is not surprising they are eating up public data.
This doesn't run afoul of my understanding of the EMH. I would expect an equilibrium for the stock market to be when the amount of trading by well informed individuals is such that the marginal cost of another hour of research is equal to the marginal benefit of another hour of research. Not: when the marginal benefit of another hour of research is zero.
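To make that equilibrium concrete, here is a toy sketch. The benefit curve and the dollar figures are assumptions chosen purely for illustration: total benefit from `h` hours of research is `a * sqrt(h)` (diminishing returns) and every hour costs `c`.

```python
import math

# Hypothetical numbers, for illustration only.
a = 200.0   # assumed scale of the research benefit, in dollars
c = 50.0    # assumed cost of one more hour of research, in dollars

def marginal_benefit(h):
    """Derivative of the assumed total benefit a * sqrt(h)."""
    return a / (2.0 * math.sqrt(h))

# Equilibrium: marginal benefit equals marginal cost.
# a / (2 * sqrt(h)) = c  =>  h = (a / (2 * c)) ** 2
h_star = (a / (2.0 * c)) ** 2

print("equilibrium hours:", h_star)
print("marginal benefit there:", marginal_benefit(h_star))
```

At the equilibrium the marginal benefit equals the cost `c`, which is positive. Research stops paying for itself at that point, but the last hour still had value, so public information can still be worth something at the margin.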
The headline is terrible. The salient point from the article is:
> By mapping hedge fund IP addresses to those accessing financial filings, the team identified information gathering by hedge funds such as Renaissance Technologies and AQR
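For anyone wondering what that mapping step might look like mechanically, here is a rough sketch. It is not the paper's actual method: the fund names, IP blocks (reserved documentation ranges) and log rows are all hypothetical.

```python
import ipaddress
from collections import Counter

# Hypothetical mapping of funds to the IP blocks registered to them.
# These are reserved documentation ranges, not real fund addresses.
FUND_BLOCKS = {
    "Fund A": ipaddress.ip_network("192.0.2.0/24"),
    "Fund B": ipaddress.ip_network("198.51.100.0/24"),
}

def fund_for_ip(ip_string):
    """Return the fund whose block contains the address, or None."""
    ip = ipaddress.ip_address(ip_string)
    for fund, block in FUND_BLOCKS.items():
        if ip in block:
            return fund
    return None

# Hypothetical access-log rows: (client IP, form type requested).
access_log = [
    ("192.0.2.17", "10-K"),
    ("192.0.2.17", "4"),
    ("198.51.100.5", "10-Q"),
    ("203.0.113.9", "8-K"),  # not in any known block, so it is skipped
]

downloads_per_fund = Counter()
for ip, _form in access_log:
    fund = fund_for_ip(ip)
    if fund is not None:
        downloads_per_fund[fund] += 1

print(downloads_per_fund)
```

The hard part in practice is presumably building the fund-to-block table, not the lookup itself.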
What is interesting for me is that there is a possible correlation between people who mine more and more data - even from publicly available sources - and their returns.
Important to note that correlation is not causation
Also interesting to see that the SEC would give out IP address information of who accessed their filings.
If that IP data is from the SEC Form 4 filings, then it is interesting to see Renaissance and BlackRock using the bulk of their downloads on that source.
BUMP Form 4 is insider trading disclosures per the original paper. That is even more interesting.
It's like this: if you were in a super performant class where everyone supposedly read all the teacher's material until they knew it perfectly, then you would expect no correlation between the number of times the public class info is read and performance, since everyone is really performant and reads it enough times. You would expect that for such super performant students, the only relevant info would be what they get elsewhere, because that's less obvious than studying the class material, which everyone is expected to already know perfectly.
Turns out maybe the students aren't actually that good at knowing if they read the material sufficiently well, and you can get information about who will do better at the test just by looking at who looked at the class material the most.
The surprise is that the students are worse than we thought.
Suppose I studied for the SATs more than anyone by conjugating Japanese verbs for 16 hours a day. Would this help my scores? Surely the relevance of information is more important to investment returns than the quantity.
I don't think they can really draw this conclusion. There are a great number of additional factors at play and accessing public SEC records is a bit of a red herring. Especially considering that the funds they have listed as examples are primarily large high frequency algo funds (Renaissance and AQR) and are even less likely to find much utility in intermittent public filings. This may however be an indicator of diligence, but trying to link public information to a big increase in performance is somewhat far-fetched.
AQR are certainly not a high frequency trading firm, and Renaissance may do some high frequency trading, but they are not primarily a high frequency trading firm. Where did you get that idea?
The paper explicitly addresses the point that large-scale systematic collection of public records may be indicative of the kind of fund that outperforms, rather than an indication that the public records add alpha in and of themselves - it's in the abstract
> The effect is not due to differences in fund type as the results hold within-fund.
I think you need a more convincing argument than just saying "There are a great number of additional factors at play" and "Trying to link public information to an increase in performance is far fetched".
Most "black box" quantitative hedge funds still use massive amounts of public data, even public data that might appear somewhat obvious. They absolutely develop new insights from that data.
Yeah, to clarify I was agreeing that those firms certainly pull and use that data, but not necessarily in the same traditional fashion that paper seems to imply.
I understand he does, but our definitions are different. Firstly, they are not a buy-and-hold sort of fund. Their strategy is more momentum based, and in my experience, to be good at that you need a deep understanding of HFT in current market conditions. I'm not saying they're market makers, but I will say that they employ some sub-minute strategies - to me, that is high frequency (not super high frequency, but high frequency nonetheless).
To be clear, when you say AQR is primarily a "high frequency" firm, do you mean they are mostly position neutral with execution in (at most) a few microseconds?
Do you know how these algo funds keep a competitive edge for so many years (e.g. Renaissance 30+ years)? I am trying to understand their "kind" of product innovation as in what are the biggest factors they are continuously trying to improve?
Renaissance Technologies has completely automated the process of signal discovery.[1] They don't hire researchers to manually derive novel insights or trading models from data, and they don't really bother with exclusive sources of data. Instead, they hire researchers to improve methods for automatically processing vast amounts of arbitrary data and extracting profitable trading signals from it.
When most funds say they're "quantitative", what they really mean is that they use huge amounts of data to inform fundamentally manual trading strategies (this includes most of the places widely considered to be "top" firms). They develop trading algorithms, and those trading algorithms are often successful. But the algorithms are developed manually and then deployed. Their researchers and engineers actively seek out new sources of data and try to compete on novel sources of untapped information. But the reality of what happens is that they simply drown in the data. They can't clean it or process it nearly fast enough to maintain long term trading strategies, nor can they even begin to find a way to automate the trading strategy extraction. If you're working with hundreds of terabytes of data, you cannot selectively formulate hypotheses and test them. It's far too slow. You will find dramatically fewer novel insights than a fully automated process.
In other words, they're a step above traditional "fundamental" hedge funds, but they focus on the wrong problem (but not for lack of trying!). In contrast, the truly successful quant funds have automated the data processing and feature extraction pipeline end to end. The data is a pure abstraction to them. They don't bother with forming hypotheses and trying to find data to test them, they allow their algorithms to actively discover new correlations from the ground up. So many quantitative funds advertise how much data they work with, and how they have all these exotic sources of data at their disposal...but the data does not matter. The models for the data do not matter. The mathematics of efficiently processing that data are what matters.
As a result of their consistent profitability, most of the jobs you see listed for the really successful funds (if they have a website) are not "real" in the strictest sense of the word. You can apply to them, but they only keep active careers pages to attract the best researchers. Their only incentive to hire is to 1) keep someone who is actually exceptional from joining a competitor or 2) keep an academic researcher from re-discovering their work when they seem like they're getting close to it. This is why they primarily focus on quantitative PhDs in information theory, high energy physics and computational mathematics (especially information geometry).
To be completely frank, Renaissance is an outlier, but not just because of their returns. They're an outlier because of how public they are. Most of the funds with comparable returns not only don't take any outside investor capital, they only have 25 - 50 employees. They virtually never hire because they don't have to. If your work is fundamentally interesting, novel and applicable to what they're doing (even if you can't immediately see why), they will call you.
___________________________
1. Other more secretive (but equally successful) funds have done this, but they are much more under the radar.
As someone who has worked for one such secretive hedge fund in the past, and has for a long time been very interested in Renaissance, I'd be curious to know what your source of this information is. Any chance you could share some more insight?
1. I'm friends with multiple people who used to work at Renaissance, and I've directly spoken with folks who are currently there (among other, similar firms).
2. I've read the research published by professors and post-docs before they were hired.
3. I have first hand experience developing forecasts for various market research firms and many hedge funds. I've seen first hand what the difference is between the firms that say they're quantitative and the ones that are quantitative.
Not to discourage you (and you probably already know this) but: since you've already worked in the industry, Renaissance is very unlikely to take you seriously as a candidate. I'm not sure if that's what you meant when you said you're interested, but I figured I'd put it out there.
In any case, contrary to popular belief it's possible to connect the dots on what firms like RenTec do. But just having a high level idea of how they work isn't nearly enough to replicate their success. Their success relies on a large number of interdisciplinary scientists cooperating with the support of an incredibly specialized infrastructure. Getting the cliff notes on how they achieve e.g. dimensionality reduction doesn't come close to cutting it.
Would things like “dimensionality reduction” and other cutting edge ML techniques at least help? As in, help you become a better “manual” quant researcher or help you develop shitty (relative to RenTech), but still profitable strategies?
Do you have any other keywords? “Learning from few examples” comes to mind...
There is also something else: some funds have become so good in putting the right orders in the book that they don't need statistics so much to generate alpha.
Do you mean G Research? Not sure that one really qualifies as “secretive”; they were recruiting from my Master's programme at Oxford years ago (albeit under a different name).
They automated signal discovery in what kind of data? I have heard their use of unconventional data sources is the source of their success, not automated signal discovery.
No, every quantitative firm uses unconventional sources of data. That doesn't meaningfully differentiate them (at least, not anymore). For example, Two Sigma has an entire division devoted to sourcing and processing "alternative data." But Two Sigma is not at all comparable to firms like RenTec.
The funds I'm talking about (including RenTec) take in as much unstructured data as they can possibly find, almost indiscriminately, and they tune their processing pipeline to the point that it requires neither manual classification nor munging. In most cases, a trading strategy is sufficiently multidimensional that any particular set of data can be completely public. Exclusive data is helpful, but not required. In many cases people become too dependent on exclusive data and lose sight of the methodology.
This is precisely what I mean: many people think that these firms differentiate based on the sources of data they use. They do not. They differentiate on their ability to automatically extract signals hiding in plain sight. Whether or not the data is public makes very little difference, because the signals come from tens of thousands of indicators combined together.
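A toy illustration of why combining many weak indicators can matter more than any single data source. This is plain averaging on synthetic data, with every parameter assumed; it is not a claim about how RenTec or anyone else actually builds signals.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)            # fixed seed so the run is repeatable
n_days, n_indicators = 500, 400   # assumed sizes, chosen for speed

# Synthetic "returns" and indicators: each indicator is the return
# buried under noise ten times larger, so each one alone is nearly useless.
returns = [rng.gauss(0, 1) for _ in range(n_days)]
indicators = [
    [r + rng.gauss(0, 10) for r in returns]
    for _ in range(n_indicators)
]

# Combine by simple averaging across indicators, day by day.
combined = [
    sum(ind[d] for ind in indicators) / n_indicators
    for d in range(n_days)
]

best_single = max(pearson(ind, returns) for ind in indicators)
combined_corr = pearson(combined, returns)

print("best single indicator correlation:", round(best_single, 3))
print("combined indicator correlation:   ", round(combined_corr, 3))
```

Because the noise in each indicator is independent, averaging cancels most of it, and the combined series tracks the target far better than even the luckiest single indicator. Real indicators are correlated and non-stationary, so the real problem is much harder than this sketch suggests.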
You mentioned these firms are typically smaller in headcount, use private capital, and generate similar returns. Any insight on how they're targeting asset capacity for their meta-strategy approach? The Medallion fund, until recently, was capped around 3-4 billion [1]. I wonder: if the quality and quantity of data itself, and the models built on that data, are subordinate to the data processing IP as you say, can this type of trading be done at smaller scale, either as an individual or a handful of people? I've been out of the industry for a bit, but infrastructure and data costs were the main (financial) hurdles for startups, particularly in the high frequency space.
Are there any firms outside of finance using an analogous approach to data processing and signal construction?
No they can't. I think they have order of magnitude of 10bn of their own money in their good strategies (Medallion et al), and another 10bn-ish of external money in their bad/stale strategies (which are still among the better hedge funds and can charge 1%+10).
But reputedly they neither allow external money into Medallion nor do they let employees compound their existing stake in the fund but pay out ginormous 20%-40% annual dividends exactly because the strategy is not scalable.
Tradeworx is HFT. Yes, the premise of this paper is completely absurd. Drawing a line between overnight batch jobs for internal databases and performance is a quantum leap to say the least.
With those two simple adjustments, I saw that some big banks were in the hole by (combined) tens of billions of dollars.
The market prices for these banks made it clear that major investors either didn't read or didn't understand the data in the public filings.
Fun fact 1: My bank, WaMu, was itself maybe $0-$10B in the red, so I immediately withdrew $2K in a panic. I found that money hidden in my filing cabinet about 5 years ago.
Fun fact 2: What was even more crazy is that a famous private equity firm (I think TPG) had just dumped billions of dollars into WaMu but that still didn't fill the hole (and why would such a big firm be less sophisticated than a nobody like me?)
Fun fact 3: FDIC had ~$50B at the time, which maybe wouldn't cover the losses at the pessimistic end of my estimates.
Fun fact 4: In accounting classes, I identified 3 very shady areas of accounting: options, off-balance sheet entities, and pensions. I believe the first two have since been fixed but I think pension accounting is still very shady so... beware.
Disclaimer: I studied finance and accounting in school and read 10-Ks in my spare time in the early 2000s so I have some knowledge of accounting shenanigans. I also had some experience in real estate. But that's about all the expertise it took.
(Several edits made to improve readability.)