Q-Transformer (qtransformer.github.io)
238 points by jonbaer on Nov 30, 2023 | 63 comments


Being implemented as we speak by the always impressive LucidRains [1].

[1]: https://github.com/lucidrains/q-transformer


It's kinda funny how Lucidrains came back to make numerous commits to this repo following Q*.


Apparently this guy, like a bunch of others (e.g. https://github.com/ggerganov/ggml), is implementing transformers from papers for people who want them. Pretty cool.


The additional link is indeed very useful; however, with one exception, it is limited to inference, i.e. code for evaluating a model using somebody else’s trained weights. The original link has code for training (which can also be used for inference, of course). Inference is super important, so it makes a lot of sense to put effort into making it work well on all kinds of hardware; the papers typically focus on training and only then on evaluating/validating a new model.


I’m glad to see people in the open source community taking the initiative to implement these tools for the general public. It’s one thing to understand the high-level concepts in a paper, but another entirely to implement them efficiently. Some people just really have a knack for reading academic papers and implementing those ideas in software.


Lucidrains also created the much-missed EpicMafia, which still doesn’t have a good replacement after shutting down. Exceptionally skilled person!


This came out a while ago, but I can see why it's been reposted. It's another approach to searching the solution space with rewards and transformer language models, like XoT and others, as I pointed out in my last comment: https://news.ycombinator.com/item?id=38471388


It’s probably being reposted because people think it has to do with Q*


It's clear we're searching for the god algorithm of AI, just like physicists are searching for theory of everything. Are transformers the answer though?


It seems more likely that there will be multiple avenues to AGI, all with their strengths and weaknesses. But perhaps the "God AI" will be a multifaceted model composed of many different models acting in unison.


I could see something like the "modularity of mind" model of human consciousness, where multiple approaches are working on "subconscious" solutions to a given problem in parallel, with a top layer deciding which is appropriate at the moment.

My human brain doesn't use the same algorithm for learning to play a song on a piano as learning to play a new board game. I'm not an AI person, but it seems reasonable to imagine we'd have different "modules" to apply as needed.

AlphaGo probably sucks at conversation. ChatGPT can't play Go. The part of my brain writing this couldn't throw a baseball. The physics engine that lets me throw a baseball couldn't write this. Is there a reason we'd want or need one specific AI approach to be universally applicable?


The Bicameral Mind


Exactly, just as the true god has seven aspects [0].

[0] https://en.wikipedia.org/wiki/Themes_in_A_Song_of_Ice_and_Fi...


On the other hand, most of the original gods were parts of polytheistic pantheons. Maybe a bunch of models representing different identities and biases would be more useful: they could argue amongst themselves, presenting a fuller point of view, and users could become familiar with their particular perspectives.


Most likely there will be more than one specimen claiming to be AGI, from different groups, and they will be hard to compare.


There's really no God algorithm needed, just something good enough to assist with research of the next tier of hardware, energy, and code for AI.


"Algorithm" is too simple a framing, but yes - what does the system look like?

I think the transformer architecture, or something very similar with eventually-on-policy time-series forecasting in a Markov decision process, is actually the right answer, and it is what I have been trying to make progress on for a long time [1].

[1] https://kemendo.com/research/streaminference.html


That plus some Monte Carlo search as in AlphaZero makes a very strong (although computationally heavy) contender for 'alive' AI.
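For reference, the selection rule at the heart of AlphaZero's Monte Carlo tree search is PUCT, which trades a node's mean value off against a prior-weighted exploration bonus. A minimal sketch (the constants, action names, and numbers are illustrative, not AlphaZero's actual implementation):

```python
import math

def puct_select(stats, priors, c_puct=1.5):
    """Pick an action by PUCT score, as in AlphaZero-style MCTS.

    stats:  {action: (visit_count, total_value)}
    priors: {action: policy-network probability}
    """
    n_total = sum(n for n, _ in stats.values())
    def score(a):
        n, w = stats[a]
        q = w / n if n else 0.0                                 # exploit: mean value
        u = c_puct * priors[a] * math.sqrt(n_total) / (1 + n)   # explore: under-visited, high-prior
        return q + u
    return max(stats, key=score)

# A node where "right" is less visited but has a higher prior:
stats = {"left": (10, 6.0), "right": (2, 1.5)}
priors = {"left": 0.4, "right": 0.6}
print(puct_select(stats, priors))  # -> right
```

The 'computationally heavy' part is that this selection runs at every node of every simulated rollout, on top of a network evaluation per leaf.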


After these past few years of progress on multivariate stream forecasting, I really think it's going to be different implementations of the same basic streaming-data forecasting architecture.


We're searching for an efficient algorithm that leads to AGI. Given sufficient time and compute, I'm sure that we could get there with existing stuff, by accident, and we wouldn't realize it before moving on to the next thing... and there'd be a poor orphan AGI, lost in a Git repo, waiting for runtime.


"Given sufficient time and compute" covers up a lot, though. The ultimate God AGI that would be created through that sort of process would take the form of a large room filled with a whole lot of monkeys and typewriters.


No, it’s a much simpler architecture. Anything that evolved can’t require a really complicated niche structure.

The difficulty is the scale. Every synapse of a neuron is effectively a neuron itself, and every synapse acts on the synapses around it. So before you’ve even got to the neuron as a whole you’ve already got the equivalent of thousands of neurons and logic gates. Then the final result gets passed on to thousands more neurons.

I don’t know how you would recreate such complexity in programming. It’s not just the scale, it’s the flexibility of the structure.


There's this nice new paper ("Simplifying Transformer Blocks" https://arxiv.org/abs/2311.01906) about getting rid of all the unnecessary parts of a transformer. On their dataset, they manage to get rid of the layernorm, the skip connection, the value matrix, and the projection matrix.
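For intuition, here is a toy NumPy sketch (my own illustration, not the paper's code) of what attention looks like once the value and output-projection matrices are dropped: the attention weights mix the raw inputs directly.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def simplified_attention(x, w_q, w_k):
    # Queries and keys are learned projections; "values" are the raw
    # inputs (no W_V), and the result is used as-is (no output W_O).
    q, k = x @ w_q, x @ w_k
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return attn @ x  # attention weights mix the input tokens themselves

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))          # 4 tokens, model dim 8
w_q = rng.normal(size=(8, 8)) * 0.1
w_k = rng.normal(size=(8, 8)) * 0.1
out = simplified_attention(x, w_q, w_k)
print(out.shape)  # (4, 8)
```

The paper's actual blocks also restructure the skip connection and layernorm; this only shows the value-free/projection-free attention part.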


As it stands, each token processed by a transformer requires a constant amount of computation and energy. For an AGI system, this would imply the ability to solve problems of any complexity with a fixed amount of energy. But if this were true, it would essentially mean that P equals NP, a major theoretical breakthrough in computational complexity theory. IMO we are still missing something.
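For context on the constant-compute-per-token claim, a common back-of-the-envelope rule (the 7B parameter count below is an illustrative choice, not from the thread):

```python
# Rule of thumb for decoder-only transformers: a forward pass costs
# roughly 2 FLOPs per parameter per generated token, regardless of how
# "hard" the token is -- which is the parent comment's point.
params = 7e9                      # e.g. a 7B-parameter model (illustrative)
flops_per_token = 2 * params
print(f"{flops_per_token:.1e} FLOPs per token")  # 1.4e+10 FLOPs per token
```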


If they were we'd stop searching already, no?


Theory of everything is supposed to be beautiful.

Theory of AI is just going to be some weird network with a shit-ton of compute power, where the latter is more important to the outcome than the former.


No, not at all. Just looking for the next step of many, stacking S-curves atop each other.


33% success rate for opening a drawer?


That's a great start. Opening a drawer is a pretty hard problem.


Correct - as a reference for the OC, this is known as Moravec's paradox (https://en.wikipedia.org/wiki/Moravec%27s_paradox).


At least to me, the key point of the article:

>Encoded in the large, highly evolved sensory and motor portions of the human brain is a billion years of experience about the nature of the world and how to survive in it. The deliberate process we call reasoning is, I believe, the thinnest veneer of human thought, effective only because it is supported by this much older and much more powerful, though usually unconscious, sensorimotor knowledge. We are all prodigious olympians in perceptual and motor areas, so good that we make the difficult look easy. Abstract thought, though, is a new trick, perhaps less than 100 thousand years old. We have not yet mastered it. It is not all that intrinsically difficult; it just seems so when we do it.


I'm not convinced. There's a lot of creative thinking going on in evolutionary anthropology, but the proofs are meager.


Indeed: we have machines doing novel work in organic synthesis but none yet to empty a dishwasher.



Could you point me to the timestamp where it empties a dishwasher?


Interacting with the environment is the next frontier. See AI driving...


The other robot project posted here yesterday opened drawers at around 80%, though: https://news.ycombinator.com/item?id=38453047 And they did it in many different homes.


My robot can open drawers at near 100%, but it takes the form of a stick of dynamite.

Still working on closing the drawers afterwards, though...


Perhaps you could automate rebuilding the drawers with some IKEA/TaskRabbit APIs.


They should probably optimize for minimal friction force. If you're pulling the handle not entirely in the direction of movement, you're not doing it right.


How does this stack up to diffusion policy learning?

https://diffusion-policy.cs.columbia.edu/

Looking at the figures and videos, it seems... worse? I'm sort of surprised they didn't compare against it, but I guess they're trying to limit the discussion to purely reinforcement-learning methods.

EDIT: Ah, I see this is an older result that was published concurrently with the diffusion policy paper, so it's likely the authors didn't know about it in time to add the extra comparisons.


Just to be clear, the "scalable" part here means that, assuming a dataset of human demonstrations is available, it can learn in a closed loop better than before? In other words, am I right in understanding that the mapping from abstract actions to concrete steps in the state space is provided at a low-enough cost?


Like in workflow-guided exploration (https://arxiv.org/pdf/1802.08802.pdf), for example: if you said "Forward the email," the bot needs to sample from a set of actions, which in this case would be DOM elements and the JS APIs to call to click the right button to forward the email. This would be harder/more interesting to test. Can it learn to forward an email in the Firefox desktop browser and then do it on an iPhone?

It feels like once opening a drawer is learnt, the bot has a (world?) model of what all drawers might look like, so it can open different drawers in any world. But this might not generalize to web interfaces and, more specifically, to how to do those actions on those interfaces.

Not to take away from this paper's scope and achievements.


Funny how flies get on board with the latest trend these days.

First it was the LK-99 hype, now it is Q. Chasing hype like flies to a light bulb.


I really wish I could understand this. I can halfway get it loaded into my brain and then...


Could one of the Roomba companies now make those things recognize objects and avoid them, instead of using dead reckoning or RF-based obstacle avoidance?


Worth noting that this came from DeepMind in Sep-Oct of this year, so it's not related to OpenAI's "unfortunate" leak [0] of Q*.

[0] https://www.theverge.com/2023/11/29/23982046/sam-altman-inte...


I wouldn't go that far. Immediately after Q* leaked, Jimmy Apples pointed to this thread on Twitter: https://twitter.com/polynoamial/status/1676971508911198209

It's Noam Brown's (formerly of Meta AI) "I'm joining OpenAI" thread from back in June, where he talks about how he wants to bring some of the work from AlphaGo Zero etc. into OpenAI's AGI architecture.

It looks like this Q-Transformer work shares a lot of the same lineage as what Q* is supposed to be. This is Google's attempt at mushing Q-learning into transformer land, which is probably what Q* is as well. It's not the same thing, for sure, but they are at least sibling bodies of work.
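For readers who haven't touched the Q-learning half of that lineage: its core is the Bellman backup, which Q-Transformer scales up by replacing the Q-table with a transformer over observation/action histories. A tabular toy (the environment and all names are my own illustration, not from either paper):

```python
import numpy as np

# Toy tabular Q-learning on a 1-D corridor: states 0..4, reward at state 4.
n_states, n_actions = 5, 2      # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.5, 0.9
rng = np.random.default_rng(0)

for _ in range(500):
    s = 0
    while s != 4:
        # Epsilon-greedy: explore 20% of the time, else act greedily.
        a = rng.integers(n_actions) if rng.random() < 0.2 else int(Q[s].argmax())
        s_next = min(s + 1, 4) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == 4 else 0.0
        # Bellman backup: move Q(s, a) toward r + gamma * max_a' Q(s', a')
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q.argmax(axis=1))  # greedy policy heads right in every non-terminal state
```

Swapping the `Q` table for a learned function of the whole history is the part where transformers come in.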


Also, it is not Amazon Q, which was presented yesterday:

https://aws.amazon.com/de/q/


Also it's not Q from Star Trek, or Q stars which are an exotic form of matter immediately preceding the singularity of a black hole.


Also, it's not Android Q.


Or James Bond’s Q.


or Q.


To save you guys some time, this is a 2 month old paper and not made by OpenAI. It could be related to Q* (I don't know enough to determine that) but it's not OpenAI's Q*.


Operation reclaim the letter Q within a reasonable objective reality.


What's the letter?

What's the letter?

What's the letter?

The letter of the day is...

Q!


James Bond's secret weapon?


Is Q the new hottest letter right now?


To make this strand of discussion slightly more interesting, I hereby present you a page [1] with the history of Q-tips. Apparently, the "Q" stands for Quality :)

[1] https://www.qtips.com/about/


This does seem pretty similar to Q-learning. So yes, things-based-on-Qs are hot right now.


Q is nothing without the *


You're the Q to my *!


I love q so much.



