BTW can anybody share a link to a really simple explanation of Bayes' Theorem? I once saw one, it was about the size of a tweet and let you understand it in a matter of seconds; all the "super-duper intuitive explanations" around are actually too long and complex.
P(Hypothesis) is the prior probability of the Hypothesis being true, in other words the probability we gave to the Hypothesis before seeing any of the data we are using in the theorem. When new data is observed, we use Bayes' theorem to update our belief in the hypothesis, which in practice means multiplying our prior probability by a number that depends on how well the new data fits our hypothesis. More precisely:
evidence_factor = P(Data|Hypothesis)/P(Data)
So it is the ratio of how likely our data is if our hypothesis is true to how likely it is in general. If the data is more likely under our hypothesis, the probability of the hypothesis being true increases; if it is more likely in general (and therefore also more likely in case our hypothesis is not true; you can prove mathematically that those two statements are equivalent), then our belief in the hypothesis decreases.
TLDR: Prob(Hypothesis after I have seen new data) = Prob(Hypothesis before I saw the new data) * (how likely I am to see the data if my hypothesis is true, compared to in general)
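As a rough sketch of that update rule in code (all numbers made up for illustration): suppose the hypothesis is "this coin is biased toward heads with P(heads) = 0.8", we start at 50/50, and we then observe a single heads.

```python
# Bayes update via the evidence factor, with made-up numbers.
p_hypothesis = 0.5        # prior: 50/50 that the coin is biased
p_data_given_h = 0.8      # P(heads | biased coin)

# P(heads) in general = P(heads|biased)*P(biased) + P(heads|fair)*P(fair)
p_data = 0.8 * 0.5 + 0.5 * 0.5          # = 0.65

evidence_factor = p_data_given_h / p_data    # > 1: data fits the hypothesis
posterior = p_hypothesis * evidence_factor   # prior * evidence_factor

print(posterior)   # ~0.615: seeing heads nudged us toward "biased"
```

Since heads is more likely under the hypothesis (0.8) than in general (0.65), the factor is greater than 1 and our belief goes up, exactly as described above.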
It's a rule from statistics. In Bayesian speak this usually becomes
P(prior|data) = P(data|prior)P(prior)/P(data)
or
P(prior|data) proportional to P(data|prior)P(prior)
where P(prior|data) is also called the "posterior". The main idea is that you have some initial idea, "the prior", and you update it with your data to get the posterior.
This is probably not what you were looking for but this is it.
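For what it's worth, the "proportional to" form can be turned into an actual probability by normalizing the products P(data|prior)P(prior) over all candidate hypotheses; the normalizer is exactly P(data). A minimal sketch with made-up numbers:

```python
# Two candidate hypotheses with made-up priors and likelihoods.
priors = {"H1": 0.7, "H2": 0.3}
likelihoods = {"H1": 0.2, "H2": 0.9}   # P(data | hypothesis)

# Unnormalized posteriors: P(data|H) * P(H), the "proportional to" part.
unnormalized = {h: likelihoods[h] * priors[h] for h in priors}

# Normalizing constant: summing over hypotheses gives P(data).
z = sum(unnormalized.values())

posterior = {h: unnormalized[h] / z for h in priors}
print(posterior)   # posteriors sum to 1; H2 overtakes H1
```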
Where a (naive) frequentist might assume, for instance, that after a 90% accurate test comes back positive the hypothesis is likely to be true, a Bayesian would ask how likely it was to be true in the first place; all the test did was make it roughly ten times more likely, which may or may not make it probable.
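To put that test example in numbers (all values assumed for illustration): say the condition has a 1% base rate, the test catches 90% of true cases, and it gives a false positive 9% of the time.

```python
# Base-rate example for a "90% accurate" test, with assumed numbers.
p_h = 0.01                   # prior: 1% base rate
p_pos_given_h = 0.90         # P(positive | condition)
p_pos_given_not_h = 0.09     # P(positive | no condition)

# Overall probability of a positive result.
p_pos = p_pos_given_h * p_h + p_pos_given_not_h * (1 - p_h)

# Bayes' theorem: probability of the condition given a positive test.
posterior = p_pos_given_h * p_h / p_pos
print(posterior)   # ~0.09: roughly ten times the 1% prior, still unlikely
```

The positive result multiplied the probability by about ten, yet the hypothesis is still improbable, which is the Bayesian's point.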
To update the probability you assign to a prior after you make some observation you rescale it by P(observation | prior) / P(observation). This scaling factor is proportional to how well the prior predicted the observation and inversely proportional to how well the "average prior" predicted the observation.
So a prior's probability increases to the degree that it predicts an observation better than alternative priors.
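A quick sketch of that rescaling view (hypothesis names and numbers are made up): each candidate gets multiplied by P(observation | hypothesis) / P(observation), so the one that predicted the observation better than average gains probability and the other loses it.

```python
# Rescaling two candidate hypotheses by how well each predicted the observation.
priors = {"A": 0.5, "B": 0.5}
p_obs_given = {"A": 0.6, "B": 0.2}   # P(observation | hypothesis)

# The "average" prediction across hypotheses, i.e. P(observation).
p_obs = sum(p_obs_given[h] * priors[h] for h in priors)

# Scaling factor per hypothesis: > 1 if it beat the average, < 1 otherwise.
factors = {h: p_obs_given[h] / p_obs for h in priors}
updated = {h: priors[h] * factors[h] for h in priors}

print(factors)   # A gets a factor > 1, B a factor < 1
print(updated)   # A's probability rises, B's falls
```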
You should define what kind of simplicity you're asking for: an easy-to-understand explanation or a simple formula. I actually think that J Pearl & Co's The Book of Why provides a good example of the former.