
> so having efficient implementations of probabilistic languages means we could do much better at accurate analysis of data

What does a compiler-level implementation of probabilistic analysis have to do with accuracy?



General inference on a graph of dependent variables is NP-hard to do exactly, so you use a sampling-based method that converges towards the right solution and stop when it looks about right. However, many smaller problems have an exact analytic solution.
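To make the contrast concrete, here is a minimal Python sketch (the coin-bias model, priors, and tuning constants are my own illustration, not anything from the thread): a Beta-Binomial model is conjugate, so its posterior mean has a closed form, while a random-walk Metropolis sampler only converges towards it.

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.binomial(1, 0.7, size=100)   # 100 flips of a biased coin
    heads, n = int(data.sum()), len(data)

    # Exact route: Beta(1, 1) prior + Binomial likelihood => Beta posterior.
    alpha, beta = 1 + heads, 1 + (n - heads)
    exact_mean = alpha / (alpha + beta)

    # Sampling route: random-walk Metropolis on the same posterior.
    def log_post(p):
        if not 0.0 < p < 1.0:
            return -np.inf
        return heads * np.log(p) + (n - heads) * np.log(1 - p)

    p, samples = 0.5, []
    for _ in range(20_000):
        prop = p + rng.normal(scale=0.05)
        if np.log(rng.uniform()) < log_post(prop) - log_post(p):
            p = prop
        samples.append(p)

    print(exact_mean)                  # exact, instant
    print(np.mean(samples[5_000:]))    # approximate, after burn-in

Both lines print roughly the same number, but the analytic one is exact and instant, while the sampler's answer depends on step size, chain length, and burn-in.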

Real problems in the wild are a mix of both types: an intractable part and a part you can solve precisely.

In an ideal world you would use the iterative methods ONLY for the parts that don't have an exact solution available. In practice, once you have developed the general iterative solution, you might as well use it for the easy parts too; writing two inference implementations just burdens you with more development work and more scope for bugs.

A language that allowed inference methods to be switched at no development cost would be amazing.
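Something like the following hypothetical sketch, where the model is declared once and the inference backend is a plug-in. Model, exact_beta_binomial, and metropolis are invented names for illustration, not any real library's API.

    import numpy as np

    class Model:
        # A model is just its log-posterior; backends share this interface.
        def __init__(self, log_post):
            self.log_post = log_post

    def exact_beta_binomial(heads, n):
        # Analytic posterior mean under a Beta(1, 1) prior: no iteration.
        return (1 + heads) / (2 + n)

    def metropolis(model, steps=20_000, seed=0):
        # Generic random-walk sampler: works for any model, exact for none.
        rng = np.random.default_rng(seed)
        p, samples = 0.5, []
        for _ in range(steps):
            prop = p + rng.normal(scale=0.05)
            if np.log(rng.uniform()) < model.log_post(prop) - model.log_post(p):
                p = prop
            samples.append(p)
        return float(np.mean(samples[steps // 4:]))

    def coin_log_post(p, heads=70, n=100):
        if not 0.0 < p < 1.0:
            return -np.inf
        return heads * np.log(p) + (n - heads) * np.log(1 - p)

    # Same model, two backends: switching costs one line, not a rewrite.
    print(exact_beta_binomial(70, 100))
    print(metropolis(Model(coin_log_post)))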


Aside from what other commenters already said, inference is fiddly and easy to screw up when you have to do it yourself. Errors in statistical analysis are not uncommon, even in peer-reviewed papers. Automating it is good for two reasons: first, it reduces the error rate in the kinds of analyses already being done; second, it opens up the option of doing much more complicated analyses, either with a larger number of hypotheses or with hypotheses of greater complexity, which would otherwise be ignored as infeasible.

Both of those result in the potential for greater confidence in the accuracy of conclusions reached from experimental analysis.


It makes it cheaper to experiment with different models.


Still not really a statement about accuracy...


Experimenting with more models increases the likelihood you'll find a good model. A good model is, by definition, a more accurate representation of your domain than a bad model. It will also tend to generate more accurate predictions, if that's what you care about.
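As an illustration (the data-generating process and candidate models here are invented), when inference is automatic, "try another model" reduces to a loop over candidates scored on held-out data:

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.uniform(-3, 3, size=200)
    y = 0.5 * x**2 + rng.normal(scale=0.5, size=200)   # truth is quadratic
    x_tr, y_tr = x[:150], y[:150]                      # training split
    x_te, y_te = x[150:], y[150:]                      # held-out split

    def heldout_mse(degree):
        # Fit a polynomial model of the given degree, score on held-out data.
        coeffs = np.polyfit(x_tr, y_tr, degree)
        return float(np.mean((y_te - np.polyval(coeffs, x_te)) ** 2))

    # Each extra candidate is one more loop iteration, not a new codebase;
    # the quadratic should win on held-out error.
    for degree in (1, 2, 3):
        print(degree, heldout_mse(degree))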

As a secondary point, re-implementing inference code for each new model makes it almost certain that there are bugs in said code. So even without changing the model, automatically generated inference code is likely to have fewer bugs and thus give more accurate inferences than hand-written code (assuming it runs to convergence; naturally there are lots of scenarios in which naively generated code will be slower to converge than something hand-tuned).


I don't consider bugs in code a matter of the model's accuracy. And while you can compare the accuracy of various models, whether a compiler does the inference or you do it by hand doesn't change the accuracy. I also don't subscribe to the idea that arriving at the right model for a scientific or statistical phenomenon is a random event.



