
> so having efficient implementations of probabilistic languages means we could do much better at accurate analysis of data

What does a compiler-level implementation of probabilistic analysis have to do with accuracy?



General inference on a graph of dependent variables is NP-hard to do exactly, so you use a sampling-based method that converges towards the right solution and stop when it looks about right. However, many smaller problems have an exact analytic solution.
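To make the contrast concrete, here is a minimal Python sketch (the coin-bias model, priors, and tuning constants are my own illustration, not anything from the thread): a Beta-Binomial model is conjugate, so its posterior mean has a closed form, while a random-walk Metropolis sampler only converges towards it.

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.binomial(1, 0.7, size=100)   # 100 flips of a biased coin
    heads, n = int(data.sum()), len(data)

    # Exact route: Beta(1, 1) prior + Binomial likelihood => Beta posterior.
    alpha, beta = 1 + heads, 1 + (n - heads)
    exact_mean = alpha / (alpha + beta)

    # Sampling route: random-walk Metropolis on the same posterior.
    def log_post(p):
        if not 0.0 < p < 1.0:
            return -np.inf
        return heads * np.log(p) + (n - heads) * np.log(1 - p)

    p, samples = 0.5, []
    for _ in range(20_000):
        prop = p + rng.normal(scale=0.05)
        if np.log(rng.uniform()) < log_post(prop) - log_post(p):
            p = prop
        samples.append(p)

    print(exact_mean)                  # exact, instant
    print(np.mean(samples[5_000:]))    # approximate, after burn-in

Both lines print roughly the same number, but the analytic one is exact and instant, while the sampler's answer depends on step size, chain length, and burn-in.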

Real problems in the wild are a mix of both types: an intractable part and a part you can solve precisely.

In an ideal world you would use the iterative methods ONLY for the parts that don't have an exact solution available. In practice, once you have developed the general iterative solution, you might as well use it for the easy parts too; writing two inference implementations just burdens you with more development work and more scope for bugs.

A language that allowed inference methods to be switched at no development cost would be amazing.
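Something like the following hypothetical sketch, where the model is declared once and the inference backend is a plug-in. Model, exact_beta_binomial, and metropolis are invented names for illustration, not any real library's API.

    import numpy as np

    class Model:
        # A model is just its log-posterior; backends share this interface.
        def __init__(self, log_post):
            self.log_post = log_post

    def exact_beta_binomial(heads, n):
        # Analytic posterior mean under a Beta(1, 1) prior: no iteration.
        return (1 + heads) / (2 + n)

    def metropolis(model, steps=20_000, seed=0):
        # Generic random-walk sampler: works for any model, exact for none.
        rng = np.random.default_rng(seed)
        p, samples = 0.5, []
        for _ in range(steps):
            prop = p + rng.normal(scale=0.05)
            if np.log(rng.uniform()) < model.log_post(prop) - model.log_post(p):
                p = prop
            samples.append(p)
        return float(np.mean(samples[steps // 4:]))

    def coin_log_post(p, heads=70, n=100):
        if not 0.0 < p < 1.0:
            return -np.inf
        return heads * np.log(p) + (n - heads) * np.log(1 - p)

    # Same model, two backends: switching costs one line, not a rewrite.
    print(exact_beta_binomial(70, 100))
    print(metropolis(Model(coin_log_post)))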


Aside from what other commenters already said, inference is fiddly and easy to screw up when you have to do it yourself. Errors in statistical analysis are not uncommon, even in peer-reviewed papers. Automating it is good for two reasons: first, it reduces the error rate in the kinds of analyses already being done; second, it opens up the option of doing much more complicated analyses, either with a larger number of hypotheses or with hypotheses of greater complexity, which would otherwise be ignored as infeasible.

Both of those result in the potential for greater confidence in the accuracy of conclusions reached from experimental analysis.


It makes it cheaper to experiment with different models.


Still not really a statement about accuracy...


Experimenting with more models increases the likelihood you'll find a good model. A good model is, by definition, a more accurate representation of your domain than a bad model. It will also tend to generate more accurate predictions, if that's what you care about.
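As an illustration (the data-generating process and candidate models here are invented), when inference is automatic, "try another model" reduces to a loop over candidates scored on held-out data:

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.uniform(-3, 3, size=200)
    y = 0.5 * x**2 + rng.normal(scale=0.5, size=200)   # truth is quadratic
    x_tr, y_tr = x[:150], y[:150]                      # training split
    x_te, y_te = x[150:], y[150:]                      # held-out split

    def heldout_mse(degree):
        # Fit a polynomial model of the given degree, score on held-out data.
        coeffs = np.polyfit(x_tr, y_tr, degree)
        return float(np.mean((y_te - np.polyval(coeffs, x_te)) ** 2))

    # Each extra candidate is one more loop iteration, not a new codebase;
    # the quadratic should win on held-out error.
    for degree in (1, 2, 3):
        print(degree, heldout_mse(degree))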

As a secondary point, re-implementing inference code for each new model makes it almost certain that there are bugs in said code. So even without changing the model, automatically generated inference code is likely to have fewer bugs and thus give more accurate inferences than hand-written code (assuming it runs to convergence; naturally there are lots of scenarios in which naively generated code will be slower to converge than something hand-tuned).


I don't consider bugs in code a matter of the model's accuracy. And while you can compare the accuracy of various models, whether a compiler does the inference or you do it by hand doesn't change the accuracy. I also don't subscribe to the idea that arriving at the right model for a scientific or statistical phenomenon is a random event.



