> This technique is often called "abstract interpretation" and JET internally uses Julia's native type inference implementation, so it can analyze code as fast/correctly as Julia's code generation.
If I understand that snippet correctly, they don't prove the absence of type errors (as a classical type checker does, e.g., in Haskell) but rather interpret the code to figure out whether pieces will (or even just might?) throw an error.
This is definitely a practical approach for a highly dynamic language like Julia (hint, python guys), but it has to leave gaps. There will be undetected type errors even in checked code. Whether that matters in practice, I don't know, though.
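To make the distinction concrete, here is a minimal Julia sketch (the function is invented for illustration): the bad branch is only an error if it actually executes, which is why this style of analysis reasons about which calls *would* throw rather than proving a program well-typed up front. The `@report_call` usage in the comment is from JET's documented API.

```julia
# A "type error" that only surfaces if the bad branch actually runs.
function describe(x)
    if x isa Number
        return x + 1
    else
        return x * 2   # MethodError if x is, say, a Symbol
    end
end

describe(41)        # fine, returns 42
# describe(:oops)   # would throw a MethodError at runtime
# With JET installed:
#   using JET
#   @report_call describe(:oops)   # flags the error without running the code
```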
This package basically exposes information that the standard Julia compiler already knows internally (extracted using some hooks I added last year to make the compiler more extensible) and uses for optimization. As such, it has quite good information about whether or not its analysis is complete, because we need that information to generate correct code. It is very possible to just error when the information is incomplete, which we do e.g. when compiling for GPU or TPU. In such a mode, the absence of type errors (or any other kind of error you wish to exclude) would be proven. Of course, Julia's type system is designed for semantics specification, not proofs, so it's not really what you'd want to use there. I think the most promising way to go is basically to let the users specify their own correctness criteria (for example, some people really care about correctness of array dimensions), and then just go to a full theorem proving system for that. I really like F* in this area as a design point and have played with some ideas in that direction. A package like JET.jl is of course a good step in this direction, since it can at least make use of the static information that the compiler already needs to prove for its internal optimization purposes.
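The "correctness of array dimensions" criterion can at least be approximated in Julia's type domain today, even without a theorem prover. A hedged sketch (`SVec` is an invented name, and this is far weaker than what a system like F* offers):

```julia
# Hypothetical: encode vector length as a type parameter, so a
# dimension mismatch becomes a method error the compiler can see.
struct SVec{N}
    data::NTuple{N,Float64}
end

# Only defined when both lengths agree:
add(a::SVec{N}, b::SVec{N}) where {N} = SVec{N}(a.data .+ b.data)

add(SVec((1.0, 2.0)), SVec((3.0, 4.0)))   # OK: lengths match
# add(SVec((1.0,)), SVec((3.0, 4.0)))     # MethodError: no matching `add`
```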
How do you reconcile this:
"Julia's type system is designed for semantics specification, not proofs, so it's not really what you'd want to use there."
With
"I think the most promising way to go is basically to let the users specify their own correctness criteria (for example, some people really care about correctness of array dimensions), and then just go to a full theorem proving system"
Oh, you just don't do correctness proofs over the same type system that is used for dispatch (i.e. "the Julia type system"). We already use an expanded lattice internally for reasoning about Julia code, so there's really no problem just doing something completely general.
As far as I know, currently the non-julia-types type lattice is hardcoded. But even if that's the case, that's not fundamental to the design.
> If [you make the type lattice user extendable], how do you get code reuse?
What kind of code reuse do you mean? This alternative type lattice is not supposed to change the semantics of any Julia program, aside from rejecting programs that would otherwise have semantics in Julia.
I am guessing it is the thing where, if you look at `@code_typed`, you will sometimes see values annotated not just with types but with `Const(42)`, which shows where constant folding is occurring.
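A hedged illustration (the exact printed form varies by Julia version):

```julia
# Constant propagation: the compiler can prove h() always returns 42.
g(x) = 2 * x
h() = g(21)

h()   # 42
# In the REPL, `@code_typed h()` typically shows the body collapsed to
# `return 42`, and `@code_warntype` may display intermediate values
# annotated as `Core.Const(42)` where constant folding occurred.
```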
Type errors are whatever your type system defines them to be. It's perfectly reasonable to define a type error as "code that would throw an error if executed".
I think abstract interpretation is more abstract than you are thinking. It essentially means stepping through the code only keeping track of "abstract" properties of the program state. I.e. keeping track of variable types rather than values. This does mean that a function can have a different type based on how it's called, but there's an entire class of "flow sensitive" type systems with that property.
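To make that concrete, here is a toy sketch of abstract interpretation over a tiny expression language, tracking types instead of values (all names are invented; real Julia inference works on lowered IR, not an AST like this):

```julia
# Concrete domain: values. Abstract domain: Julia types.
abstract type Exp end
struct Lit <: Exp; val::Any; end
struct Add <: Exp; l::Exp; r::Exp; end

# Concrete interpreter: computes actual values.
interp(e::Lit) = e.val
interp(e::Add) = interp(e.l) + interp(e.r)

# Abstract interpreter: steps through the same structure, but only
# tracks types, asking Julia's own inference what `+` would return.
infer(e::Lit) = typeof(e.val)
function infer(e::Add)
    ts = Base.return_types(+, Tuple{infer(e.l), infer(e.r)})
    isempty(ts) ? Union{} : ts[1]   # Union{} ≈ "this would throw"
end
```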
> Type errors are whatever your type system defines them to be. It's perfectly reasonable to define a type error as "code that would throw an error if executed".
Exactly this. One of my frustrations with the Erlang VM is that there is no well-defined meaning to the "anointed" type system defined by Dialyzer, when they should go with "code guaranteed not to throw when executed".
Yeah, that's true. I think there are three major limitations.
- First, code that is intentionally (or out of neglect) opaque to the compiler. For example, someone may simply forget to write type-stable code, or someone might intentionally create a function barrier to specialize some code.
- Second, Julia has no standard way of encoding constraints on types. That's good, because someone else can make up a new type you never thought about and throw it into your functions, and if it has implemented all the right methods, it will work. But it also means you can never guarantee your code will work for any type.
- Third, Julia encodes less information in the type system - e.g. the use of result types is not common.
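A sketch of the first limitation, the (sometimes deliberate) function barrier: the container below is a `Vector{Any}`, so a static pass can't know which `kernel` method the call will hit (all names here are invented):

```julia
kernel(x::Int) = x + 1
kernel(x::Float64) = x * 2.0

# The element type is only known at runtime, so the call to `kernel`
# is a dynamic dispatch: opaque to static inference, even though
# `kernel` itself compiles to fast, specialized code once called.
function process(data::Vector{Any})
    x = first(data)
    return kernel(x)
end
```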
Yes, but if you are making a library, there is no way you can statically confirm that your function will work for all types that conform to some interface. All you can do is test it for a few concrete types that you have. A statically typed language allows you to do that.
Doing analysis by default also seems expensive; static types, by contrast, can actually be quite cheap, since the compiler has the confidence to say "No, this is your fault" to the programmer.
This is a really cool package. It's been known for a while now that such static analysis is possible in Julia, but the new compiler hooks in 1.6 make this easier, and now someone has come along and gotten it done!
* Would it be possible to apply the analysis to just a single function in the REPL? E.g. something like `analyze(myfunction, (Int, Float64, String))`.
* Would it be possible to toggle whether the profiler errors/warns on dynamic dispatch?
It is recreated on every call to `fib`, but the `_fib`s get to reuse it. Using a cache makes `_fib` O(N) to compute, so it gives a performance benefit as long as you aren't calling `fib(n)` with very small values of `n`.
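A hedged reconstruction of the pattern being discussed (the original code isn't shown in this thread, so the names and signatures are assumed):

```julia
# Public entry point: creates a fresh cache on every call.
fib(n::Int) = _fib(n, Dict{Int,Int}())

# Internal worker: threads the cache through the recursion, so each
# subproblem is computed at most once, making a single fib(n) call O(n).
function _fib(n::Int, cache::Dict{Int,Int})
    n <= 1 && return n
    get!(cache, n) do
        _fib(n - 1, cache) + _fib(n - 2, cache)
    end
end
```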