Hi. Since it's Lisp and FP week on LOR anyway, I'll ask a question.
In which language (runtime) is the cost of calling FFI functions the highest? The lowest?
inb4: the trivial cases C -> C and C++ -> C are of no interest.
This post was prompted by an email some guy sent to the Racket mailing list (a call takes 150 ns from Racket and 3 ns from C):
One of important aspects for me is efficiency of Foreign Function Interface. Unfortunately, it seems that FFI is quite slow.
Here is the code I have:
-- test.c --
void do_test(void)
{}
-- test.rkt --
#lang racket/base
(require
 ffi/unsafe
 ffi/unsafe/define)
(define-ffi-definer define-t (ffi-lib "libtest"))
(define-t do_test (_fun -> _void))
(define (do_benchmark)
  (time (for ([i (in-range 1000000)])
          (do_test))))
(for ([i (in-range 10)])
  (do_benchmark))
-- Makefile --
libtest.so: test.o
	gcc -fPIC -shared -pthread -o libtest.so test.o
test.o: test.c
	gcc test.c -fPIC -shared -pthread -c -O2 -o test.o
clean:
	rm -f test.o libtest.so
Running the test suggests that a call to "do_test" costs about 150 nanoseconds. I would expect something not larger than 5 nanoseconds. A test where C program calls this function shows the call costs 3 nanoseconds.
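The email does not include the C-side test, so here is only a rough guess at what such a measurement might look like. The names `ns_per_call` and the local stand-in for `do_test` are my own assumptions, and the `noinline` attribute and empty `asm` statement are GCC/Clang-specific tricks to keep the compiler from removing the empty call. In the original setup `do_test` lives in libtest.so, so the real call also goes through the dynamic linker's indirection, which this sketch does not reproduce.

```c
#include <assert.h>
#include <time.h>

/* Stand-in for do_test from test.c (assumption: the real one is in libtest.so).
   noinline plus the empty asm keep GCC/Clang from optimizing the call away. */
__attribute__((noinline)) void do_test(void)
{
    __asm__ volatile("");
}

/* Average CPU time of one call to fn, in nanoseconds, over `iterations` calls.
   Uses the standard clock() for portability; its resolution is coarse, so the
   result is only meaningful for a large iteration count. */
double ns_per_call(void (*fn)(void), long iterations)
{
    assert(iterations > 0);

    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        fn();
    clock_t end = clock();

    double seconds = (double)(end - start) / (double)CLOCKS_PER_SEC;
    return seconds * 1e9 / (double)iterations;
}
```

Calling `ns_per_call(do_test, 1000000)` from a `main` and printing the result mirrors the million-iteration loop in test.rkt. The number covers the loop overhead as well as the call itself, so treat it as an upper estimate.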
P.S. I saw that Matthew Flatt replied there: he trimmed off a third of the time, and 5x is the ceiling for the current design; going further would mean getting into the JIT.
I haven't particularly tried to make foreign calls go faster, so I expect there's room for improvement. A quick profile suggested an easy way to trim 1/3 of the time, so I've done that (pushed to the git repo).
In my profile, 15-20% of the time is spent in libffi's wrappers, though, so a 5x improvement is probably an upper bound on the current design --- leaving still a 10x difference between a direct C call and Racket-to-C call. To do better, we might be able to use the JIT infrastructure to generate more direct calls for simple function types, but I'm not sure how general we can make that.
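The numbers in the reply can be checked with a bit of arithmetic. This is only a sketch of my reading of it: I assume the 15-20% refers to the original 150 ns measurement and that the libffi share is the part that cannot be removed. The function names `floor_ns` and `max_speedup` are made up for illustration.

```c
#include <assert.h>

/* Per-call time that remains if only the irreducible share of the
   baseline is left (assumption: that share is the libffi wrapper time). */
double floor_ns(double baseline_ns, double fixed_fraction)
{
    return baseline_ns * fixed_fraction;
}

/* Best possible speedup when fixed_fraction of the time cannot be removed. */
double max_speedup(double fixed_fraction)
{
    assert(fixed_fraction > 0.0);
    return 1.0 / fixed_fraction;
}
```

With a 20% share, `floor_ns(150.0, 0.20)` gives 30 ns and `max_speedup(0.20)` gives 5, which matches the "5x" upper bound; 30 ns against the 3 ns direct C call is the remaining 10x gap. With a 15% share the floor is 22.5 ns and the bound is about 6.7x.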