I've seen people say that Lua has very few libraries and that you end up reinventing the wheel yourself. In fact, that's not the case. Today I'll show you a bit of magic that takes the pain out of wheel-building. That magic is FFI (Foreign Function Interface).

I'm not going to explain how FFI works in detail. Simply put, FFI implements a cross-language binary interface. Its advantages are efficiency and convenience. The downside of calling through the ABI directly is just as obvious: if anything goes wrong, the process crashes outright. So check your data carefully before it crosses the boundary.

Today we'll pick a C library and call it from LuaJIT through FFI as a demonstration. What's this? A high-performance base64 library? Let's use this repo as the example: https://github.com/aklomp/base64. It is a Base64 encoding/decoding library written in C with SIMD support. You can run the library's benchmark like this:

    karminski@router02:/data/works/base64$ make clean && SSSE3_CFLAGS=-mssse3 AVX2_CFLAGS=-mavx2 make && make -C test
    ...
    Testing with buffer size 100 KB, fastest of 10 * 100
    AVX2    encode  12718.47 MB/sec
    AVX2    decode  14542.81 MB/sec
    plain   ...
    SSSE3   encode  7269.55 MB/sec
    SSSE3   decode  8173.10 MB/sec
    ...

My CPU is an Intel(R) Xeon(R) CPU E3-1246 v3 @ 3.50GHz. As you can see, on a CPU with AVX2 support the throughput exceeds 12 GB/s, which is impressive: even an ordinary SSD can't keep up.

The first step is to build this repo as a shared library. The repo doesn't provide a shared-library build option, so we work a little magic on the project's Makefile:

    CFLAGS += -std=c99 -O3 -Wall -Wextra -pedantic

    # Set OBJCOPY if not defined by environment:
    OBJCOPY ?= objcopy

    OBJS = \
      lib/arch/avx2/codec.o \
      lib/arch/generic/codec.o \
      lib/arch/neon32/codec.o \
      lib/arch/neon64/codec.o \
      lib/arch/ssse3/codec.o \
      lib/arch/sse41/codec.o \
      lib/arch/sse42/codec.o \
      lib/arch/avx/codec.o \
      lib/lib.o \
      lib/codec_choose.o \
      lib/tables/tables.o

    SOOBJS = \
      lib/arch/avx2/codec.so \
      lib/arch/generic/codec.so \
      lib/arch/neon32/codec.so \
      lib/arch/neon64/codec.so \
      lib/arch/ssse3/codec.so \
      lib/arch/sse41/codec.so \
      lib/arch/sse42/codec.so \
      lib/arch/avx/codec.so \
      lib/lib.so \
      lib/codec_choose.so \
      lib/tables/tables.so

    HAVE_AVX2   = 0
    HAVE_NEON32 = 0
    HAVE_NEON64 = 0
    HAVE_SSSE3  = 0
    HAVE_SSE41  = 0
    HAVE_SSE42  = 0
    HAVE_AVX    = 0

    # The user should supply compiler flags for the codecs they want to build.
    # Check which codecs we're going to include:
    ifdef AVX2_CFLAGS
      HAVE_AVX2 = 1
    endif
    ifdef NEON32_CFLAGS
      HAVE_NEON32 = 1
    endif
    ifdef NEON64_CFLAGS
      HAVE_NEON64 = 1
    endif
    ifdef SSSE3_CFLAGS
      HAVE_SSSE3 = 1
    endif
    ifdef SSE41_CFLAGS
      HAVE_SSE41 = 1
    endif
    ifdef SSE42_CFLAGS
      HAVE_SSE42 = 1
    endif
    ifdef AVX_CFLAGS
      HAVE_AVX = 1
    endif
    ifdef OPENMP
      CFLAGS += -fopenmp
    endif

    .PHONY: all analyze clean

    all: bin/base64 lib/libbase64.o lib/libbase64.so

    bin/base64: bin/base64.o lib/libbase64.o lib/libbase64.so
    	$(CC) $(CFLAGS) -o $@ $^

    lib/libbase64.o: $(OBJS)
    	$(LD) -r -o $@ $^
    	$(OBJCOPY) --keep-global-symbols=lib/exports.txt $@

    lib/libbase64.so: $(SOOBJS)
    	$(LD) -shared -fPIC -o $@ $^
    	$(OBJCOPY) --keep-global-symbols=lib/exports.txt $@

    lib/config.h:
    	@echo "#define HAVE_AVX2 $(HAVE_AVX2)" > $@
    	@echo "#define HAVE_NEON32 $(HAVE_NEON32)" >> $@
    	@echo "#define HAVE_NEON64 $(HAVE_NEON64)" >> $@
    	@echo "#define HAVE_SSSE3 $(HAVE_SSSE3)" >> $@
    	@echo "#define HAVE_SSE41 $(HAVE_SSE41)" >> $@
    	@echo "#define HAVE_SSE42 $(HAVE_SSE42)" >> $@
    	@echo "#define HAVE_AVX $(HAVE_AVX)" >> $@

    $(OBJS): lib/config.h
    $(SOOBJS): lib/config.h

    # o
    lib/arch/avx2/codec.o: CFLAGS += $(AVX2_CFLAGS)
    lib/arch/neon32/codec.o: CFLAGS += $(NEON32_CFLAGS)
    lib/arch/neon64/codec.o: CFLAGS += $(NEON64_CFLAGS)
    lib/arch/ssse3/codec.o: CFLAGS += $(SSSE3_CFLAGS)
    lib/arch/sse41/codec.o: CFLAGS += $(SSE41_CFLAGS)
    lib/arch/sse42/codec.o: CFLAGS += $(SSE42_CFLAGS)
    lib/arch/avx/codec.o: CFLAGS += $(AVX_CFLAGS)

    # so
    lib/arch/avx2/codec.so: CFLAGS += $(AVX2_CFLAGS)
    lib/arch/neon32/codec.so: CFLAGS += $(NEON32_CFLAGS)
    lib/arch/neon64/codec.so: CFLAGS += $(NEON64_CFLAGS)
    lib/arch/ssse3/codec.so: CFLAGS += $(SSSE3_CFLAGS)
    lib/arch/sse41/codec.so: CFLAGS += $(SSE41_CFLAGS)
    lib/arch/sse42/codec.so: CFLAGS += $(SSE42_CFLAGS)
    lib/arch/avx/codec.so: CFLAGS += $(AVX_CFLAGS)

    %.o: %.c
    	$(CC) $(CFLAGS) -o $@ -c $<

    %.so: %.c
    	$(CC) $(CFLAGS) -shared -fPIC -o $@ -c $<
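The earlier warning about checking data before it crosses the FFI boundary is worth making concrete. With a base64 encoder, an easy way to crash the process is to hand the C side an output buffer that is too small. Below is a minimal sketch in C of the size arithmetic a caller could do first. The function names (`b64_encoded_len`, `b64_output_fits`, `b64_decoded_max_len`) are made up for illustration and are not part of aklomp/base64. The sketch also assumes the encoder writes standard padded output with no line breaks, that is, 4 characters for every 3 input bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stores in *out the number of padded base64
 * characters produced for n input bytes, i.e. 4 * ceil(n / 3).
 * Returns false if that size would overflow size_t. */
bool b64_encoded_len(size_t n, size_t *out)
{
    assert(out != NULL);
    size_t groups = n / 3 + (n % 3 != 0);  /* ceil(n / 3), overflow-free */
    if (groups > SIZE_MAX / 4)
        return false;
    *out = groups * 4;
    return true;
}

/* Hypothetical pre-call check: is an output buffer of outcap bytes
 * large enough to receive the encoding of srclen input bytes? */
bool b64_output_fits(size_t srclen, size_t outcap)
{
    size_t need;
    return b64_encoded_len(srclen, &need) && outcap >= need;
}

/* Hypothetical helper: upper bound on the decoded size of n base64
 * characters, 3 bytes per (possibly incomplete) group of 4. */
size_t b64_decoded_max_len(size_t n)
{
    return (n / 4 + (n % 4 != 0)) * 3;
}
```

For example, 5 input bytes need 8 output bytes, so `b64_output_fits(5, 8)` is true and `b64_output_fits(5, 7)` is false. The same arithmetic carries over to the LuaJIT side: work out the length, allocate that many bytes, and only then call into the library.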
