Google's ALBERT Model: V2 and a Chinese Version Arrive, No. 2 on GitHub Trending

Date: 2023-03-14 19:37:35 Tech Watch

This article is reprinted with permission from the AI media outlet QbitAI (量子位, WeChat public account ID: QbitAI). For reprint requests, please contact the source.

With 18 times fewer parameters than BERT, it still outperforms it. That is ALBERT, the lightweight BERT model Google released not long ago. It also swept the major leaderboards, setting a new state of the art (SOTA) on the SQuAD and RACE benchmarks.

Google has now open-sourced a Chinese version and Version 2, and the project ranked second on GitHub trending.

ALBERT v2: performance improves again

In this version, the "no dropout", "additional training data", and "long training time" strategies are applied to all models. Compared with the first-generation ALBERT, the results are as follows:

[Table image not preserved: ALBERT v1 vs. v2 results]

For ALBERT-base, ALBERT-large, and ALBERT-xlarge, v2 is much better than v1, which shows the importance of applying the three strategies above.

On average, ALBERT-xxlarge v2 is slightly worse than v1, for two reasons:

1. Training for an additional 1.5M steps did not bring a significant improvement (the only difference between the two models is training for 1.5M steps versus 3M steps).
2. For v1, a small hyperparameter search was run over the parameter sets given by BERT, RoBERTa, and XLNet. For v2, the v1 parameters were simply reused, except for RACE, where a learning rate of 1e-5 and 0 ALBERT DR (the dropout rate for ALBERT during fine-tuning) were used.

Overall, ALBERT is a lightweight version of BERT that uses parameter-reduction techniques to allow large-scale configurations and overcome earlier memory limitations.

ALBERT's performance on the GLUE benchmark, using a single-model setup:

[Table image not preserved: GLUE results]

ALBERT-xxl's performance on the SQuAD and RACE benchmarks, using a single-model setup:

[Table image not preserved: SQuAD and RACE results]

Chinese model downloads

Base: https://storage.googleapis.com/albert_models/albert_base_zh.tar.gz
Large: https://storage.googleapis.com/albert_models/albert_large_zh.tar.gz
XLarge: https://storage.googleapis.com/albert_models/albert_xlarge_zh.tar.gz
Xxlarge: https://storage.googleapis.com/albert_models/albert_xxlarge_zh.tar.gz

ALBERT v2 downloads

Base
    [Tar file]: https://storage.googleapis.com/albert_models/albert_base_v2.tar.gz
    [TF-Hub]: https://tfhub.dev/google/albert_base/2
Large
    [Tar file]: https://storage.googleapis.com/albert_models/albert_large_v2.tar.gz
    [TF-Hub]: https://tfhub.dev/google/albert_large/2
XLarge
    [Tar file]: https://storage.googleapis.com/albert_models/albert_xlarge_v2.tar.gz
    [TF-Hub]: https://tfhub.dev/google/albert_xlarge/2
Xxlarge
    [Tar file]: https://storage.googleapis.com/albert_models/albert_xxlarge_v2.tar.gz
    [TF-Hub]: https://tfhub.dev/google/albert_xxlarge/2

The v1 pre-trained models are also available, including through TF-Hub:

Base
    [Tar file]: https://storage.googleapis.com/albert_models/albert_base_v1.tar.gz
    [TF-Hub]: https://tfhub.dev/google/albert_base/1
Large
    [Tar file]: https://storage.googleapis.com/albert_models/albert_large_v1.tar.gz
    [TF-Hub]: https://tfhub.dev/google/albert_large/1
XLarge
    [Tar file]: https://storage.googleapis.com/albert_models/albert_xlarge_v1.tar.gz
    [TF-Hub]: https://tfhub.dev/google/albert_xlarge/1
Xxlarge
    [Tar file]: https://storage.googleapis.com/albert_models/albert_xxlarge_v1.tar.gz
    [TF-Hub]: https://tfhub.dev/google/albert_xxlarge/1

Example usage of the TF-Hub module:

    import tensorflow_hub as hub  # hub.Module is the TF1-style Hub API

    # is_training, input_ids, input_mask and segment_ids are supplied by the caller.
    tags = set()
    if is_training:
      tags.add("train")
    albert_module = hub.Module("https://tfhub.dev/google/albert_base/1", tags=tags,
                               trainable=True)
    albert_inputs = dict(
        input_ids=input_ids,
        input_mask=input_mask,
        segment_ids=segment_ids)
    albert_outputs = albert_module(
        inputs=albert_inputs,
        signature="tokens",
        as_dict=True)

    # If you want to use the token-level output, use
    # albert_outputs["sequence_output"] instead.
    output_layer = albert_outputs["pooled_output"]

Pre-training instructions

To pre-train ALBERT, use run_pretraining.py:

    pip install -r albert/requirements.txt
    python -m albert.run_pretraining \
        --input_file=... \
        --output_dir=... \
        --init_checkpoint=... \
        --albert_config_file=... \
        --do_train \
        --do_eval \
        --train_batch_size=4096 \
        --eval_batch_size=64 \
        --max_seq_length=512 \
        --max_predictions_per_seq=20 \
        --optimizer='lamb' \
        --learning_rate=.00176 \
        --num_train_steps=125000 \
        --num_warmup_steps=3125 \
        --save_checkpoints_steps=5000

Fine-tuning on GLUE

To fine-tune and evaluate on GLUE, see the run_glue.sh file in the project. Lower-level use cases may want to call the run_classifier.py script directly. run_classifier.py handles fine-tuning and evaluation for the individual GLUE benchmark tasks. For example, for MNLI:

    pip install -r albert/requirements.txt
    python -m albert.run_classifier \
        --vocab_file=... \
        --data_dir=... \
        --output_dir=... \
        --init_checkpoint=... \
        --albert_config_file=... \
        --spm_model_file=... \
        --do_train \
        --do_eval \
        --do_predict \
        --do_lower_case \
        --max_seq_length=128 \
        --optimizer=adamw \
        --task_name=MNLI \
        --warmup_step=1000 \
        --learning_rate=3e-5 \
        --train_step=10000 \
        --save_checkpoints_steps=100 \
        --train_batch_size=128

The default flags for each GLUE task can be found in run_glue.sh.

To fine-tune starting from a TF-Hub module, set the following flag:

    --albert_hub_module_handle=https://tfhub.dev/google/albert_base/1

After evaluation, the script should report output like the following:

    ***** Eval results *****
      global_step = ...
      loss = ...
      masked_lm_accuracy = ...
      masked_lm_loss = ...
      sentence_order_accuracy = ...
      sentence_order_loss = ...

Fine-tuning on SQuAD

To fine-tune and evaluate a pre-trained model on SQuAD v1, use the run_squad_v1.py script:

    pip install -r albert/requirements.txt
    python -m albert.run_squad_v1 \
        --albert_config_file=... \
        --vocab_file=... \
        --output_dir=... \
        --train_file=... \
        --predict_file=... \
        --train_feature_file=... \
        --predict_feature_file=... \
        --predict_feature_left_file=... \
        --init_checkpoint=... \
        --spm_model_file=... \
        --do_lower_case \
        --max_seq_length=384 \
        --doc_stride=128 \
        --max_query_length=64 \
        --do_train=true \
        --do_predict=true \
        --train_batch_size=48 \
        --predict_batch_size=8 \
        --learning_rate=5e-5 \
        --num_train_epochs=2.0 \
        --warmup_proportion=.1 \
        --save_checkpoints_steps=5000 \
        --n_best_size=20 \
        --max_answer_length=30

For SQuAD v2, use the run_squad_v2.py script:

    pip install -r albert/requirements.txt
    python -m albert.run_squad_v2 \
        --albert_config_file=... \
        --vocab_file=... \
        --output_dir=... \
        --train_file=... \
        --predict_file=... \
        --train_feature_file=... \
        --predict_feature_file=... \
        --predict_feature_left_file=... \
        --init_checkpoint=... \
        --spm_model_file=... \
        --do_lower_case \
        --max_seq_length=384 \
        --doc_stride=128 \
        --max_query_length=64 \
        --do_train \
        --do_predict \
        --train_batch_size=48 \
        --predict_batch_size=8 \
        --learning_rate=5e-5 \
        --num_train_epochs=2.0 \
        --warmup_proportion=.1 \
        --save_checkpoints_steps=5000 \
        --n_best_size=20 \
        --max_answer_length=30

Links

GitHub project: https://github.com/google-research/ALBERT
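
The download links in this article appear to follow a regular naming pattern, so a small helper can build them instead of copying each URL by hand. The following is a minimal sketch under that assumption: the function names (albert_tar_url, albert_hub_handle, download_and_extract) are hypothetical and not part of the ALBERT project, and the URL patterns are inferred from the links listed here rather than from any documented API.

```python
import tarfile
import urllib.request
from pathlib import Path

# Assumed URL prefixes, taken from the links listed in the article.
BASE_URL = "https://storage.googleapis.com/albert_models"
HUB_URL = "https://tfhub.dev/google"
SIZES = ("base", "large", "xlarge", "xxlarge")


def albert_tar_url(size: str, variant: str) -> str:
    """Build a tarball URL. variant is "v1", "v2", or "zh" (Chinese)."""
    if size not in SIZES:
        raise ValueError(f"unknown size: {size!r}")
    if variant not in ("v1", "v2", "zh"):
        raise ValueError(f"unknown variant: {variant!r}")
    return f"{BASE_URL}/albert_{size}_{variant}.tar.gz"


def albert_hub_handle(size: str, version: int) -> str:
    """Build a TF-Hub handle. Only versions 1 and 2 are listed in the article."""
    if size not in SIZES:
        raise ValueError(f"unknown size: {size!r}")
    if version not in (1, 2):
        raise ValueError(f"unknown version: {version!r}")
    return f"{HUB_URL}/albert_{size}/{version}"


def download_and_extract(url: str, dest_dir: str) -> Path:
    """Download a .tar.gz archive into dest_dir and unpack it there."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]  # file name from the URL
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    return dest


if __name__ == "__main__":
    # Only builds and prints URLs; no network access happens here.
    print(albert_tar_url("base", "zh"))
    print(albert_hub_handle("base", 2))
```

Calling download_and_extract(albert_tar_url("base", "zh"), "models/albert_base_zh") would fetch and unpack the Chinese base model, assuming the link is still live. The extracted directory should then contain the files to pass to flags such as --init_checkpoint and --albert_config_file in the commands above.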
