github.com/wbrown/gpt_bpe@v0.0.0-20250709161131-1571a6e8ad2d/resources/data/llama3-tokenizer

config.json
special_tokens_map.json
tokenizer.json
tokenizer_config.json
vocab.bpe