Cyberspace of Shujun LI
Large AI Models (LLMs, LVMs, etc.)
Leaderboards
(
Leaderboards)
LMArena
(
Leaderboard)
llm-stats.com
(
Leaderboards)
Scale AI's SEAL LLM Leaderboards
LiveBench: A Challenging, Contamination-Free LLM Benchmark (ICLR'2025)
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers (arXiv.org 2025)
Social Bias Leaderboard
LLM-Leaderboard (community-based)
CanAiCode Leaderboard
Big Code Models Leaderboard
Massive Text Embedding Benchmark (MTEB) leaderboard
Berkeley Function-Calling Leaderboard
UGI (Uncensored General Intelligence) Leaderboard
Open ASR Leaderboard
LLM APIs and Protocols
OpenRouter

Jun Siang Cheah's Free LLM API resources

(
@GitHub)
Multi-Modal LLMs
Google AI Studio
Pure Textual LLMs
chat.z.ai (GLM)
X-Master: Can We Lead on Humanity’s Last Exam? (arXiv.org 2025)
LVMs
AIGCBench
Ziqi Huang's Awesome Evaluation of Visual Generation
Google's Nano Banana Pro 🍌
Google's Imagen 4
OpenAI's DALL·E 3
Stability AI
(
Stable Diffusion,
Visual ChatGPT)
Midjourney
ByteDance Seedream 4.0
ByteDance's MagicArena
ImagineArt
Bing Image Creator
Dream by WOMBO
AI+GUI
UI-TARS: Pioneering Automated GUI Interaction with Native Agents (2025)
UI-TARS Desktop (2025)
Large AI Models vs Cyber Security, Safety and Privacy
Detection of LLM-Generated Texts
Awesome papers on LLMs detection
BUST: Benchmark for the Evaluation of System Detectors of LLM-Generated Text (NAACL 2024)
A Survey of Attributions for Large Language Models (2023)
Factcheck-GPT (2023)
DetectGPT (2022)
Large AI Models vs Safety
Zhenhong Zhou's Awesome LLM-Safety
Conversational AI groups from Tsinghua University's AISafetyLab
SafetyPrompts.com: A Living Catalogue of Open Datasets for LLM Safety (arXiv.org 2024-25)
Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey (arXiv.org 2024-25)
LLM Conversation Safety (NAACL 2024)
OpenRedTeaming (arXiv.org 2024)
(
Safety Datasets used in LibrAI Leaderboard)
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal (arXiv.org 2024)
(
GitHub)
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset (NeurIPS 2023)
Large AI Models for Cyber Security
Thomas Roccia's Awesome GPTs (Agents) for Cybersecurity
Jensen Liu's Awesome-LLM4Security
cckuailong's Awesome GPT + Security
LLM Hacker's Handbook (Forces Unseen)
GPTSecurity.info (in Chinese)

(
Cybersecurity Benchmarks:
CyberSecEval 4,
CyberSOCEval,
AutoPatchBench;
Meta Llama Guard 4;
Prompt Guard 2;
LlamaFirewall)
NVIDIA's garak (LLM vulnerability scanner)
Microsoft ExCyTIn-Bench: Evaluating LLM agents on Cyber Threat Investigation
DeepTeam: An open-source framework to red team LLM systems
Benchmarking Benchmark Leakage in Large Language Models
CyberAlbSecOP's Awesome_GPT_Super_Prompting

(
Inspect Evals: a repository of community contributed LLM evaluations)
Cybench (ICLR 2025)
BountyBench (arXiv.org 2025)
Cybersecurity AI Benchmark (CAIBench): Meta-benchmark for evaluating Cybersecurity AI agents (arXiv.org 2025)
CyberGym (arXiv.org 2025)
H-CoT + Malicious-Educator Benchmark (arXiv.org 2025)
Foundation-Sec-8B (Llama-3.1-FoundationAI-SecurityLLM-base-8B) (arXiv.org 2025)

(
GitHub;
alias1: Unrestricted Cybersecurity LLM)
PentestGPT (USENIX Security 2024)
When LLMs Meet Cybersecurity: A Systematic Literature Review (Cybersecurity 2024)
Malla: Demystifying Real-world Large Language Model Integrated Malicious Services (USENIX Security 2024)
JailbreakZoo (arXiv.org 2024)
Apart Research's Catastrophic Cyber Capabilities Benchmark (3CB) (arXiv.org 2024)
CySecBERT (ACMTOPS 2024)
CyBERTuned (NAACL Findings 2024)
HackAPrompt (EMNLP 2023)
Other Resources and Tools
Hannibal046's Awesome-LLM
Awesome Machine Generated Text
A Survey on Language, Multimodal, and Scientific GPT Models: Examining User-Friendly and Open-Sourced Large GPT Models
Prompt Engineering Guide
AI Alignment: A Comprehensive Survey (PKU 2023-)
Min Woo (Daniel) Park's Open-LLM-datasets
PleIAs's Common Corpus
PleIAs's OpenCulture
PleIAs's Toxic Commons
PleIAs's Finance Commons
Awesome-Medical-Healthcare-Dataset-For-LLM

(
GitHub)
Explainable and Interpretable AI (XAI)
Foundation Model Transparency Index (FMTI)
Fair-LLM-Benchmark ("Bias and Fairness in Large Language Models: A Survey" 2024)
Aequitas: Bias Auditing & "Correction" Toolkit
AI Fairness 360 (AIF360)
(
KDD 2023 Tutorial)
Data and code of the paper "Dissecting racial bias in an algorithm used to manage the health of populations" (Science Magazine 2019)

(
XAITK-Saliency)
XNLP: XAI for Natural Language Processing

(
GitHub,
Analysis360: Open Implementations of LLM Analyses)
Non-LLM Natural Language Processing and Computational Linguistics
General Tools:
NLTK (Natural Language Toolkit)
spaCy

(
GitHub)
Natural
CogCompNLP
Hugging Face
(
datasets;
Write With Transformer)
Talk to Transformer (InferKit online demo)
quanteda: Quantitative Analysis of Textual Data in R
(
GitHub)
gensim – Topic Modelling in Python
Transformer-XL
bert-as-service
BERTweet: A pre-trained language model for English Tweets (EMNLP 2020)
RNNTagger
TreeTagger
Python Word Segmentation
Word Ninja
SymSpell
(
Python port: symspellpy)
Language Style Transfer (NIPS 2017)
GeoTxt (Transactions in GIS 2019)
Edinburgh Geoparser
GeoPy
XAI for Natural Language Processing (AACL-IJCNLP 2020)
Google's BERT

(
GitHub)

(
WuDaoCorpora;
GitHub,
GLM,
CLM;
BMInf)
BERTective (EACL 2021)
mauve-experiments (NeurIPS 2021)
SemEval: International Workshop on Semantic Evaluation
Chinese NLP Resources:
Pretrained model repository
Baidu ERNIE
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab
(
Pengcheng PanGu-α)
awesome-chinese-nlp (Guan Wang)
Jieba Chinese word segmentation
THUAIPoet (Jiuge) research group
(
Jiuge V2.0;
BERT-CCPoem,
MixPoet @ AAAI 2020,
Stylistic Poetry @ EMNLP 2018,
WMPoetry @ IJCAI 2018;
CCPM = Chinese Classical Poetry Matching Dataset,
Other datasets)
Xiaoice, the young girl poet
tensorflow_poems / LiBai AI Composer / automatic classical Chinese poetry composer
Small Chinese corpus datasets
Datasets:
Nicolas Iderhoff's nlp-datasets
WordNet
Wikimedia Downloads

(
Frequency lists)
Amazon MASSIVE dataset
WebNLG Challenge
Wiktextract
(
data @ kaikki.org)
Use of corpora in translation studies @ Centre for Translation Studies, University of Leeds
OpenLexicon
Lexique
(
WorldLex: Blog, Twitter and Newspapers Word Frequencies for 66 languages)
Datasets of Automatic Keyphrase Extraction @ LIAAD, INESCTEC
KPTimes Corpus @ INLG 2019
dewiki-wordrank
OAGSX Title Generation Dataset
OAGKX Keyword Generation Dataset
GeoNames
Awesome LLM-generated Text Detection (2023)
Awesome papers on LLMs detection (2023)
M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection (2023)
Privacy-related resources:

(
PrivaSeer Corpus @ ACL 2021,
PrivBERT @ ACL 2021)
Federated Learning
General Resources:
Awesome-Federated-Learning
The Federated Learning Portal
Open-source Tools:
TensorFlow Federated (TFF)
(
GitHub)
NVIDIA Clara
FedML: A Research Library and Benchmark for Federated Machine Learning

(
GitHub)

(
Federated Learning Research at Webank AI)
Commercial Solutions:
Disclaimer
All information on this website is for personal use, and Shujun Li is not responsible for any misuse of the information provided. The links listed on any page do not indicate personal recommendations for any purpose to visitors of this website, as each link is included for a different reason meaningful for Shujun Li's personal use. Logo files of websites are used to facilitate recognition of the external links, and do not represent endorsement by the corresponding websites of the content of this website. If the use of any logo file violates the copyrights or policies of any individuals or organisations, please contact Shujun Li so that he can remove the logo file or the whole link. Please also help report broken links and broken images on this website.