Cyberspace of Shujun LI

Shortcuts

Large AI Models (LLMs, LVMs, etc.)

Leaderboards

(Leaderboards) Artificial Analysis LMArena CompassRank 司南 (Leaderboard) llm-stats.com (Leaderboards) EvalPlus Leaderboard (on coding related tasks) Scale AI's SEAL LLM Leaderboards Vellum LLM Leaderboard Aider LLM Leaderboards AlpacaEval Leaderboard: An Automatic Evaluator for Instruction-following Language Models LiveBench: A Challenging, Contamination-Free LLM Benchmark (ICLR 2025) MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers (arXiv.org 2025) Social Bias Leaderboard LLM-Leaderboard (community-based) Hugging Face's Open LLM Leaderboard CanAiCode Leaderboard Big Code Models Leaderboard Massive Text Embedding Benchmark (MTEB) leaderboard Berkeley Function-Calling Leaderboard UGI (Uncensored General Intelligence) Leaderboard Open ASR Leaderboard

LLM APIs and Protocols

OpenRouter Ollama Groq Volcano Engine 火山引擎 (Volcano Ark 火山方舟) Jun Siang Cheah's Free LLM API resources Model Context Protocol (MCP) (@GitHub)

Multi-Modal LLMs

Google Gemini Google AI Studio ChatGPT Qwen Chat Dola Microsoft Copilot Meta Llama

Pure Textual LLMs

DeepSeek Chat Anthropic Claude Mistral chat.z.ai (GLM) Intern 书生大模型 LLM360 X-Master: Can We Lead on Humanity’s Last Exam? (arXiv.org 2025)

LVMs

AIGCBench Ziqi Huang's Awesome Evaluation of Visual Generation Google's Nano Banana Pro 🍌 Google's Imagen 4 OpenAI's DALL·E 3 Stability AI (Stable Diffusion, Visual ChatGPT) Midjourney ByteDance Seedream 4.0 ByteDance's MagicArena ImagineArt HiDream Vivago Bing Image Creator Dream by WOMBO Craiyon SVG.IO

AI+GUI

UI-TARS: Pioneering Automated GUI Interaction with Native Agents (2025) UI-TARS Desktop (2025) Midscene.js

Large AI Models vs Cyber Security, Safety and Privacy

Detection of LLM-Generated Texts

Awesome papers on LLMs detection RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors (ACL 2024) BUST: Benchmark for the Evaluation of System Detectors of LLM-Generated Text (NAACL 2024) A Survey of Attributions for Large Language Models (2023) Factcheck-GPT (2023) DetectGPT (2022)

Large AI Models vs Safety

Zhenhong Zhou's Awesome LLM-Safety Safety at Scale: A Comprehensive Survey of Large Model Safety  (arXiv.org 2025) Conversational AI groups from Tsinghua University's AISafetyLab SafetyPrompts.com: A Living Catalogue of Open Datasets for LLM Safety (arXiv.org 2024-25) Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey (arXiv.org 2024-25) LLM Conversation Safety (NAACL 2024) OpenRedTeaming (arXiv.org 2024) (Safety Datasets used in LibrAI Leaderboard) HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal (arXiv.org 2024) (GitHub) BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset (NeurIPS 2023)

Large AI Models for Cyber Security

Thomas Roccia's Awesome GPTs (Agents) for Cybersecurity Jensen Liu's Awesome-LLM4Security cckuailong's Awesome GPT + Security LLM Hacker's Handbook (Forces Unseen) GPTSecurity.info (in Chinese) Meta AI's Purple Llama (Cybersecurity Benchmarks: CyberSecEval 4, CyberSOCEval, AutoPatchBench; CodeShield; Meta Llama Guard 4; Prompt Guard 2; LlamaFirewall) NVIDIA's garak (LLM vulnerability scanner) Microsoft ExCyTIn-Bench: Evaluating LLM agents on Cyber Threat Investigation DeepTeam: An open-source framework to red team LLM systems Benchmarking Benchmark Leakage in Large Language Models Tsinghua CCS Lab's Awesome-LM-SSP CyberAlbSecOP's Awesome_GPT_Super_Prompting UK AISI's Inspect: An open-source framework for large language model evaluations (Inspect Evals: a repository of community contributed LLM evaluations) Safety Misalignment Against Large Language Models (NDSS 2025) Cybench (ICLR 2025) BountyBench (arXiv.org 2025) Cybersecurity AI Benchmark (CAIBench): Meta-benchmark for evaluating Cybersecurity AI agents (arXiv.org 2025) CyberGym (arXiv.org 2025) H-CoT + Malicious-Educator Benchmark (arXiv.org 2025) Foundation-Sec-8B (Llama-3.1-FoundationAI-SecurityLLM-base-8B) (arXiv.org 2025) Alias Robotics (Cybersecurity AI (CAI): A lightweight, open-source framework that empowers security professionals to build and deploy AI-powered offensive and defensive automation, GitHub; alias1: Unrestricted Cybersecurity LLM) PentestGPT (USENIX Security 2024) Awesome Red-Teaming LLMs (arXiv.org 2024) When LLMs Meet Cybersecurity: A Systematic Literature Review (Cybersecurity 2024) Malla: Demystifying Real-world Large Language Model Integrated Malicious Services (USENIX Security 2024) JailbreakZoo (arXiv.org 2024) Apart Research's Catastrophic Cyber Capabilities Benchmark (3CB) (arXiv.org 2024) CyberMetric Dataset (IEEE CSR 2024) CySecBERT (ACM TOPS 2024) CyBERTuned (NAACL Findings 2024) HackAPrompt (EMNLP 2023)

Other Resources and Tools

Hannibal046's Awesome-LLM Awesome Machine Generated Text A Survey on Language, Multimodal, and Scientific GPT Models: Examining User-Friendly and Open-Sourced Large GPT Models LLMDataHub: Awesome Datasets for LLM Training Prompt Engineering Guide Pretrain, Prompt, Predict LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models (ACL 2024) AI Alignment: A Comprehensive Survey (PKU 2023-) Min Woo (Daniel) Park's Open-LLM-datasets PleIAs's Common Corpus PleIAs's OpenCulture PleIAs's Toxic Commons PleIAs's Finance Commons Awesome-Medical-Healthcare-Dataset-For-LLM A Practical Guide for Medical Large Language Models Stanford STORM and Co-STORM (GitHub)

Explainable and Interpretable AI (XAI)

AI Risk Repository @ MIT Foundation Model Transparency Index (FMTI) Fair-LLM-Benchmark ("Bias and Fairness in Large Language Models: A Survey" 2024) Aequitas: Bias Auditing & "Correction" Toolkit AI Fairness 360 (AIF360) (KDD 2023 Tutorial) Data and code of the paper "Dissecting racial bias in an algorithm used to manage the health of populations" (Science Magazine 2019) XAITK (XAITK-Saliency) Thermostat: A Large Collection of NLP Model Explanations and Analysis Tools XNLP: XAI for Natural Language Processing LLM360: Fully Transparent Open-Source LLMs (GitHub, Analysis360: Open Implementations of LLM Analyses)

Non-LLM Natural Language Processing and Computational Linguistics

General Tools: NLTK (Natural Language Toolkit) Stanford CoreNLP (Java) TextBlob spaCy (textacy: NLP, before and after spaCy) PyTorch-NLP (GitHub) NLP.js Natural Apache OpenNLP CogCompNLP Hugging Face (datasets; Write With Transformer) Talk to Transformer (InferKit online demo) quanteda: Quantitative Analysis of Textual Data in R (GitHub) Linguistic Inquiry and Word Count (LIWC) gensim – Topic Modelling in Python BERTopic Transformers Transformer-XL bert-as-service BERTweet: A pre-trained language model for English Tweets (EMNLP 2020) COVID-Twitter-BERT (CT-BERT) RNNTagger TreeTagger Python Word Segmentation Word Ninja SymSpell (Python port: symspellpy) Language Style Transfer (NIPS 2017) GeoTxt (Transactions in GIS 2019) Edinburgh Geoparser GeoPy XAI for Natural Language Processing (AACL-IJCNLP 2020) GENIE (a leaderboard for natural language generation tasks) Google's BERT Microsoft DeepSpeed (GitHub) 悟道 (Wudao) (WuDaoCorpora; GitHub, GLM, CLM; BMInf) PaddleNLP BERTective (EACL 2021) mauve-experiments (NeurIPS 2021) SemEval: International Workshop on Semantic Evaluation

Chinese NLP Resources: Pre-trained model repository (预训练模型仓库) Baidu ERNIE (百度ERNIE) Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab (鹏程.盘古α / PanGu-α) Chinese BERT-wwm model series (中文BERT-wwm系列模型) awesome-chinese-nlp (Guan Wang) "Jieba" Chinese word segmentation (“结巴”中文分词) THUAIPoet (九歌 / Jiuge) research group (九歌 / Jiuge V2.0; BERT-CCPoem, MixPoet @ AAAI 2020, Stylistic Poetry @ EMNLP 2018, WMPoetry @ IJCAI 2018; CCPM = Chinese Classical Poetry Matching Dataset, Other datasets) Girl poet Xiaoice (少女诗人小冰) tensorflow_poems / LiBai AI Composer / automatic Chinese classical poetry composition bot (中文古诗自动作诗机器人) Small Chinese corpus datasets (中文语料小数据)

Datasets: Nicolas Iderhoff's nlp-datasets WordNet Wikimedia Downloads Wiktionary (Frequency lists) WordNet Amazon MASSIVE dataset WebNLG Challenge Wiktextract (data @ kaikki.org) British National Corpus Use of corpora in translation studies @ Centre for Translation Studies, University of Leeds OpenLexicon Lexique (WorldLex: Blog, Twitter and Newspapers Word Frequencies for 66 languages) Datasets of Automatic Keyphrase Extraction @ LIAAD, INESCTEC KPTimes Corpus @ INLG 2019 dewiki-wordrank OAGSX Title Generation Dataset OAGKX Keyword Generation Dataset GeoNames Awesome LLM-generated Text Detection (2023) Awesome papers on LLMs detection (2023) M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection (2023)

Privacy-related resources: PrivaSeer (PrivaSeer Corpus @ ACL 2021, PrivBERT @ ACL 2021)

Federated Learning

General Resources: Google AI's Federated Learning online comic Awesome-Federated-Learning The Federated Learning Portal

Open-source Tools: TensorFlow Federated (TFF) (GitHub) NVIDIA Clara FedML: A Research Library and Benchmark for Federated Machine Learning FedML-AI (GitHub) FedAI WeBank AI's Federated AI Ecosystem (Federated Learning Research at WeBank AI)

Commercial Solutions: Owkin Rhino Health

Disclaimer

All information on this website is for personal use, and Shujun Li is not responsible for any misuse of the information provided. The links listed on any page do not indicate personal recommendations for any purpose to visitors of this website, as each link is included for a different reason meaningful for Shujun Li's personal use. Logo files of websites are used to facilitate recognition of the external links, and do not represent endorsement by the corresponding websites of the content of this website. If the use of any logo file violates the copyrights or policies of any individuals or organisations, please contact Shujun Li so that he can remove the logo file or the whole link. Please also help report broken links and broken images on this website.