Cyberspace of Shujun LI
Shujun's Publications
- Çağrı B. Aslan, Shujun Li, Fatih V. Çelebi and Hao Tian, "The World of Defacers: Looking through the Lens of Their Activities on Twitter," IEEE Access, Volume 8, pp. 204132-204143, IEEE, 2020 © Authors
- Keenan Jones, Jason R. C. Nurse and Shujun Li, "Behind the Mask: A Computational Study of Anonymous’ Presence on Twitter," in Proceedings of 14th International Conference on Web and Social Media (ICWSM 2020), pp. 327-338, 2020 (one of eight Honourable Mentions for Best Paper Award; acceptance rate of full papers: 72/298=24%) © AAAI
- Rahime Belen Sağlam, Çağrı B. Aslan, Shujun Li, Lisa Dickson and Ganna Pogrebna, "A Data-Driven Analysis of Blockchain Systems' Public Online Communications on GDPR," in Proceedings of 2020 IEEE International Conference on Decentralized Applications and Infrastructures (IEEE DAPPS 2020), pp. 22-31, 2020 © IEEE
- Zeynep Chousein, Hacı Yakup Tetik, Rahime Belen Sağlam, Abdullah Bülbül and Shujun Li, "Tension between GDPR and Public Blockchains: A Data-Driven Analysis of Online Discussions," in Proceedings of 13th International Conference on Security of Information and Networks (SINCONF 2020), Article No. 17, 8 pages, 2020 © ACM
- Kübra Aydin, Rahime Belen Sağlam, Shujun Li and Abdullah Bülbül, "When GDPR Meets CRAs (Credit Reference Agencies): Looking through the Lens of Twitter," in Proceedings of 13th International Conference on Security of Information and Networks (SINCONF 2020), Article No. 16, 8 pages, 2020 © ACM
- Yang Lu and Shujun Li, "From Data Flows to Privacy Issues: A User-Centric Semantic Model for Representing and Discovering Privacy Issues," in Proceedings of the 53rd Hawaii International Conference on System Sciences (HICSS 2020), pp. 6528-6537, University of Hawaiʻi at Mānoa, 2020, DOI:10.24251/HICSS.2020.799 © Authors
- Çağrı B. Aslan, Rahime Belen Sağlam and Shujun Li, "Automatic Detection of Cyber Security Related Accounts on Online Social Networks: Twitter as an Example," in Proceedings of 9th International Conference on Social Media and Society (SMSociety 2018), pp. 236-240, ACM, 2018 © Authors
- Nouf Aljaffan, Haiyue Yuan and Shujun Li, "PSV (Password Security Visualizer): From Password Checking to User Education," in Human Aspects of Information Security, Privacy and Trust: 5th International Conference, HAS 2017, Held as Part of HCI International 2017, Vancouver, BC, Canada, July 9-14, 2017, Proceedings, Lecture Notes in Computer Science, vol. 10292, pp. 191-211, 2017 © Springer
e-Data and Data Analytics Services:
Wolfram Data Repository
The GDELT Project: A Global Database of Society

data.europa.eu
JRC (Joint Research Centre) Data Catalogue
data.police.uk
mldata (machine learning data set repository)
MLcomp datasets
Google Dataset Search
Google Public Data Explorer
Common Crawl
Elicit: The AI Research Assistant

(Free Company Dataset, Largest US Employers by Metro Dataset, Free Job Title Dataset, Free Engineering Skills Dataset)
Awesome Public Datasets
Network Repository: An Interactive Scientific Network Data Repository
University Domains and Names Data List & API
D3.js Graph Gallery
The Data Visualisation Catalogue
informationisbeautiful.net
DataGenetics

(GitLab)
Newspaper3k: Article scraping & curation
Otter.ai

(DBpedia MARVIN Release Bot; DBpedia Information Extraction Framework; DBpedia Forum)
Personal Data Management Platforms:

(Documentation for Developers)
openPDS/SafeAnswers: Personal Data with Privacy
False Information
Organizations, Tools and Resources:
Journalism, 'Fake News' and Disinformation: A Handbook for Journalism Education and Training (UNESCO)
Combating the disinfodemic: Working for truth in the time of COVID-19 (UNESCO)
Combating the Disinfodemic: Working for truth in the time of COVID-19 (UNESCO and UNITAR Divisions for Multilateral Diplomacy and Prosperity's mobile e-learning course)
WHO's Information Network for Epidemics (EPI-WIN)
W3C Credible Web Community Group (GitHub, Credible Web CG Area-2 (Corroboration-Based Strategies))

(A Short Guide to the History of ‘Fake News’ and Disinformation: A New ICFJ Learning Module)

(Digital News Innovation (DNI) Fund, GNI Innovation Challenges)
Truth Decay @ RAND (Fighting Disinformation Online: A Database of Web Tools)

(Firefox extension - NewsGuard, Google Chrome extension - NewsGuard, Firefox extension - HealthGuard, Android app - NewsGuard; COVID-19 Misinformation Resources, Coronavirus Misinformation Tracking Center)
CheckStep
misinformation datasets @ data.world
FakeNewsTracker
Google Fact Check (Google Fact Check Tools API, Google Fact Check Explorer, Google Fact Check Markup Tool)
Fact-check Feed @ fact.pubmedia.us
SMAT: The Social Media Analysis Toolkit
Verifi!
News Landscape (NELA) Toolkit

Fake News Challenge (FNC) (Stance Detection dataset for FNC-1)

(International Fact-Checking Network - IFCN, IFCN Code of Principles; #CoronaVirusFacts Alliance, CoronaVirusFacts/DatosCoronaVirus Alliance Database)
Content Authenticity Initiative (CAI)

BBC Disinformation Watch

(Fact-Checking, The Duke Tech & Check Cooperative)
COVID-19 Misinformation Newsletter @ Programme on Democracy and Technology (DemTech), Oxford University

(Anti-Misinformation Resources: The Catalog)
Misinformation Exposure (Nature Communications 2022)
Arkose Labs (Fake Reviews, Fake Users)
TheReviewIndex
Yelp Open Dataset
Fake Reviews Dataset (Journal of Retailing and Consumer Services 2022)
YelpCHI dataset (ICWSM 2013 and KDD 2015)
YelpZip dataset (KDD 2015 and SIAM SDM 2016)
Yelp-Fraud (Multi-relational Graph Dataset for Yelp Spam Review Detection) (CIKM 2020)
Amazon-Fraud (Multi-relational Graph Dataset for Amazon Fraudulent Account Detection) (CIKM 2020)
Masterpiece Generator
fake-resume-generator
Multimedia False Information:
JPEG Fake Media
Awesome Deepfakes
fake-face-detection: some collected papers and personal notes relevant to Fake Face Detection
DeepFake-o-meter: An open platform integrating state-of-the-art DeepFake detection methods
StyleGAN
StyleGAN2
StyleGAN2-ADA (Official TensorFlow implementation)
StyleGAN2-ADA (Official PyTorch implementation)
DeepFaceLab
This Person Does Not Exist
OpenAI's DALL·E 2 (Stability Generator API @ GitHub)
thiscatdoesnotexist.com
thishorsedoesnotexist.com
thisartworkdoesnotexist.com
thischemicaldoesnotexist.com
thesecatsdonotexist.com
thisanimedoesnotexist.ai
thisponydoesnotexist.net
thiswaifudoesnotexist.net
whichfaceisreal.com
spotdeepfakes.org
generated.photos (datasets, face generator, free generated photos)
DeepFaceDrawing: Deep Generation of Face Images from Sketches (SIGGRAPH 2020) (DeepFaceDrawing-Jittor @ GitHub)
DeepNude-an-Image-to-Image-technology
pix2pix: Image-to-Image Translation with Conditional Adversarial Nets (CVPR 2017) (CycleGAN and pix2pix in PyTorch @ GitHub, original code @ GitHub, Christopher Hesse's interactive demo)
Deepware Scanner (GitHub)
Flickr-Faces-HQ Dataset (FFHQ)
Adversarial Deepfakes (WACV 2021)
DefakeHop (ICME 2021)
DeepFaceLab

(GitHub)
MyFakeApp (based on Faceswap)
Botika
MyVoiceYourFace.com
FaceForensics Benchmark
Partnership on AI's AI and Media Integrity Steering Committee (Deepfake Detection Challenge = DFDC)
Celeb-DF (v2): A New Dataset for DeepFake Forensics (CVPR 2020) (GitHub)
KoDF: A Large-scale Korean DeepFake Detection Dataset (CVPR 2021)
CoMoFoD - Image Database for Copy-Move Forgery Detection
Copy-Move Forgery Database with Similar but Genuine Objects (COVERAGE)
GANDCTAnalysis
MKLab-ITI's image-verification-corpus
Assembler (Google's project)
A corpus of debunked and verified user-generated videos (Online Information Review 2019)
Fake-EmoReact 2021 Challenge @ SocialNLP 2021
EmotionGIF 2020 Challenge @ SocialNLP 2020
NewsCLIPpings: Automatic Generation of Out-of-Context Multimodal Media (EMNLP 2021)
Visual News: Benchmark and Challenges in News Image Captioning (EMNLP 2021)
Other Research Related:
CLEF2020 CheckThat! Lab (Enabling Automatic Identification and Verification of Claims in Social Media)
CLEF2019 CheckThat! Lab
CLEF2018 CheckThat! Lab
FEVER Datasets (scientific claims)
ClaimBuster: Automated Live Fact-checking (ClaimPortal, ICWSM 2020 dataset)
Claim Detection in Social Media via Fusion of Transformer and Syntactic Features (CLEF CheckThat! 2020)
ClaimsKG
claim-rank (RANLP 2017)
Claim Extraction for Scientific Publications
SciFact (GitHub)
Too Many Claims to Fact-Check: Prioritizing Political Claims Based on Check-Worthiness (MAISoN'2020 @ CIKM'2020)
entity-fishing - Entity Recognition and Disambiguation
Full Fact's Fast & Furious Fact Check Challenge (2016)

(Iffy Index of Unreliable Sources, Wayback Workshop)
OSoMe (Observatory on Social Media) @ Network Science Institute (IUNI), Center for Complex Networks and Systems Research (CNetS), Indiana University (Tools and Datasets: Hoaxy®, BotSlayer, CoVaxxy)
Graph-based Fraud Detection Papers and Resources
VoterFraud2020 (@ GitHub, @ Figshare)
FakeNewsNet
Maciej Szpakowski's Fake News Corpus
Fakeddit (GitHub)
Credibilator (Google Chrome extension)
LOCO: the 88-million word language of conspiracy corpus (2021)
Avax (anti-vaccine) tweets dataset (2021)
The COVID-19 Infodemic: Can the Crowd Judge Recent Misinformation Objectively? (SIGIR 2020 + ECIR 2020 + CIKM 2020)
ReCOVery: A Multimodal Repository for COVID-19 News Credibility Research (CIKM 2020)
FakeCovid: Fact Checked data for COVID-19 (ICWSM 2020 workshop)
Dataset for COVID-19 Misinformation on Twitter (2020)
CHECKED: Chinese COVID-19 Fake News Dataset (2020)
Factuality and Bias Prediction of News Media (ACL 2020 + EMNLP 2018)
FakeHealth repository (ICWSM 2020)
FiveThirtyEight's dataset of 3 million Russian troll tweets
Raiders of the Lost Kek: 3.5 Years of Augmented 4chan Posts from the Politically Incorrect Board (ICWSM 2020)
Learning from Fact-checkers (SIGIR 2019)
The Rise of Guardians (SIGIR 2018)
LIAR-PLUS fake news database (FEVER 2018)
LIAR fake news database (ACL 2017)
CREDBANK-data (ICWSM 2015)
Chinese Rumor Data (中文谣言数据) (Scientia Sinica Informationis 2015)
Information Visualization
Tools:
Transparency Vis
More
Tools:
GetOldTweets-python
GetOldTweets-java
GetOldTweets3
Data:
COVID-19 @ Aminer (COVID-19 Open Datasets, dashboard)