The following content is reposted from https://blog.csdn.net/qq_32447301/article/details/79487335
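Finance
- Official data released by the U.S. Bureau of Labor Statistics
- Shanghai Stock Exchange A-share daily data, 1999.12.09 to 2016.06.08, forward-adjusted prices, 1,095 stocks
- Shenzhen Stock Exchange A-share daily data, 1999.12.09 to 2016.06.08, forward-adjusted prices, 1,766 stocks
- Shenzhen ChiNext board daily data, 1999.12.09 to 2016.06.08, forward-adjusted prices, 510 stocks
- Historical forex trading data from the MT4 platform
- Historical forex trading data from the Forex platform
- Several sets of tick-by-tick (Ticks) forex trading data
- US stock news data [Kaggle data]
- US health insurance marketplace data [Kaggle data]
- US financial consumer complaint data [Kaggle data]
- Lending Club online loan default data [Kaggle data]
- Credit card fraud data [Kaggle data]
- Real-time trading data for a financial product [Kaggle data]
- US stock data in XBRL [Kaggle data]
- New York Stock Exchange data [Kaggle data]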
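Transportation
Business
Recommender systems
Healthcare
- Brain MRI data recorded while people recognize objects
- Brain MRI data recorded while people comprehend words
- Cardiac atrium images with annotations for heart disease
- Cell pathology recognition
- FIRE retinal fundus lesion image data
- Food nutrition composition data [Kaggle data]
- EEG brainwave shape data [Kaggle data]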
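Image data
General images
- Visual Genome image data
- Visual7w image data
- COCO image data
- SUFR image data
- ILSVRC 2014 training data (part of ImageNet)
- PASCAL Visual Object Classes 2012 image data
- PASCAL Visual Object Classes 2011 image data
- PASCAL Visual Object Classes 2010 image data
- 80 Million Tiny Image image data [dataset too large; description only]
- ImageNet [dataset too large; description only]
Scene images
Web-tagged images
Human silhouette images
Visual text recognition images
- Street View House Number house-number image data
- MNIST handwritten digit recognition image data
- 3D MNIST digit recognition image data [Kaggle data]
- MediaTeam Document scanned document and content data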
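Images of a specific category of objects
- The well-known annotated cat image data
- Caltech-UCSD Birds200 bird image data
- Stanford Car car image data
- Cars car image data
- MIT Cars car image data
- Stanford Cars car image data
- Food-101 food image data
- 17_Category_Flower image data
- 102_Category_Flower image data
- UCI Folio Leaf image data
- Labeled Fishes in the Wild fish images
- Hotel photos from the US review site Yelp
- CMU-Oxford Sculpture sculpture and statue images
- Oxford-IIIT Pet pet image data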
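Material and texture images
Object classification images
- COIL-20 image data
- COIL-100 image data
- Caltech-101 image data
- Caltech-256 image data
- CIFAR-10 image data
- CIFAR-100 image data
- STL-10 image data
- LabelMe_12_50k image data
- NORB v1.0 image data
- NEC Toy Animal image data
- iCubWorld image classification data
- Multi-class image classification data
- GRAZ image classification data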
Face images
- IMDB-WIKI 500k+ face images with age and gender data
- Labeled Faces in the Wild face data
- Extended Yale Face Database B face data
- Bao Face face data
- DC-IGN paper face data
- 300 Face in Wild image data
- BioID Face face data
- CMU Frontal Face Images
- FDDB_Face Detection Data Set and Benchmark
- NIST Mugshot Identification Database
- Faces in the Wild face data
- CelebA celebrity face image data
- VGG Face face image data
Pose and action images
Fingerprint recognition
Other image data
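Video data
General video
Human action video
- Microsoft Research Action human action video data
- UCF50 Action Recognition data
- UCF101 Action Recognition data
- UT-Interaction human action video data
- UCF iPhone in-motion sensor data
- UCF YouTube human action video data
- UCF Sport human action video data
- UCF-ARG human action video data
- HMDB human action video
- HOLLYWOOD2 human behavior and action video data
- Recognition of human actions action video data
- Motion Capture video data
- SBU Kinect Interaction body-movement video data
Pedestrian detection video
- UCSD Pedestrian video data
- Caltech Pedestrian video data
- ETH pedestrian video data
- INRIA pedestrian video data
- TudBrussels pedestrian video data
- Daimler pedestrian video data
Dense crowd video
Other video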
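Audio data
General audio
- Google Audioset audio data [dataset too large; description only]
Speech recognition
- Sinhala TTS speech data
- TIMIT American English speech recognition data
- LibriSpeech ASR corpus speech data
- Room Impulse Response and Noise speech data
- ALFFA African speech data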
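Natural language processing
- RCV1 English news data
- 20news English news data
- First Quora Release Question Pairs
- JRC Names: proper entity names in many languages
- Multi-Domain Sentiment V2.0
- LETOR information retrieval data
- Yale Youtube Video Text
- Stanford question answering data [Kaggle data]
- US fake news data [Kaggle data]
- NIPS conference paper information data (1987-2016) [Kaggle data]
- 2016 US presidential election debate data [Kaggle data]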
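Social data
- Leaked emails from the Hillary Clinton email controversy
- Boston Airbnb open data [Kaggle data]
- Economic development data for countries worldwide [Kaggle data]
- World university rankings [Kaggle data]
- Chicago crime data (2001-2017) [Kaggle data]
- Significant earthquakes worldwide (1965-2016) [Kaggle data]
- US baby names data [Kaggle data]
- Worldwide shark attacks on humans data [Kaggle data]
- Air crash data since 1908 [Kaggle data]
- 2016 US presidential election data [Kaggle data]
- 2013 US community statistics data [Kaggle data]
- European football player match performance data [Kaggle data]
- US environmental pollution data [Kaggle data]
- US H-1B visa application data [Kaggle data]
- IMDB 5,000 movies data [Kaggle data]
- 2015 flight delays and cancellations data [Kaggle data]
- Homicide reports data [Kaggle data]
- Human resources analytics data [Kaggle data]
- An individual's genome sequence data [Kaggle data]
- Philadelphia (US) crime data [Kaggle data]
- Enron email data [Kaggle data]
- Historical baseball data [Kaggle data]
- United Airlines Twitter user comments data [Kaggle data]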
https://github.com/awesomedata/awesome-public-datasets
Table of Contents
- Agriculture
- Architecture
- Biology
- Chemistry
- Climate+Weather
- ComplexNetworks
- ComputerNetworks
- CyberSecurity
- DataChallenges
- EarthScience
- Economics
- Education
- Energy
- Entertainment
- Finance
- GIS
- Government
- Healthcare
- ImageProcessing
- MachineLearning
- Museums
- NaturalLanguage
- Neuroscience
- Physics
- ProstateCancer
- Psychology+Cognition
- PublicDomains
- SearchEngines
- SocialNetworks
- SocialSciences
- Software
- Sports
- TimeSeries
- Transportation
- eSports
- [Complementary Collections](https://github.com/awesomedata/awesome-public-datasets#complementary-collections)
Agriculture
The global dataset of historical yields for major crops 1981–2016 - The Global Dataset of [...] [Meta]
Hyperspectral benchmark dataset on soil moisture - This dataset was measured in a five-day [...] [Meta]
Lemons quality control dataset - Lemon dataset has been prepared to investigate the [...] [Meta]
Optimized Soil Adjusted Vegetation Index - The IDB is a tool for working with remote sensing [...] [Meta]
U.S. Department of Agriculture's Nutrient Database [Meta]
U.S. Department of Agriculture's PLANTS Database - The Complete PLANTS Checklist is nearly 7 [...] [Meta]
Architecture
Swiss Apartment Models - This dataset contains detailed data on 42,207 apartments (242,257 [...] [Meta]
Biology
1000 Genomes - The 1000 Genomes Project ran between 2008 and 2015, creating the largest [...] [Meta]
ANHIR - Automatic Non-rigid Histological Image Registration (ANHIR) consists of 2D [...] [Meta]
American Gut (Microbiome Project) - The American Gut project is the largest crowdsourced [...] [Meta]
BCNB - There are WSIs of 1058 patients, part of tumor regions are annotated in WSIs. Except [...] [Meta]
Broad Bioimage Benchmark Collection (BBBC) - The Broad Bioimage Benchmark Collection (BBBC) [...] [Meta]
Broad Cancer Cell Line Encyclopedia (CCLE) [Meta]
CIMA - CIMA dataset includes images of 2D histological microscopy tissue slices. [Meta]
Cell Image Library - This library is a public and easily accessible resource database of [...] [Meta]
Complete Genomics Public Data - A diverse data set of whole human genomes are freely [...] [Meta]
CytoImageNet - A large-scale dataset of microscopy images. Contains 890,737 total grayscale [...] [Meta]
EBI ArrayExpress - ArrayExpress Archive of Functional Genomics Data stores data from high- [...] [Meta]
EBI Protein Data Bank in Europe - The Electron Microscopy Data Bank (EMDB) is a public [...] [Meta]
ENCODE project - The Encyclopedia of DNA Elements (ENCODE) Consortium is an ongoing [...] [Meta]
Electron Microscopy Pilot Image Archive (EMPIAR) - EMPIAR, the Electron Microscopy Public [...] [Meta]
Ensembl Genomes [Meta]
Gene Expression Omnibus (GEO) - GEO is a public functional genomics data repository [...] [Meta]
Gene Ontology (GO) - GO annotation files [Meta]
Global Biotic Interactions (GloBI) [Meta]
Harvard Medical School (HMS) LINCS Project - The Harvard Medical School (HMS) LINCS Center is [...] [Meta]
Human Genome Diversity Project - A group of scientists at Stanford University have [...] [Meta]
Human Microbiome Project (HMP) - The HMP sequenced over 2000 reference genomes isolated from [...] [Meta]
ICOS PSP Benchmark - The ICOS PSP benchmarks repository contains an adjustable real-world [...] [Meta]
International HapMap Project [Meta]
Journal of Cell Biology DataViewer [Meta]
KEGG - KEGG is a database resource for understanding high-level functions and utilities of [...] [Meta]
NCBI Proteins [Meta]
NCBI Taxonomy - The NCBI Taxonomy database is a curated set of names and classifications for [...] [Meta]
NCI Genomic Data Commons - The GDC Data Portal is a robust data-driven platform that allows [...] [Meta]
NIH Microarray data [Meta]
OpenSNP genotypes data - openSNP allows customers of direct-to-customer genetic tests to [...] [Meta]
Palmer Penguins - The goal of palmerpenguins is to provide a great dataset for data [...] [Meta]
Pathguide - Protein-Protein Interactions Catalog [Meta]
Protein Data Bank - This resource is powered by the Protein Data Bank archive-information [...] [Meta]
Psychiatric Genomics Consortium - The purpose of the Psychiatric Genomics Consortium (PGC) is [...] [Meta]
PubChem Project - PubChem is the world's largest collection of freely accessible chemical [...] [Meta]
PubGene (now Coremine Medical) - COREMINE™ is a family of tools developed by the Norwegian [...] [Meta]
Sanger Catalogue of Somatic Mutations in Cancer (COSMIC) - COSMIC, the Catalogue Of Somatic [...] [Meta]
Sanger Genomics of Drug Sensitivity in Cancer Project (GDSC) [Meta]
Sequence Read Archive (SRA) - The Sequence Read Archive (SRA) stores raw sequence data from [...] [Meta]
Serratus - Analysis of 7.1 million RNA/DNA sequencing datasets to discover the total [...] [Meta]
Stanford Microarray Data (Retired NOW) [Meta]
Stowers Institute Original Data Repository [Meta]
Systems Science of Biological Dynamics (SSBD) Database - Systems Science of Biological [...] [Meta]
The Cancer Genome Atlas (TCGA), available via Broad GDAC [Meta]
The Catalogue of Life - The Catalogue of Life is a quality-assured checklist of more than 1.8 [...] [Meta]
The Personal Genome Project - The Personal Genome Project, initiated in 2005, is a vision and [...] [Meta]
UCSC Public Data [Meta]
UniGene [Meta]
Universal Protein Resource (UniProt) - The Universal Protein Resource (UniProt) is a [...] [Meta]
Rfam - The Rfam database is a collection of RNA families, each represented by multiple [...] [Meta]
Chemistry
Ionic Liquids Database - ILThermo [Meta]
Climate+Weather
Actuaries Climate Index [Meta]
Australian Weather [Meta]
Aviation Weather Center - Consistent, timely and accurate weather information for the world [...] [Meta]
Brazilian Weather - Historical data (In Portuguese) - Data related to climate and weather [...] [Meta]
Canadian Meteorological Centre [Meta]
Caravan - a dataset for large-sample hydrology - Caravan is an open community dataset of [...] [Meta]
Climate Data from UEA (updated monthly) [Meta]
Dutch Weather - The KNMI Data Center (KDC) portal provides access to KNMI data on weather, [...] [Meta]
European Climate Assessment & Dataset [Meta]
German Climate Data Center [Meta]
Global Climate Data Since 1929 [Meta]
Charting The Global Climate Change News Narrative 2009-2020 - These four datasets represent [...] [Meta]
NASA Global Imagery Browse Services [Meta]
NOAA Bering Sea Climate [Meta]
NOAA Climate Datasets [Meta]
NOAA Realtime Weather Models [Meta]
NOAA SURFRAD Meteorology and Radiation Datasets [Meta]
Open-Meteo - Open-Source Weather API - Open-source weather API with free access for non- [...] [Meta]
The World Bank Open Data Resources for Climate Change [Meta]
UEA Climatic Research Unit [Meta]
WU Historical Weather Worldwide [Meta]
Washington Post Climate Change - To analyze warming temperatures in the United States, The [...] [Meta]
WorldClim - Global Climate Data [Meta]
ComplexNetworks
AMiner Citation Network Dataset [Meta]
CrossRef DOI URLs [Meta]
DBLP Citation dataset [Meta]
DIMACS Road Networks Collection [Meta]
NBER Patent Citations [Meta]
NIST complex networks data collection [Meta]
Network Repository with Interactive Exploratory Analysis Tools [Meta]
Protein-protein interaction network [Meta]
PyPI and Maven Dependency Network [Meta]
Scopus Citation Database [Meta]
Small Network Data [Meta]
Stanford GraphBase [Meta]
Stanford Large Network Dataset Collection [Meta]
Stanford Longitudinal Network Data Sources [Meta]
The Koblenz Network Collection [Meta]
The Laboratory for Web Algorithmics (UNIMI) [Meta]
UCI Network Data Repository [Meta]
UFL sparse matrix collection [Meta]
WSU Graph Database [Meta]
Community Resource for Archiving Wireless Data At Dartmouth - Contains datasets of pcap files [...] [Meta]
ComputerNetworks
3.5B Web Pages from CommonCrawl 2012 [Meta]
53.5B Web clicks of 100K users in Indiana Univ. [Meta]
CAIDA Internet Datasets [Meta]
CRAWDAD Wireless datasets from Dartmouth Univ. [Meta]
ClueWeb09 - 1B web pages [Meta]
ClueWeb12 - 733M web pages [Meta]
CommonCrawl Web Data over 7 years [Meta]
Shopper Intent Prediction from Clickstream E-Commerce Data with Minimal Browsing Information [Meta]
Criteo click-through data [Meta]
Internet-Wide Scan Data Repository [Meta]
MIRAGE-2019 - MIRAGE-2019 is a human-generated dataset for mobile traffic analysis with [...] [Meta]
OONI: Open Observatory of Network Interference - Internet censorship data [Meta]
Open Mobile Data by MobiPerf [Meta]
The Peer-to-Peer Trace Archive - Real-world measurements play a key role in studying the [...] [Meta]
Rapid7 Sonar Internet Scans [Meta]
UCSD Network Telescope, IPv4 /8 net [Meta]
CyberSecurity
CCCS-CIC-AndMal-2020 - The dataset includes 200K benign and 200K malware samples totalling to [...] [Meta]
Traffic and Log Data Captured During a Cyber Defense Exercise - This dataset was acquired [...] [Meta]
DataChallenges
AIcrowd Competitions [Meta]
Bruteforce Database [Meta]
Challenges in Machine Learning [Meta]
CrowdANALYTIX dataX [Meta]
D4D Challenge of Orange [Meta]
DrivenData Competitions for Social Good [Meta]
ICWSM Data Challenge (since 2009) [Meta]
KDD Cup by Tencent 2012 [Meta]
Kaggle Competition Data [Meta]
Localytics Data Visualization Challenge [Meta]
Netflix Prize [Meta]
Space Apps Challenge [Meta]
Telecom Italia Big Data Challenge [Meta]
TravisTorrent Dataset - MSR'2017 Mining Challenge [Meta]
TunedIT - Data mining & machine learning data sets, algorithms, challenges [Meta]
Yelp Dataset Challenge - The Yelp dataset is a subset of our businesses, reviews, and user [...] [Meta]
EarthScience
38-Cloud (Cloud Detection) - Contains 38 Landsat 8 scene images and their manually extracted [...] [Meta]
AQUASTAT - Global water resources and uses [Meta]
BODC - marine data of ~22K vars [Meta]
EOSDIS - NASA's earth observing system data [Meta]
Earth Models [Meta]
Global Wind Atlas - The Global Wind Atlas is a free, web-based application developed to help [...] [Meta]
Integrated Marine Observing System (IMOS) - roughly 30TB of ocean measurements [Meta]
Marinexplore - Open Oceanographic Data [Meta]
Alabama Real-Time Coastal Observing System [Meta]
National Estuarine Research Reserves System-Wide Monitoring Program - long-term estuarine [...] [Meta]
Oil and Gas Authority Open Data - The dataset covers 12,500 offshore wellbores, 5,000 seismic [...] [Meta]
Smithsonian Institution Global Volcano and Eruption Database [Meta]
USGS Earthquake Archives [Meta]
Wellhead Protection Area (protection zone) prediction using breakthrough curves - This [...] [Meta]
Economics
Asian Productivity Organization (APO) - The AEPM provides a graphic dashboard view of [...] [Meta]
ASEAN Stats - The ASEANstatsDataPortal was first launched in June 2018. The Portal is [...] [Meta]
American Economic Association (AEA) [Meta]
Asian KLEMS - Asia KLEMS is an Asian regional research consortium to promote building [...] [Meta]
Harvard Atlas of Economic Complexity - A database for people to explore global trade flows [...] [Meta]
BIS Financial Database - The files contain the same data as in the BIS Statistics Explorer [...] [Meta]
Barro-Lee Education Attainment - Barro-Lee Educational Attainment Data from 1950 to 2010. [...] [Meta]
CEPII Database - A database of the world economy, through its country and region profiles, in [...] [Meta]
EUKLEMS - EU KLEMS is an industry level, growth and productivity research project. EU KLEMS [...] [Meta]
Economic Freedom of the World Data [Meta]
Historical National Accounts - The datahub on Comparative Historical National Accounts [...] [Meta]
Historical MacroEconomic Statistics [Meta]
INFORUM - Interindustry Forecasting at the University of Maryland [Meta]
DBnomics – the world's economic database - Aggregates hundreds of millions of time series [...] [Meta]
International Trade Statistics [Meta]
Internet Product Code Database [Meta]
Joint External Debt Data Hub [Meta]
Jon Haveman International Trade Data Links [Meta]
Latin America KLEMS - LAKLEMS is a technical cooperation project financed by the Inter- [...] [Meta]
Long-Term Productivity Database - The Long-Term Productivity database was created as a [...] [Meta]
Maddison Project Database - The Maddison Project Database provides information on comparative [...] [Meta]
National Transfer Accounts - The goal of the National Transfer Accounts (NTA) project is to [...] [Meta]
OpenCorporates Database of Companies in the World [Meta]
Our World in Data [Meta]
Penn World Table - PWT version 10.0 is a database with information on relative levels of [...] [Meta]
SciencesPo World Trade Gravity Datasets [Meta]
The Atlas of Economic Complexity [Meta]
The Center for International Data [Meta]
The Observatory of Economic Complexity [Meta]
UN Commodity Trade Statistics [Meta]
UN Human Development Reports [Meta]
World Input-Output Database - World Input-Output Tables and underlying data, covering 43 [...] [Meta]
World KLEMS - Analytical KLEMS-type data sets for a broad set of countries around the world. [...] [Meta]
Education
College Scorecard Data [Meta]
New York State Education Department Data - The New York State Education Department (NYSED) is [...] [Meta]
Program for International Student Assessment (PISA) - Contains 15-year-old students' [...] [Meta]
Student Data from Free Code Camp [Meta]
Energy
AMPds - The Almanac of Minutely Power dataset [Meta]
BLUEd - Building-Level fUlly labeled Electricity Disaggregation dataset [Meta]
COMBED [Meta]
DBFC - Direct Borohydride Fuel Cell (DBFC) Dataset [Meta]
DEL - Domestic Electrical Load study datasets for South Africa (1994 - 2014) [Meta]
ECO - The ECO data set is a comprehensive data set for non-intrusive load monitoring and [...] [Meta]
EIA [Meta]
Global Power Plant Database - The Global Power Plant Database is a comprehensive, open source [...] [Meta]
HES - Household Electricity Study, UK [Meta]
HFED [Meta]
MORED: a Moroccan Buildings’ Electricity Consumption Dataset - Since spring of 2019, a data [...] [Meta]
Marktstammdatenregister - The German Marktstammdatenregister (MaStR) is a database of all [...] [Meta]
PEM1 - Proton Exchange Membrane (PEM) Fuel Cell Dataset [Meta]
PLAID - The Plug Load Appliance Identification Dataset [Meta]
The Public Utility Data Liberation Project (PUDL) - PUDL makes US energy data easier to [...] [Meta]
REDD [Meta]
SYND - A synthetic energy dataset for non-intrusive load monitoring - With SynD, we present a [...] [Meta]
Smart Meter Data Portal - The Smart Meter Data Portal is part of the National Science [...] [Meta]
Tracebase [Meta]
Ukraine Energy Centre Datasets [Meta]
UK-DALE - UK Domestic Appliance-Level Electricity [Meta]
WHITED [Meta]
iAWE [Meta]
Entertainment
Top Streamers on Twitch - This contains data of Top 1000 Streamers from past year. [Meta]
Finance
BIS Statistics - BIS statistics, compiled in cooperation with central banks and other [...] [Meta]
Blockmodo Coin Registry - A registry of JSON formatted information files that is primarily [...] [Meta]
CBOE Futures Exchange [Meta]
Complete FAANG Stock data - This data set contains all the stock data of FAANG companies from [...] [Meta]
Google Finance [Meta]
Google Trends [Meta]
NASDAQ [Meta]
NYSE Market Data [Meta]
OANDA [Meta]
OSU Financial data [Meta]
Quandl [Meta]
SEC EDGAR - EDGAR, the Electronic Data Gathering, Analysis, and Retrieval system, is the [...] [Meta]
St Louis Federal [Meta]
Yahoo Finance [Meta]
GIS
Awesome 3D Semantic City Models - Collection of open 3D semantic city and region models. [Meta]
ArcGIS Open Data portal [Meta]
Cambridge, MA, US, GIS data on GitHub [Meta]
Database of all continents, countries, States/Subdivisions/Provinces and Cities - Database [...] [Meta]
Factual Global Location Data [Meta]
IEEE Geoscience and Remote Sensing Society DASE Website [Meta]
Geo Maps - High Quality GeoJSON maps programmatically generated [Meta]
Geo Spatial Data from ASU [Meta]
Geo Wiki Project - Citizen-driven Environmental Monitoring [Meta]
GeoFabrik - OSM data extracted to a variety of formats and areas [Meta]
GeoNames Worldwide [Meta]
Global Administrative Areas Database (GADM) - Geospatial data organized by country. Includes [...] [Meta]
Homeland Infrastructure Foundation-Level Data [Meta]
Landsat 8 on AWS [Meta]
List of all countries in all languages [Meta]
National Weather Service GIS Data Portal [Meta]
Natural Earth - vectors and rasters of the world [Meta]
OpenAddresses [Meta]
OpenStreetMap (OSM) [Meta]
Pleiades - Gazetteer and graph of ancient places [Meta]
Reverse Geocoder using OSM data [Meta]
Robin Wilson - Free GIS Datasets [Meta]
Shadow Accrual Maps - The repository contains the accumulated shadow information for New York [...] [Meta]
TIGER/Line - U.S. boundaries and roads [Meta]
TZ Timezones shapefile [Meta]
TwoFishes - Foursquare's coarse geocoder [Meta]
UN Environmental Data [Meta]
World boundaries from the U.S. Department of State [Meta]
World countries in multiple formats [Meta]
Government
Alberta, Province of Canada [Meta]
Antwerp, Belgium [Meta]
Argentina (non official) [Meta]
Datos Argentina - Portal de datos abiertos de la República Argentina. Encontrá datos públicos [...] [Meta]
Austin, TX, US [Meta]
Australia (abs.gov.au) [Meta]
Australia (data.gov.au) [Meta]
Austria (data.gv.at) [Meta]
Baton Rouge, LA, US [Meta]
Beersheba, Israel - Open Data Portal (Smart7 OpenData) [Meta]
Belgium [Meta]
City of Berkeley Open Data [Meta]
Brazil [Meta]
Buenos Aires, Argentina [Meta]
Calgary, AB, Canada [Meta]
Cambridge, MA, US [Meta]
Canada [Meta]
Chicago [Meta]
Chile [Meta]
China [Meta]
Dallas Open Data [Meta]
DataBC - data from the Province of British Columbia [Meta]
Debt to the Penny - The Debt to the Penny dataset provides information about the total [...] [Meta]
Denver Open Data [Meta]
Durham, NC Open Data [Meta]
Edmonton, AB, Canada [Meta]
England LGInform [Meta]
EuroStat [Meta]
EveryPolitician - Ongoing project collating and sharing data on every politician. [Meta]
EveryPolitician - 正在进行的项目,整理和共享每个政治家的数据。 [元]
Federal Committee on Statistical Methodology (FCSM) (formerly FedStats) [Meta]
联邦统计方法委员会 (FCSM)(前身为 FedStats)[元]
Finland [Meta]
芬兰 [元]
France [Meta]
法国[元]
Fredericton, NB, Canada [Meta]
加拿大 NB 弗雷德里克顿 [元]
Gatineau, QC, Canada [Meta]
加蒂诺,QC,加拿大 [元]
Germany [Meta]
德国 [元]
Ghent, Belgium [Meta]
比利时根特 [元]
Glasgow, Scotland, UK [Meta]
英国苏格兰格拉斯哥 [元]
Greece [Meta]
希腊 [元]
Guardian world governments [Meta]
守护世界政府 [ Meta]
Halifax, NS, Canada [Meta]
加拿大新斯科舍省哈利法克斯 [元]
Helsinki Region, Finland [Meta]
芬兰赫尔辛基地区 [元]
Hong Kong, China [Meta]
中国香港 [元]
Houston, TX, US [Meta]
美国德克萨斯州休斯顿 [元]
Indian Government Data [Meta]
印度政府数据 [元]
Indonesian Data Portal [Meta]
印度尼西亚数据门户 [元]
Iowa - Welcome to the State of Iowa's data portal. Please explore data about Iowa and your [...] [Meta]
Ireland's Open Data Portal [Meta]
Israel's Open Data Portal [Meta]
Istanbul Municipality Open Data Portal [Meta]
Italy - Il Portale dati.gov.it è il catalogo nazionale dei metadati relativi ai dati [...] [Meta]
Jail deaths in America - The U.S. government does not release jail by jail mortality data, [...] [Meta]
Japan [Meta]
Laval, QC, Canada [Meta]
Lexington, KY [Meta]
London Datastore, UK [Meta]
London, ON, Canada [Meta]
Los Angeles Open Data [Meta]
Luxembourg - Luxembourgish Open Data Portal [Meta]
MassGIS, Massachusetts, U.S. [Meta]
Metropolitan Transportation Commission (MTC), California, US [Meta]
Mexico [Meta]
Mississauga, ON, Canada [Meta]
Moldova [Meta]
Moncton, NB, Canada [Meta]
Montreal, QC, Canada [Meta]
Mountain View, California, US (GIS) [Meta]
NYC Open Data [Meta]
NYC betanyc [Meta]
Netherlands [Meta]
New York Department of Sanitation Monthly Tonnage - DSNY Monthly Tonnage Data provides [...] [Meta]
New Zealand [Meta]
OECD [Meta]
Oakland, California, US [Meta]
Oklahoma [Meta]
Open Data for Africa [Meta]
Open Government Data (OGD) Platform India [Meta]
OpenDataSoft's list of 1,600 open data [Meta]
Oregon [Meta]
Ottawa, ON, Canada [Meta]
Palo Alto, California, US [Meta]
OpenDataPhilly - OpenDataPhilly is a catalog of open data in the Philadelphia region. In [...] [Meta]
Portland, Oregon [Meta]
Portugal - Pordata organization [Meta]
Puerto Rico Government [Meta]
Quebec City, QC, Canada [Meta]
Quebec Province of Canada [Meta]
Regina SK, Canada [Meta]
Rio de Janeiro, Brazil [Meta]
Romania [Meta]
Russia [Meta]
San Diego, CA [Meta]
San Antonio, TX - Community Information Now - CI:Now is a nonprofit serving Bexar (San [...] [Meta]
San Francisco Data sets [Meta]
San Jose, California, US [Meta]
San Mateo County, California, US [Meta]
Saskatchewan, Province of Canada [Meta]
Seattle [Meta]
Singapore Government Data [Meta]
South Africa Trade Statistics [Meta]
South Africa [Meta]
State of Utah, US [Meta]
Switzerland [Meta]
Taiwan gov [Meta]
Taiwan [Meta]
Tel-Aviv Open Data [Meta]
Texas Open Data [Meta]
The World Bank [Meta]
Toronto, ON, Canada [Meta]
Tunisia [Meta]
U.K. Government Data [Meta]
U.S. American Community Survey [Meta]
U.S. CDC Public Health datasets [Meta]
U.S. Census Bureau [Meta]
U.S. Department of Housing and Urban Development (HUD) [Meta]
U.S. Federal Government Agencies [Meta]
U.S. Federal Government Data Catalog [Meta]
U.S. Food and Drug Administration (FDA) [Meta]
U.S. National Center for Education Statistics (NCES) [Meta]
U.S. Open Government [Meta]
UK 2011 Census Open Atlas Project [Meta]
US Counties - This is a repository of various data, broken down by US county. While most of [...] [Meta]
U.S. Patent and Trademark Office (USPTO) Bulk Data Products [Meta]
Uganda Bureau of Statistics [Meta]
Ukraine [Meta]
United Nations [Meta]
Uruguay [Meta]
Valley Transportation Authority (VTA), California, US [Meta]
Vancouver, BC Open Data Catalog [Meta]
Victoria, BC, Canada [Meta]
Vienna, Austria [Meta]
Statistics from the General Statistics Office of Vietnam - Data in different categories are [...] [Meta]
U.S. Congressional Research Service (CRS) Reports [Meta]
Healthcare
AWS COVID-19 Datasets - We're working with organizations who make COVID-19-related data [...] [Meta]
COVID-19 Case Surveillance Public Use Data - The COVID-19 case surveillance system database [...] [Meta]
Covid-19 non-processed data of Ecuador - It's a project which provides non-processed datasets [...] [Meta]
2019 Novel Coronavirus COVID-19 Data Repository by Johns Hopkins CSSE - This is the data [...] [Meta]
Coronavirus (Covid-19) Data in the United States - The New York Times is releasing a series [...] [Meta]
COVID-19 Reported Patient Impact and Hospital Capacity by Facility - The following dataset [...] [Meta]
Composition of Foods Raw, Processed, Prepared USDA National Nutrient Database for Standard [...] [Meta]
The COVID Tracking Project - The COVID Tracking Project collects and publishes the most [...] [Meta]
EHDP Large Health Data Sets [Meta]
GDC - GDC supports several cancer genome programs for CCG, TCGA, TARGET etc. [Meta]
Gapminder World demographic databases [Meta]
MeSH, the vocabulary thesaurus used for indexing articles for PubMed [Meta]
MeDAL - A large medical text dataset curated for abbreviation disambiguation - Medical [...] [Meta]
Medicare Coverage Database (MCD), U.S. [Meta]
Medicare Data Engine of medicare.gov Data [Meta]
Medicare Data File [Meta]
Nightingale Open Science [Meta]
Number of Ebola Cases and Deaths in Affected Countries (2014) [Meta]
Open-ODS (structure of the UK NHS) [Meta]
OpenPaymentsData, Healthcare financial relationship data [Meta]
PhysioBank Databases - A large and growing archive of physiological data. [Meta]
The Cancer Imaging Archive (TCIA) [Meta]
The Cancer Genome Atlas project (TCGA) [Meta]
World Health Organization Global Health Observatory [Meta]
Yahoo Knowledge Graph COVID-19 Datasets - The Yahoo Knowledge Graph team at Verizon Media is [...] [Meta]
Informatics for Integrating Biology and the Bedside [Meta]
ImageProcessing
10k US Adult Faces Database [Meta]
2GB of Photos of Cats [Meta]
Adience Unfiltered faces for gender and age classification [Meta]
Affective Image Classification [Meta]
Airborne Object Detection and Tracking - The Airborne Object Tracking (AOT) dataset is a [...] [Meta]
Animals with attributes [Meta]
CADDY Underwater Stereo-Vision Dataset of divers' hand gestures - Contains 10K stereo pair [...] [Meta]
Cytology Dataset – CCAgT: Images of Cervical Cells with AgNOR Stain Technique - Contains 9339 [...] [Meta]
Caltech Pedestrian Detection Benchmark [Meta]
Chars74K dataset - Character Recognition in Natural Images (both English and Kannada are available) [Meta]
Cube++ - 4890 raw 18-megapixel images, each containing a SpyderCube color target in their [...] [Meta]
Densely Annotated Video Driving Data Set - This data set consists of 28 video sequences of [...] [Meta]
Danbooru Tagged Anime Illustration Dataset - A large-scale anime image database with 3.33m+ [...] [Meta]
DukeMTMC Data Set - DukeMTMC aims to accelerate advances in multi-target multi-camera [...] [Meta]
ETH Entomological Collection (ETHEC) Fine Grained Butterfly (Lepidoptera) Images [Meta]
Face Recognition Benchmark [Meta]
Flickr: 32 Class Brand Logos [Meta]
GDXray - X-ray images for X-ray testing and Computer Vision [Meta]
HumanEva Dataset - The HumanEva-I dataset contains 7 calibrated video sequences (4 grayscale [...] [Meta]
ImageNet (in WordNet hierarchy) [Meta]
Indoor Scene Recognition [Meta]
International Affective Picture System, UFL [Meta]
KITTI Vision Benchmark Suite [Meta]
Labeled Information Library of Alexandria - Biology and Conservation - Contains over 10 [...] [Meta]
MNIST database of handwritten digits, near 1 million examples [Meta]
Multi-View Region of Interest Prediction Dataset for Autonomous Driving - Contains 16 driving [...] [Meta]
Massive Visual Memory Stimuli, MIT [Meta]
Newspaper Navigator - This dataset consists of extracted visual content for 16,358,041 [...] [Meta]
Open Images From Google - Pictures with segmentation masks for 2.8 million object instances [...] [Meta]
RuFa - Contains images of text written in one of two Arabic fonts (Ruqaa and Nastaliq [...] [Meta]
SUN database, MIT [Meta]
SVIRO Synthetic Vehicle Interior Rear Seat Occupancy - 25.000 synthetic scenery's across ten [...] [Meta]
Several Shape-from-Silhouette Datasets [Meta]
Stanford Dogs Dataset [Meta]
The Action Similarity Labeling (ASLAN) Challenge [Meta]
The Oxford-IIIT Pet Dataset [Meta]
Violent-Flows - Crowd Violence / Non-violence Database and benchmark [Meta]
Visual Genome [Meta]
YouTube Faces Database [Meta]
MachineLearning
All-Age-Faces Dataset - Contains 13'322 Asian face images distributed across all ages (from 2 [...] [Meta]
Audi Autonomous Driving Dataset - We have published the Audi Autonomous Driving Dataset [...] [Meta]
B3FD - Facial age (and gender) estimation dataset with 375k images - The B3FD dataset is a [...] [Meta]
Context-aware data sets from five domains [Meta]
Delve Datasets for classification and regression [Meta]
Discogs Monthly Data [Meta]
Fluorescent Neuronal Cells - By releasing this dataset, we aim at providing a new testbed for [...] [Meta]
Free Music Archive [Meta]
IMDb Database [Meta]
Iranis - A Large-scale Dataset of Farsi/Arabic License Plate Characters [Meta]
Keel Repository for classification, regression and time series [Meta]
LLVIP - This dataset contains 30976 images, or 15488 pairs, most of which were taken at very [...] [Meta]
Labeled Faces in the Wild (LFW) [Meta]
Lending Club Loan Data [Meta]
Machine Learning Data Set Repository [Meta]
Million Song Dataset [Meta]
More Song Datasets [Meta]
MovieLens Data Sets [Meta]
New Yorker caption contest ratings [Meta]
RDataMining - "R and Data Mining" ebook data [Meta]
Registered Meteorites on Earth [Meta]
Restaurants Health Score Data in San Francisco [Meta]
TikTok Dataset - More than 300 dance videos that capture a single person performing dance [...] [Meta]
UCI Machine Learning Repository [Meta]
Yahoo! Ratings and Classification Data [Meta]
YouTube-BoundingBoxes [Meta]
YouTube-8M [Meta]
eBay Online Auctions (2012) [Meta]
Museums
Canada Science and Technology Museums Corporation's Open Data [Meta]
Cooper-Hewitt's Collection Database [Meta]
Metropolitan Museum of Art Collection API [Meta]
Minneapolis Institute of Arts metadata [Meta]
Natural History Museum (London) Data Portal [Meta]
Rijksmuseum Historical Art Collection [Meta]
Tate Collection metadata [Meta]
The Getty vocabularies [Meta]
NaturalLanguage
Automatic Keyphrase Extraction [Meta]
The Big Bad NLP Database [Meta]
Blizzard Challenge Speech - The speech + text data comes from professional audiobooks [...] [Meta]
Blogger Corpus [Meta]
CLiPS Stylometry Investigation Corpus [Meta]
ClueWeb09 FACC [Meta]
ClueWeb12 FACC [Meta]
DBpedia - Structured data from Wikipedia [Meta]
Dirty Words - With millions of images in our library and billions of user-submitted keywords, [...] [Meta]
Flickr Personal Taxonomies [Meta]
Freebase of people, places, and things [Meta]
German Political Speeches Corpus - Collection of political speeches from the German [...] [Meta]
Google Books Ngrams (2.2TB) [Meta]
Google MC-AFP - Generated based on the publicly available Gigaword dataset using Paragraph Vectors [Meta]
Google Web 5gram (1TB, 2006) [Meta]
Gutenberg eBooks List [Meta]
Hansards text chunks of Canadian Parliament [Meta]
LJ Speech - Speech dataset consisting of 13,100 short audio clips of a single speaker reading [...] [Meta]
M-AILabs Speech - The M-AILABS Speech Dataset is the first large dataset that we are [...] [Meta]
Microsoft MAchine Reading COmprehension Dataset (or MS MARCO) [Meta]
Machine Comprehension Test (MCTest) of text from Microsoft Research [Meta]
Machine Translation of European languages [Meta]
Making Sense of Microposts 2013 - Concept Extraction [Meta]
Making Sense of Microposts 2016 - Named Entity rEcognition and Linking [Meta]
Multi-Domain Sentiment Dataset (version 2.0) [Meta]
No Language Left Behind (NLLB - 200vo) - Dataset based on Meta's metadata for mined bitext. [...] [Meta]
Noisy speech database for training speech enhancement algorithms and TTS models - Clean and [...] [Meta]
Open Multilingual Wordnet [Meta]
POS/NER/Chunk annotated data [Meta]
Personae Corpus [Meta]
SMS Spam Collection in English [Meta]
SaudiNewsNet Collection of Saudi Newspaper Articles (Arabic, 30K articles) [Meta]
Stanford Question Answering Dataset (SQuAD) [Meta]
USENET postings corpus of 2005~2011 [Meta]
Universal Dependencies [Meta]
Webhose - News/Blogs in multiple languages [Meta]
Wikidata - Wikipedia databases [Meta]
Wikipedia Links data - 40 Million Entities in Context [Meta]
WordNet databases and tools [Meta]
Wordbank - Open, de-identified database of vocabulary development from 84,138 children and [...] [Meta]
WorldTree Corpus of Explanation Graphs for Elementary Science Questions - a corpus of [...] [Meta]
Neuroscience
Allen Institute Datasets [Meta]
Brain Catalogue [Meta]
Brainomics [Meta]
CodeNeuro Datasets [Meta]
Collaborative Research in Computational Neuroscience (CRCNS) [Meta]
FCP-INDI [Meta]
Human Connectome Project [Meta]
NDAR [Meta]
NIMH Data Archive [Meta]
NeuroData [Meta]
NeuroMorpho - NeuroMorpho.Org is a centrally curated inventory of digitally reconstructed [...] [Meta]
Neuroelectro [Meta]
OASIS [Meta]
OpenNEURO [Meta]
OpenfMRI [Meta]
Study Forrest [Meta]
The Nencki-Symfonia EEG/ERP dataset - A high-density electroencephalography (EEG) dataset [...] [Meta]
Physics
CERN Open Data Portal [Meta]
Crystallography Open Database [Meta]
IceCube - South Pole Neutrino Observatory [Meta]
LIGO Open Science Center (LOSC) - Gravitational wave data from the LIGO Hanford and [...] [Meta]
NASA Exoplanet Archive [Meta]
NSSDC (NASA) data of 550 spacecraft [Meta]
Quantum simulations of an electron in a two dimensional potential well - The data was [...] [Meta]
Sloan Digital Sky Survey (SDSS) - Mapping the Universe [Meta]
ProstateCancer
EOPC-DE-Early-Onset-Prostate-Cancer-Germany - Early Onset Prostate Cancer - Germany. [...] [Meta]
GENIE - Data from the Genomics Evidence Neoplasia Information Exchange (GENIE) project of the [...] [Meta]
Genomic-Hallmarks-Prostate-Adenocarcinoma-CPC-GENE - Comprehensive genomic profiling of 477 [...] [Meta]
MSK-IMPACT-Clinical-Sequencing-Cohort-MSKCC-Prostate-Cancer - Targeted sequencing of clinical [...] [Meta]
Metastatic-Prostate-Adenocarcinoma-MCTP - Comprehensive profiling of 61 prostate cancer [...] [Meta]
Metastatic-Prostate-Cancer-SU2CPCF-Dream-Team - Comprehensive analysis of 150 metastatic [...] [Meta]
NPCR-2001-2015 - Database from CDC's National Program of Cancer Registries (NPCR). The [...] [Meta]
NPCR-2005-2015 - Database from CDC's National Program of Cancer Registries (NPCR). The [...] [Meta]
NaF-Prostate - NaF Prostate is a collection of F-18 NaF positron emission tomography/computed [...] [Meta]
Neuroendocrine-Prostate-Cancer - Whole exome and RNA Seq data of castration resistant [...] [Meta]
PLCO-Prostate-Diagnostic-Procedures - The Prostate Diagnostic Procedures dataset (95,837 [...] [Meta]
PLCO-Prostate-Medical-Complications - The Prostate Medical Complications dataset (3,350 [...] [Meta]
PLCO-Prostate-Screening-Abnormalities - The Prostate Screening Abnormalities dataset (10,527 [...] [Meta]
PLCO-Prostate-Screening - The Prostate Screening dataset (177,315 records, 35,875 subjects, [...] [Meta]
PLCO-Prostate-Treatments - The Prostate Treatments dataset (13,409 records, 7,614 subjects, [...] [Meta]
PLCO-Prostate - The Prostate dataset is a comprehensive dataset that contains nearly all the [...] [Meta]
PRAD-CA-Prostate-Adenocarcinoma-Canada - Prostate Adenocarcinoma - Canada. Collected by the [...] [Meta]
PRAD-FR-Prostate-Adenocarcinoma-France - Prostate Adenocarcinoma - France. Collected by ten [...] [Meta]
PRAD-UK-Prostate-Adenocarcinoma-United-Kingdom - Prostate Adenocarcinoma - United Kingdom. [...] [Meta]
PROSTATEx-Challenge - Retrospective set of prostate MR studies. All studies included [...] [Meta]
Prostate-3T - The Prostate-3T project provided imaging data to TCIA as part of an ISBI [...] [Meta]
Prostate-Adenocarcinoma-Broad-Cornell-2012 - Comprehensive profiling of 112 prostate cancer [...] [Meta]
Prostate-Adenocarcinoma-Broad-Cornell-2013 - Comprehensive profiling of 57 prostate cancer [...] [Meta]
Prostate-Adenocarcinoma-CNA-study-MSKCC - Copy-number profiling of 103 primary prostate [...] [Meta]
Prostate-Adenocarcinoma-Fred-Hutchinson-CRC - Comprehensive profiling of prostate cancer [...] [Meta]
Prostate Adenocarcinoma (MSKCC/DFCI) - Whole Exome Sequencing of 1013 prostate cancer samples. [Meta]
Prostate-Adenocarcinoma-MSKCC - MSKCC Prostate Oncogenome Project. 181 primary, 37 metastatic [...] [Meta]
Prostate-Adenocarcinoma-Organoids-MSKCC - Exome profiling of prostate cancer samples and [...] [Meta]
Prostate-Adenocarcinoma-Sun-Lab - Whole-genome and Transcriptome Sequencing of 65 Prostate [...] [Meta]
Prostate-Adenocarcinoma-TCGA-PanCancer-Atlas - Comprehensive TCGA PanCanAtlas data from 11k [...] [Meta]
Prostate-Adenocarcinoma-TCGA - Integrated profiling of 333 primary prostate adenocarcinoma samples. [Meta]
Prostate-Diagnosis - PCa T1- and T2-weighted magnetic resonance images (MRIs) were acquired [...] [Meta]
Prostate-Fused-MRI-Pathology - The Prostate Fused-MRI-Pathology collection is a combination [...] [Meta]
Prostate-MRI - The Prostate-MRI collection of prostate Magnetic Resonance Images (MRIs) was [...] [Meta]
Prostate-R - The R package 'ElemStatLearn' contains a prostate cancer dataset from Stamey et [...] [Meta]
QIN-PROSTATE-Repeatability - The QIN-PROSTATE-Repeatability dataset is a dataset with [...] [Meta]
QIN-PROSTATE - The QIN PROSTATE collection of the Quantitative Imaging Network (QIN) contains [...] [Meta]
SEER-YR1973_2015.SEER9 - The SEER November 2017 Research Data files from nine SEER registries [...] [Meta]
SEER-YR1992_2015.SJ_LA_RG_AK - The SEER November 2017 Research Data files from the San Jose- [...] [Meta]
SEER-YR2000_2015.CA_KY_LO_NJ_GA - The SEER November 2017 Research Data files from the Greater [...] [Meta]
SEER-YR2000_2015.CA_KY_LO_NJ_GA - The July - December 2005 diagnoses for Louisiana from their [...] [Meta]
TCGA-PRAD-US - TCGA Prostate Adenocarcinoma (499 samples). [Meta]
Psychology+Cognition
OSU Cognitive Modeling Repository Datasets [Meta]
Open Cognitive Science Data - Publicly available behavioral datasets from across cognitive [...] [Meta]
PublicDomains
Ably Open Realtime Data [Meta]
Amazon [Meta]
Archive.org Datasets [Meta]
Archive-it from Internet Archive [Meta]
CMU JASA data archive [Meta]
CMU StatLab collections [Meta]
Data.World [Meta]
Data360 [Meta]
Enigma Public [Meta]
Google [Meta]
Grand Comics Database - The Grand Comics Database (GCD) is a nonprofit, internet-based [...] [Meta]
Infochimps [Meta]
KDNuggets Data Collections [Meta]
Microsoft Azure Data Market Free DataSets [Meta]
Microsoft Data Science for Research [Meta]
Microsoft Research Open Data [Meta]
Open Library Data Dumps [Meta]
Reddit Datasets [Meta]
RevolutionAnalytics Collection [Meta]
Sample R data sets [Meta]
Stack Overflow Annual Developer Survey - Annual developer surveys full data sets from 2011 [...] [Meta]
StatSci.org [Meta]
Stats4Stem R data sets (archived) [Meta]
The Washington Post List [Meta]
UCLA SOCR data collection [Meta]
UFO Reports [Meta]
Wikileaks 911 pager intercepts [Meta]
Yahoo Webscope [Meta]
SearchEngines
Academic Torrents of data sharing from UMB [Meta]
Base dos Dados - Data Basis: Open Data Repository for Brazil [Meta]
Datahub.io [Meta]
Domains Project - Sorted list of Internet domains [Meta]
Harvard Dataverse Network of scientific data [Meta]
ICPSR (UMICH) [Meta]
Institute of Education Sciences [Meta]
National Technical Reports Library [Meta]
Open Data Certificates (beta) [Meta]
OpenDataNetwork - A search engine of all Socrata powered data portals [Meta]
Statista.com - Statistics and Studies [Meta]
Zenodo - An open dependable home for the long-tail of science [Meta]
SocialNetworks
2021 Portuguese Elections Twitter Dataset - 57M+ tweets, 1M+ users - This dataset contains [...] [Meta]
72 hours #gamergate Twitter Scrape [Meta]
CMU Enron Email of 150 users [Meta]
Cheng-Caverlee-Lee September 2009 - January 2010 Twitter Scrape [Meta]
China Biographical Database - The China Biographical Database is a freely accessible [...] [Meta]
Clubhouse Dataset [Meta]
A Twitter Dataset of 40+ million tweets related to COVID-19 - Due to the relevance of the [...] [Meta]
43k+ Donald Trump Twitter Screenshots - This archive contains screenshots of 43,475 Donald [...] [Meta]
EDRM Enron EMail of 151 users, hosted on S3 [Meta]
Facebook Data Scrape (2005) [Meta]
Facebook Social Connectedness Index - We use an anonymized snapshot of all active Facebook [...] [Meta]
Facebook Social Networks from LAW (since 2007) [Meta]
Foursquare from UMN/Sarwat (2013) [Meta]
GitHub Collaboration Archive [Meta]
Google Scholar citation relations [Meta]
High-Resolution Contact Networks from Wearable Sensors [Meta]
Indie Map: social graph and crawl of top IndieWeb sites [Meta]
Mobile Social Networks from UMASS [Meta]
Network Twitter Data [Meta]
Reddit Comments [Meta]
Skytrax' Air Travel Reviews Dataset [Meta]
Social Twitter Data [Meta]
SourceForge.net Research Data [Meta]
The Reddit COVID dataset - This dataset attempts to capture the full extent of COVID-19 [...] [Meta]
Twitch Top Streamer's Data [Meta]
Twitter Data for Online Reputation Management [Meta]
Twitter Data for Sentiment Analysis [Meta]
用于情感分析的 Twitter 数据 [元]
Twitter Graph of entire Twitter site [Meta]
整个 Twitter 网站的 Twitter 图 [元]
Twitter Scrape Calufa May 2011 [Meta]
Twitter 抓取 Calufa 2011 年 5 月 [元]
UNIMI/LAW Social Network Datasets [Meta]
UNIMI/LAW 社交网络数据集 [元]
United States Congress Twitter Data - Daily datasets with tweets of 1100+ accounts associated [...] [Meta]
美国国会 Twitter 数据 - 包含 1100 多个关联帐户推文的每日数据集 [...] [ Meta]
Yahoo! Graph and Social Data [Meta]
雅虎!图和社交数据[元]
Youtube Video Social Graph in 2007,2008 [Meta]
2007,2008 年 Youtube 视频社交图谱 [ Meta]
SocialSciences
ACLED (Armed Conflict Location & Event Data Project) [Meta]
Authoritarian Ruling Elites Database - The Authoritarian Ruling Elites Database (ARED) is a [...] [Meta]
Canadian Legal Information Institute [Meta]
Center for Systemic Peace Datasets - Conflict Trends, Polities, State Fragility, etc [Meta]
Correlates of War Project [Meta]
Cryptome Conspiracy Theory Items [Meta]
Datacards [Meta]
European Social Survey [Meta]
FBI Hate Crime 2013 - aggregated data [Meta]
Fragile States Index [Meta]
GDELT Global Events Database [Meta]
General Social Survey (GSS) since 1972 [Meta]
German Social Survey [Meta]
Global Religious Futures Project [Meta]
Gun Violence Data - A comprehensive, accessible database that contains records of over 260k [...] [Meta]
Humanitarian Data Exchange [Meta]
INFORM Index for Risk Management [Meta]
Institute for Demographic Studies [Meta]
International Networks Archive [Meta]
International Social Survey Program ISSP [Meta]
International Studies Compendium Project [Meta]
James McGuire Cross National Data [Meta]
MIT Reality Mining Dataset [Meta]
MacroData Guide by Norsk samfunnsvitenskapelig datatjeneste [Meta]
Mass Mobilization Data Project - The Mass Mobilization (MM) data are an effort to understand [...] [Meta]
Microsoft Academic Knowledge Graph - The Microsoft Academic Knowledge Graph is a large RDF [...] [Meta]
Minnesota Population Center [Meta]
Notre Dame Global Adaptation Index (ND-GAIN) [Meta]
Open Crime and Policing Data in England, Wales and Northern Ireland [Meta]
OpenSanctions - A global database of persons and companies of political, criminal, or [...] [Meta]
Paul Hensel General International Data Page [Meta]
PewResearch Internet Survey Project [Meta]
PewResearch Society Data Collection [Meta]
Political Polarity Data [Meta]
StackExchange Data Explorer [Meta]
Terrorism Research and Analysis Consortium [Meta]
Texas Inmates Executed Since 1984 [Meta]
Titanic Survival Data Set [Meta]
UCB's Archive of Social Science Data (D-Lab) [Meta]
UCLA Social Sciences Data Archive [Meta]
UN Civil Society Database [Meta]
UPJOHN for Labor Employment Research [Meta]
Universities Worldwide [Meta]
Uppsala Conflict Data Program [Meta]
World Bank Open Data [Meta]
World Inequality Database - The World Inequality Database (WID.world) aims to provide open [...] [Meta]
WorldPop project - Worldwide human population distributions [Meta]
Software
FLOSSmole data about free, libre, and open source software development [Meta]
GHTorrent - Scalable, queryable, offline mirror of data offered through the GitHub REST API. [Meta]
Libraries.io Open Source Repository and Dependency Metadata [Meta]
Public Git Archive - a Big Code dataset for all – dataset of 182,014 top-bookmarked Git [...] [Meta]
Code duplicates - 2k Java file and 600 Java function pairs labeled as similar or different by [...] [Meta]
Commit messages - 1.3 billion GitHub commit messages till March 2019 [Meta]
Pull Request review comments - 25.3 million GitHub PR review comments since January 2015 till [...] [Meta]
Source Code Identifiers - 41.7 million distinct splittable identifiers collected from 182,014 [...] [Meta]
Sports
American Ninja Warrior Obstacles - Contains every obstacle in the history of American Ninja [...] [Meta]
Betfair Historical Exchange Data [Meta]
Cricsheet Matches (cricket) [Meta]
Equity in Athletics - The Equity in Athletics Data Analysis Cutting Tool is brought to you by [...] [Meta]
Ergast Formula 1, from 1950 up to date (API) [Meta]
Football/Soccer resources (data and APIs) [Meta]
Lahman's Baseball Database [Meta]
NFL play-by-play data - NFL play-by-play data sourced from: [...] [Meta]
Pinhooker: Thoroughbred Bloodstock Sale Data [Meta]
Pro Kabadi season 1 to 7 - Pro Kabadi League is a professional-level Kabaddi league in India. [...] [Meta]
Retrosheet Baseball Statistics [Meta]
Tennis database of rankings, results, and stats for ATP [Meta]
Tennis database of rankings, results, and stats for WTA [Meta]
Transfermarkt Datasets - Clean, structured and automatically updated football (soccer) data [...] [Meta]
USA Soccer Teams and Locations - USA soccer teams and locations. MLS, NWSL, and USL [...] [Meta]
TimeSeries
3W dataset - To the best of its authors' knowledge, this is the first realistic and public [...] [Meta]
Databanks International Cross National Time Series Data Archive [Meta]
Hard Drive Failure Rates [Meta]
Heart Rate Time Series from MIT [Meta]
Time Series Data Library (TSDL) from MU [Meta]
Turing Change Point Dataset - Contains 42 annotated time series collected for the development [...] [Meta]
UC Riverside Time Series Dataset [Meta]
Transportation
Airlines OD Data 1987-2008 [Meta]
Ford GoBike Data (formerly Bay Area Bike Share Data) [Meta]
Bike Share Systems (BSS) collection [Meta]
Dutch Traffic Information [Meta]
GeoLife GPS Trajectory from Microsoft Research [Meta]
German train system by Deutsche Bahn [Meta]
Hubway Million Rides in MA [Meta]
Melbourne Pedestrian Counting - This dataset contains hourly pedestrian counts since 2009 [...] [Meta]
Montreal BIXI Bike Share [Meta]
NYC Taxi Trip Data 2009- [Meta]
NYC Taxi Trip Data 2013 (FOIA/FOILed) [Meta]
NYC Uber trip data April 2014 to September 2014 [Meta]
Open Traffic collection [Meta]
OpenFlights - airport, airline and route data [Meta]
Philadelphia Bike Share Stations (JSON) [Meta]
Plane Crash Database, since 1920 [Meta]
RITA Airline On-Time Performance data [Meta]
RITA/BTS transport data collection (TranStat) [Meta]
Renfe (Spanish National Railway Network) dataset [Meta]
Toronto Bike Share Stations (JSON and GBFS files) [Meta]
Transport for London (TFL) [Meta]
Travel Tracker Survey (TTS) for Chicago [Meta]
U.S. Bureau of Transportation Statistics (BTS) [Meta]
U.S. Domestic Flights 1990 to 2009 [Meta]
U.S. Freight Analysis Framework since 2007 [Meta]
U.S. National Highway Traffic Safety Administration - Fatalities since 1975 - Contains CSV [...] [Meta]
eSports
CS:GO Competitive Matchmaking Data - In this data set we have data about the CSGO matchmaking [...] [Meta]
FIFA-2021 Complete Player Dataset [Meta]
OpenDota data dump [Meta]
Complementary Collections
- [Data Packaged Core Datasets](https://github.com/datasets/)
- OpenDataMonitor: An overview of available open data resources in Europe
- Quora: Where can I find large datasets open to the public?
- RS.io: 100+ Interesting Data Sets for Statistics
- CVonline: Image Databases
- InnoTrek: Leveraging open data to understand urban lives
- CV Papers: CV Datasets on the web
Special thanks to