OpenStreetMap is a project aimed squarely at creating and providing free geographic data such as street maps to anyone who wants them. It is a free editable map of the whole world. It is made by people like you.



BookBrainz is a project to create an online database of information about every single book, magazine, journal and other publication ever written. We make all the data that we collect available to the whole world to consume and use as they see fit.
Platform for community contribution and management of location-based experiences, enabling users to enhance shared place data, reach tiered levels, and support developer tools.




Powering current and next-generation map products by creating reliable, easy-to-use, and interoperable open map data.

Sharing data is hard. Emails have size limits, and setting up servers is too much work. We've designed a distributed system for sharing enormous datasets - for researchers, by researchers. The result is a scalable, secure, and fault-tolerant repository for data, with blazing...
KEEL is an open source (GPLv3) Java software tool for assessing evolutionary algorithms on data mining problems, including regression, classification, clustering, and pattern mining. It contains a large collection of classical knowledge extraction algorithms, preprocessing...


SweetData.io is a data marketplace where users can search for, buy, sell, and download datasets intended for machine learning and big data purposes.



Quandl is a search engine for numerical data. The platform offers access to millions of open and free financial, economic, and social datasets that are indexed from hundreds of sources. It provides all users with options to download the datasets in various formats, or to access...
MetricsBot provides a comprehensive dataset containing domain names and websites. MetricsBot provides an easy and convenient way to view website statistics such as daily traffic, estimated value, Alexa rank, whois information, backlinks and much more.

Ouro is a collaborative web platform for creative problem solvers to share and monetize their work.




Make Lorem ipsum less monotonous
Rich data: No need to worry about creating Lorem ipsum data anymore. We have provided articles, user profiles, products, comments and to-do lists for you. Moreover, we will add more datasets in the future.
Use ChatGPT to fill in data: No need to w.

Label Studio is an open source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats.

Powerdrill AI is an AI SaaS service centered around personal and enterprise datasets, delivering swift insights from your knowledge and data.


DataChain builds a suite of tools for data preprocessing and management, experiment tracking, ML models versioning, and pipeline automation.
Get unique, filtered, reliable geo data sets via natural language. "All power plants in Illinois"? "Schools in San Francisco"? "Bike lanes in Amsterdam"? Type in what you are looking for and get the corresponding data set in seconds.

Companywell is a B2B lead generation data platform that allows teams to build the perfect outreach list. We combine humans, machine learning, and the public internet to compile structured data in an easy-to-use platform and robust API.



Dataset Search is a search engine for datasets.
With a simple keyword search, users can discover datasets hosted in thousands of repositories across the Web.


Train AI models with your data in minutes, not weeks, and get better performance at lower cost. Integrates with open-source and proprietary foundation models.


