Data mining

  • YabTab

    Free Personal Software as a Service (SaaS) Website

    YabTab automatically converts web pages to tables. There is a huge amount of information on the web: think of product listing pages, course catalogues, job postings and reports, all of which are essentially tables. A product listing page, for example, is a table with one row per product and columns for product information such as name, features and price.
    However, current scraping tools either require extensive configuration to extract such information or are domain-specific (and even then rarely work). The idea behind YabTab is to build a tool that can auto-extract such "tabular" information from any website, irrespective of its domain or underlying structure and technology. YabTab uses machine learning techniques to recognize these patterns in any web page, a task that has so far required a human.


  • Simplescraper

    Freemium Web Chrome Software as a Service (SaaS) Website

    A web scraper that's fast, intuitive and 100% free to use in the browser. Download data from websites and tables in seconds.


  • Wintr

    Free Web Website

    Free proxy service and web scraping API that allows you to scrape and parse any webpage's HTML with Cheerio to turn it into a personalized item dataset.


  • Linux-based web scraping

  • Textricator

    Free Mac Windows Linux Website

    Textricator is a tool for extracting text from computer-generated PDFs and generating structured data. It is aimed at cases where you have a bunch of PDFs with the same format (or one big, consistently formatted PDF) and want to extract the data to CSV or JSON.
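
    The "consistently formatted text to CSV or JSON" step can be sketched as follows. This is a minimal Python illustration, not Textricator's own parser or configuration: the page text, the field names (`case`, `name`, `fine`) and the regular expression are all made up, and the sketch assumes the text has already been extracted from the PDF.

```python
import csv
import io
import json
import re

# Made-up page text standing in for what a PDF text extractor might return.
PAGE_TEXT = """\
Case: 2020-001  Name: Ada Lovelace  Fine: 120.50
Case: 2020-002  Name: Alan Turing  Fine: 75.00
"""

# Hypothetical fixed layout: three labelled fields per line.
RECORD = re.compile(
    r"Case:\s*(?P<case>\S+)\s+Name:\s*(?P<name>.+?)\s+Fine:\s*(?P<fine>[\d.]+)"
)


def parse_records(text):
    """Turn each line that matches the fixed layout into a dict."""
    return [m.groupdict() for m in map(RECORD.search, text.splitlines()) if m]


def to_csv(records):
    """Serialize the records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["case", "name", "fine"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = parse_records(PAGE_TEXT)
print(to_csv(records))
print(json.dumps(records, indent=2))
```

    Because every record follows the same layout, one pattern is enough for the whole file; lines that do not match are simply skipped.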


  • PHPMediaServer

    Free Linux Self-Hosted Website

    Open-source web media server for browsing and streaming any video file format supported by ffmpeg, with an easy web interface for playback on any platform with an HTML5 browser, DLNA or a Kodi plugin.


  • Web-based web scraping

  • Linux-based data mining

  • Orange

    Free Mac Windows Linux Website

    Orange is an open-source, cross-platform data mining and machine learning suite. It features visual programming as an intuitive means of combining data analysis and interactive visualization methods into powerful workflows. Visual programming enables users who are not programmers to manage, preprocess, explore and model data. With its broad feature set, Orange can make data mining and machine learning easier for novice and expert users alike.


  • KNIME

    Free Mac Windows Linux Website

    KNIME is an open-source, cross-platform Java application whose name stands for "Konstanz Information Miner". It is used extensively for data mining, data analysis and optimization. It can be downloaded as the core application itself (KNIME Desktop) or as the whole SDK, which is based on Eclipse Helios.

    KNIME also works with various kinds of extensions, which are available under the "/downloads/extensions" section of the website.


  • Beaker

    Free Mac Windows Linux Website

    The Beaker Notebook is a new open source tool for research and data science. Its advanced UI allows you to focus on your data and your science, instead of getting frustrated by your tool. We designed it to be polyglot from the ground up. That is, a single notebook may contain code from multiple different languages that communicate with one another through a unique feature called autotranslation. You can set a variable in a Python cell and then read that variable in a subsequent R cell, and everything just works, magically. Beaker comes with built-in support for Python, R, JavaScript, Scala, Groovy, Julia, Clojure, and K.


  • SpagoBI

    Free Windows Linux Website

    SpagoBI describes itself as the only entirely open source Business Intelligence suite. It covers all the analytical areas of Business Intelligence projects, with innovative themes and engines. SpagoBI offers a wide range of analytical tools: Reporting, Multidimensional Analysis (OLAP), Charts, Dashboards, KPI, Interactive Cockpits, Ad-Hoc Reporting, In-memory Analysis, Geographical Analysis (GEO/GIS), Free Inquiry (Query by Example), Smart Filter, Data Mining, Real Time Console, Accessible Reporting, Analytical Dossier, Office Automation, ETL, ...


  • WEKA

    Free Mac Windows Linux Website

    Weka is a collection of machine learning algorithms for data mining tasks, with its own GUI.

    (The application is named after a flightless bird of New Zealand that is very inquisitive.)

    The algorithms can either be applied directly to a dataset or called from your own Java code.

    Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.


  • Stan

    Free Mac Windows Linux Python Stata ... R (programming language) Julia MATLAB Website

    Stan is a probabilistic programming language for data analysis, enabling automatic inference for a large class of statistical models.


  • dradis

    Freemium Mac Windows Linux BSD Website

    Dradis is an open source framework to enable effective information sharing, especially during security assessments.

    Dradis is a self-contained web application that provides a centralized repository of information to keep track of what has been done so far and what is still ahead.

    Features include:
    Easy report generation.
    Support for attachments.
    Integration with existing systems and tools through server plugins.
    Platform independent.


  • sn0int

    Free Mac Windows Linux BSD Website

    sn0int is a semi-automatic OSINT framework and package manager. It was built for IT security professionals and bug hunters to gather intelligence about a given target or about yourself. sn0int enumerates attack surface by semi-automatically processing public information and mapping the results into a unified format for follow-up investigations.


  • R Caret

    Free Personal Mac Windows Linux R (programming language) Website

    The caret package (short for Classification And REgression Training) is a set of functions that attempt to streamline the process of creating predictive models. It contains tools for data splitting, pre-processing, feature selection, and model tuning using resampling.
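
    "Model tuning using resampling" can be illustrated with a short sketch. It is written in plain Python rather than R and does not use or mirror caret's functions: the helper names (`k_fold_indices`, `knn_predict`, `cv_error`), the synthetic data and the candidate values of k are all assumptions made for illustration. The idea is to split the data into folds, estimate the error of each candidate setting on held-out folds, and keep the setting with the lowest estimated error.

```python
import random
from statistics import mean


def k_fold_indices(n, folds, seed=0):
    """Shuffle 0..n-1 and split the indices into `folds` held-out sets."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::folds] for i in range(folds)]


def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return mean(y for _, y in nearest)


def cv_error(data, k, folds=5):
    """Mean squared error of k-NN regression, estimated by k-fold resampling."""
    errors = []
    for held_out in k_fold_indices(len(data), folds):
        held = set(held_out)
        train = [row for i, row in enumerate(data) if i not in held]
        for i in held_out:
            x, y = data[i]
            errors.append((knn_predict(train, x, k) - y) ** 2)
    return mean(errors)


# Synthetic data, made up for illustration: y = 2x plus Gaussian noise.
rng = random.Random(42)
data = [(x, 2 * x + rng.gauss(0, 1)) for x in range(30)]

# Tune the hyperparameter k by comparing cross-validated errors.
results = {k: cv_error(data, k) for k in (1, 3, 5, 9)}
best_k = min(results, key=results.get)
print(results)
print("best k:", best_k)
```

    Every candidate is scored on the same folds (the seed is fixed), so the comparison between settings is not distorted by different random splits.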


  • Pyspread

    Free Mac Windows Linux Website

    Pyspread is a non-traditional spreadsheet application that is based on and written in the programming language Python.

    The goal of pyspread is to be the most pythonic spreadsheet.

    Pyspread expects Python expressions in its grid cells, which makes a spreadsheet-specific language obsolete. Each cell returns a Python object that can be accessed from other cells. These objects can represent anything, including lists or matrices.
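
    The idea of cells that hold Python expressions and return arbitrary objects can be sketched in a few lines. This is a toy model, not pyspread's implementation or its cell-access syntax: the `Grid` class and the `cell(row, col)` accessor are hypothetical names chosen for the example.

```python
class Grid:
    """Toy spreadsheet: every cell stores the source of a Python expression."""

    def __init__(self):
        self.code = {}  # (row, col) -> expression source

    def set(self, row, col, expression):
        self.code[(row, col)] = expression

    def value(self, row, col, _active=frozenset()):
        """Evaluate a cell's expression and return the resulting object."""
        key = (row, col)
        if key in _active:
            raise ValueError(f"circular reference at {key}")
        # Expressions reach other cells through the hypothetical cell(r, c).
        scope = {"cell": lambda r, c: self.value(r, c, _active | {key})}
        return eval(self.code[key], scope)


grid = Grid()
grid.set(0, 0, "[1, 2, 3]")                     # a cell can hold a list
grid.set(0, 1, "[x * 10 for x in cell(0, 0)]")  # and be read by another cell
grid.set(1, 0, "sum(cell(0, 1))")
print(grid.value(1, 0))
```

    Note that `eval` executes arbitrary code, so a sketch like this is only appropriate for expressions you trust.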


  • DataMelt

    Free Personal Mac Windows Linux Android Website

    DataMelt (or DMelt) is a program for numeric computation, statistics, data analysis and data visualization.
    This multiplatform program is integrated with Jython (Python), Groovy, JRuby and BeanShell on the Java platform. DMelt can be used to plot functions and data in 2D and 3D, perform statistical tests, data mining, numeric computations, function minimization and linear algebra, and solve systems of linear and differential equations. Linear, non-linear and symbolic regression are also available.


  • ELKI

    Free Mac Windows Linux Website

    ELKI: "Environment for Developing KDD-Applications Supported by Index-Structures" is a development framework for data mining algorithms written in Java. It includes a large variety of popular data mining algorithms, distance functions and index structures.

    Its focus is particularly on clustering and outlier detection methods, in contrast to many other data mining toolkits that focus on classification. Additionally, it includes support for index structures to improve algorithm performance such as R*-Tree and M-Tree.

    The modular architecture is meant to allow adding custom components such as distance functions or algorithms, while being able to reuse the other parts for evaluation.
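
    The combination of outlier detection with interchangeable distance functions can be illustrated with a small sketch. It is plain Python, not ELKI's Java API, and the function names and sample points are made up. Each point is scored by the distance to its k-th nearest neighbour, one simple outlier score; the brute-force neighbour scan is the part an index structure would speed up.

```python
import math


def euclidean(p, q):
    return math.dist(p, q)


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def knn_outlier_scores(points, k=2, distance=euclidean):
    """Score each point by the distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        # Brute-force scan over all other points; a spatial index
        # could answer this nearest-neighbour query faster.
        dists = sorted(distance(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Made-up data: a tight cluster near the origin and one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]

for distance in (euclidean, manhattan):
    scores = knn_outlier_scores(points, k=2, distance=distance)
    outlier = max(range(len(points)), key=scores.__getitem__)
    print(distance.__name__, points[outlier])
```

    Swapping `euclidean` for `manhattan` changes the scores but leaves the scoring code untouched, which is the kind of reuse a modular design aims for.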


  • Widestage

    Free Mac Windows Linux Self-Hosted Website

    Lightweight open source reporting tool that lets users create their own HTML reports and dashboards by dragging and dropping elements, powered by a semantic layer.


  • Web-based data mining

  • InfraNodus

    Free Mac Windows Linux Web Website

    InfraNodus can visualize research notes, ideas, texts, and even Google search results on a certain topic as a text network. The words are the nodes and their co-occurrences are the connections between them. It is also possible to build network graphs using hashtags, which makes it a very quick mind map creation tool.

    The resulting text network can then be used to identify the main topics and how they are connected, to get a good overview and a different perspective on any text. The software will also analyze the gaps in the existing knowledge, to help see how a specific research project or topic could be developed further.
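
    The "words as nodes, co-occurrences as connections" model can be sketched as follows. This is a simplified illustration, not InfraNodus's actual algorithm: the tokenizer, the tiny stop-word list, the window size and the sample text are all assumptions.

```python
import re
from collections import Counter

# Illustrative stop-word list only; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are"}


def cooccurrence_graph(text, window=2):
    """Map unordered word pairs to how often they occur within `window` words.

    Sentence boundaries are ignored in this sketch.
    """
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    edges = Counter()
    for i, word in enumerate(words):
        for other in words[i + 1 : i + window]:
            if other != word:
                edges[tuple(sorted((word, other)))] += 1
    return edges


def weighted_degree(edges):
    """Sum of edge weights per node: a crude proxy for how central a word is."""
    totals = Counter()
    for (a, b), weight in edges.items():
        totals[a] += weight
        totals[b] += weight
    return totals


text = "Data mining finds patterns in data. Text mining finds patterns in text."
edges = cooccurrence_graph(text)
print(edges.most_common(3))
print(weighted_degree(edges).most_common(3))
```

    Words with a high weighted degree are candidates for the main topics, while pairs of words with no edge between them hint at connections the text does not make.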


  • Apache Mahout

    Free Linux Web Website

    Apache Mahout is an Apache project to produce free implementations of distributed or otherwise scalable machine learning algorithms on the Hadoop platform. Mahout is a work in progress; the number of implemented algorithms has grown quickly, but various algorithms are still missing.

    While Mahout's core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm, it does not restrict contributions to Hadoop-based implementations. Contributions that run on a single node or on a non-Hadoop cluster are also welcome. For example, the 'Taste' collaborative-filtering recommender component of Mahout was originally a separate project and can run stand-alone without Hadoop. Integration with initiatives such as Giraph, a Pregel-like graph processing infrastructure that runs on Hadoop, is actively under discussion. (Pregel is Google's internal graph processing platform, described in an ACM paper.)
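
    The map/reduce paradigm mentioned above can be imitated in a single process. The sketch below is plain Python and uses neither Hadoop's nor Mahout's API; the function names and the sample records are made up. It counts how often two items appear together in the same user's history, the kind of co-occurrence statistic an item-based recommender can build on.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(record):
    """Emit ((item_a, item_b), 1) for every pair of items one user touched."""
    _user, items = record
    for pair in combinations(sorted(set(items)), 2):
        yield pair, 1


def shuffle(mapped):
    """Group emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)


def run_job(records):
    mapped = (kv for record in records for kv in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


# Made-up interaction data: (user, items the user interacted with).
records = [
    ("alice", ["book", "lamp", "pen"]),
    ("bob", ["book", "pen"]),
    ("carol", ["lamp", "pen"]),
]
counts = run_job(records)
print(counts)
```

    Each call to `map_phase` looks at one record in isolation and each call to `reduce_phase` looks at one key in isolation, which is what allows a real framework to spread both phases across many machines.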


  • ggraptR

    Free Windows Linux Web Self-Hosted Website

    ggraptR is an open source R package providing a GUI for visualization. It is based on principles of visualization analysis by Tamara Munzner, and also acts as a wrapper for functionality implemented in the grammar of graphics for R, ggplot2.


  • Phantombuster

    Freemium Web Website

    Boost your marketing with our cloud APIs. Our bots automate all the major websites (LinkedIn, Twitter, Facebook, Instagram...). Productivity increase guaranteed! Alternatively, you can create your own APIs thanks to our developer platform.


  • REPODS

    Freemium Mac Windows Linux Web Chrome OS ... TIBCO MDM SnowFall Tableau QlikView Website

    REPODS is an online data warehouse service for managing & analyzing data histories in data pods. Data can be imported via various interfaces. IoT devices can also stream data directly to a data pod for cross-analysis with other data warehouse data.

