xenmaster's data-science tools

Data science is more about learning concepts than software. These concepts include statistics, linear algebra, and A/B testing. That said, the following tools are the most commonly used in practice.

List by Alex Ruiz
  1. Python

    Python is an interpreted, interactive, object-oriented, extensible programming language. It provides an extraordinary combination of clarity and versatility, and is free and comprehensively ported.

    Cost / License

    • Free
    • Open Source

    Platforms

    • Mac
    • Windows
    • Linux
    • Symbian S60
    • BSD
    • AROS
    • Haiku
    • AmigaOS
    • OpenSolaris
    • MorphOS
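To make the "object-oriented" and "interactive" parts of that description a little more concrete, here is a minimal sketch using only the standard library. The `Dataset` class, its methods, and the sample values are hypothetical names and data invented for this illustration; they do not come from the list above.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Dataset:
    """Hypothetical container for a named list of numeric observations."""

    name: str
    values: list = field(default_factory=list)

    def add(self, value: float) -> None:
        """Append one observation."""
        self.values.append(value)

    def average(self) -> float:
        """Return the arithmetic mean of the stored observations."""
        return mean(self.values)


# Illustrative data only, not taken from any real source.
heights = Dataset("heights_cm")
for h in (160.0, 170.0, 180.0):
    heights.add(h)

print(heights.name, heights.average())
```

The same lines can also be typed one at a time into the interactive interpreter, which is a common way to explore data step by step.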
  2. Basic Computer Skills

    Working in the terminal is a valuable skill for anyone in the computer science field, and no programmer's experience is complete without learning the version control power of Git!

  3. PowerShell (including Windows PowerShell and PowerShell Core) is a task automation and configuration management framework from Microsoft, consisting of a command-line shell and associated scripting language built on .NET (the .NET Framework for Windows PowerShell, .NET Core for PowerShell Core).

    Cost / License

    • Free
    • Open Source

    Platforms

    • Mac
    • Windows
    • Linux
    • Snapcraft
  4. Terminal

    Terminal (also referred to as Terminal.app) is a terminal emulator included in Apple's Mac OS X operating system. It originated in Mac OS X's predecessors, NEXTSTEP and OPENSTEP, and allows the user to interact with the computer through a command line interface.

    Cost / License

    • Free
    • Proprietary

    Platforms

    • Mac
  5. GNOME Terminal is a terminal emulator for the GNOME desktop environment written by Havoc Pennington and others. Terminal emulators allow users to execute commands using a real UNIX shell while remaining on their graphical desktop.

    Cost / License

    • Free
    • Open Source

    Platforms

    • Linux
    • BSD
    • GNOME
  6. Git

    Git is a free & open source, distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

    Cost / License

    • Free
    • Open Source

    Platforms

    • Mac
    • Windows
    • Linux
    • Android
    • iPhone
    • Chrome OS
    • Android Tablet
    • BSD
    • Linux Mobile
    • Haiku
  7. Programming

    Python and R are the most commonly used programming languages. I've included additional IDEs (Integrated Development Environments) as well, two for Python (one with a desktop GUI, the other for the terminal) and one for R.

  8. Jupyter

    The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.

    Cost / License

    • Free
    • Open Source

    Platforms

    • Mac
    • Windows
    • Linux
    • Online
    • Cloudron
  9. IPython

    IPython provides a versatile architecture for interactive computing, featuring a dynamic shell, data visualization, and parallel computing capabilities.

    Cost / License

    • Free
    • Open Source

    Platforms

    • Mac
    • Windows
    • Linux
    • Python
    Time-saving features: TAB completion, automatic parentheses and quotes for function calls, and information about defined variables.
  10. R is a free software environment for statistical computing and graphics. It is a GNU project similar to the S language and environment, which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered...

    Cost / License

    • Free
    • Open Source

    Platforms

    • Mac
    • Windows
    • Linux
    • BSD
  11. RStudio

    RStudio™ is an integrated development environment (IDE) for R. RStudio combines an intuitive user interface with powerful coding tools to help you get the most out of R.

    Cost / License

    • Free
    • Open Source

    Platforms

    • Mac
    • Windows
    • Linux
    • Xfce
  12. Data Visualization and Manipulation

    Matplotlib is a basic data visualization tool. SciPy is a great choice for manipulating data, and TensorFlow is a fantastic platform if you are interested in machine learning (especially when used alongside Keras and scikit-learn).

  13. Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible.

    Cost / License

    • Free
    • Open Source

    Platforms

    • Mac
    • Windows
    • Linux
    • Python
    • BSD
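As a rough sketch of the "easy things easy" claim, the snippet below draws a single line chart. It assumes Matplotlib is installed; the `days` and `signups` values are made-up sample data, and the choice of the non-interactive `Agg` backend is only so the example runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Made-up sample data for illustration only.
days = [1, 2, 3, 4, 5]
signups = [12, 18, 15, 24, 30]

fig, ax = plt.subplots()
ax.plot(days, signups, marker="o", label="signups")
ax.set_xlabel("Day")
ax.set_ylabel("Signups")
ax.set_title("Daily signups (sample data)")
ax.legend()

# Render to an in-memory PNG; pass a filename instead to write a file.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```

In a Jupyter notebook the figure would normally be shown inline instead of being saved, so the backend line and the buffer can be dropped there.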
  14. SciPy

    SciPy is a collection of mathematical algorithms and convenience functions built on NumPy. It adds significant power to Python by providing the user with high-level commands and classes for manipulating and visualizing data.

    Cost / License

    • Free
    • Open Source

    Platforms

    • Mac
    • Windows
    • Linux
    • BSD
    • Python
  15. TensorFlow is an open source software library for machine learning in various kinds of perceptual and language understanding tasks.

    Cost / License

    • Free
    • Open Source

    Platforms

    • Mac
    • Linux
    • Windows
    • Python
  16. Databases

    Below are the most commonly used databases for raw data processing power. I've included a relational database and a NoSQL database for handling document-driven data, which is particularly useful in the big-data space!
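
The difference between the two data models can be sketched with the standard library alone. This is only an illustration under stated assumptions: `sqlite3` stands in for a relational engine (it is not PostgreSQL), and a list of plain dictionaries stands in for a document collection (it is not MongoDB). The table, field names, and records are made up.

```python
import sqlite3

# Relational model: a fixed schema, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO users (name, city) VALUES (?, ?)",
    [("alice", "London"), ("bob", "Helsinki"), ("carol", "London")],
)
london_rows = conn.execute(
    "SELECT name FROM users WHERE city = ? ORDER BY name", ("London",)
).fetchall()
conn.close()

# Document model: each record is a self-describing, possibly nested document,
# and documents in one collection need not share the same fields.
documents = [
    {"name": "alice", "address": {"city": "London"}, "tags": ["math"]},
    {"name": "bob", "address": {"city": "Helsinki"}},
    {"name": "carol", "address": {"city": "London"}, "team": "analytics"},
]
london_docs = [
    doc["name"] for doc in documents if doc["address"]["city"] == "London"
]

print(london_rows)
print(london_docs)
```

Both queries find the same two users; what differs is where the structure lives, in the table schema for the relational case and in each record for the document case.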

  17. PostgreSQL is a powerful, open source object-relational database system with over 35 years of active development that has earned it a strong reputation for reliability, feature robustness, and performance.

    Cost / License

    • Free
    • Open Source

    Platforms

    • Mac
    • Windows
    • Linux
    • BSD
    • Self-Hosted
  18. pgAdmin

    pgAdmin is the most popular and feature-rich open source administration and development platform for PostgreSQL, the most advanced open source database in the world.

    Cost / License

    • Free
    • Open Source

    Platforms

    • Mac
    • Windows
    • Linux
    • BSD
    • Self-Hosted
    • Flathub
    • Flatpak
  19. MongoDB

    MongoDB is a document database with the scalability and flexibility that you want and the querying and indexing that you need.

    Cost / License

    • Freemium
    • Proprietary

    Platforms

    • Mac
    • Windows
    • Linux
    • Online
    • BSD
  20. MongoDB Compass is the GUI for MongoDB. Visually explore your data. Run ad hoc queries in seconds. Interact with your data with full CRUD. View and optimize your query performance. Compass empowers you to make smarter decisions about indexing, document validation, etc.

    Cost / License

    • Free
    • Proprietary

    Platforms

    • Mac
    • Windows
    • Linux
  21. Business Data

    I've seen the following used frequently for data visualization on the business side. Pick one or more!

  22. D3.js

    D3.js is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG, and CSS.

    Cost / License

    • Free
    • Open Source

    Platforms

    • Online
    • Self-Hosted
  23. Microsoft Power BI enhances decision-making by transforming diverse data sources into engaging, interactive visual insights, with tools such as Power BI Desktop and a web service.

    Cost / License

    • Freemium
    • Proprietary

    Platforms

    • Windows
    • Online
    • Android
    • iPhone
    • Android Tablet
    • iPad
    • Microsoft Power BI
    • Microsoft 365
    • Microsoft Excel
  24. Tableau

    Tableau helps the world’s largest organizations unleash the power of their most valuable assets: their data and their people.

    Cost / License

    • Paid
    • Proprietary

    Platforms

    • Mac
    • Windows
    • Online
    • Self-Hosted
  25. Microsoft Excel, part of Microsoft 365 Copilot, is Microsoft's spreadsheet application. With the Microsoft Office Fluent user interface, rich data visualization, pivot table views, and professional-looking charts are easier to create...

    Cost / License

    • Paid
    • Proprietary

    Platforms

    • Mac
    • Windows
    • Android
    • iPhone
    • Android Tablet
    • Windows Phone
    • iPad
  26. Distributed Cloud Computing

    Some people prefer using the cloud to do their dirty data-processing work. Pick one and go for it!

  27. Apache Hadoop is an open source software framework, licensed under the Apache v2 license, that supports data-intensive distributed applications. It enables applications to work with thousands of computationally independent computers and petabytes of data.

    Cost / License

    • Free
    • Open Source

    Platforms

    • Mac
    • Windows
    • Linux
  28. The Azure cloud platform is more than 200 products and cloud services designed to help you bring new solutions to life—to solve today’s challenges and create the future. Build, run, and manage applications across multiple clouds, on-premises, and at the edge, with the tools and...

    Cost / License

    • Paid
    • Proprietary

    Platforms

    • Online
    • Android
    • Android Tablet
    • iPhone
    • iPad
  29. Amazon Machine Learning allows developers to use machine learning. It provides visualization tools and wizards that guide you in the process of creating machine learning (ML) models. It makes it easy to obtain predictions using simple APIs.

    Cost / License

    • Paid
    • Proprietary

    Platforms

    • Online
    • Amazon Web Services