Basic Tutorials Part 0
Overview:
We will use a number of tools and libraries in this course. This post gives an overview of them and explains why we chose each one. If you have a different viewpoint or want to question us on any of these points, please leave a comment below and we’d love to chat about it.
That said, we will probably stick with these choices because, after extensive exploration, we have concluded that they are the best options for us. But we’re open to suggestions for improvement!
Git and GitHub
A version control system (or VCS) provides an automatic way to track changes in software projects, giving creators the power to view previous versions of files and directories, develop speculative features without disrupting the main development, securely back up the project and its history, and collaborate easily and conveniently with others. Using version control also makes deploying production websites and web applications much easier.
At some point you will probably break your project and want to go back to a version that worked. Git to the rescue!
GitHub: a central service for hosting Git repositories.
It’s a place to showcase your repositories and, at the same time, a place to host code so that contributors can add to the project.
Terminal
Many of us here are beginners and might pick a GUI over a command line, but it turns out that the command line can be the better choice. Some tasks are simply faster from the command line, and when you use a cloud service you will need the command line to perform actions on your machine.
Most of the commands are run in Bash, which is a shell: a program that reads the commands you type and runs them.
A terminal is the interface you use to access the Bash environment on your machine.
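To make the shell idea concrete, here is a minimal Python sketch that hands a command string to Bash and reads back what it prints, which is roughly what happens when you type into a terminal. The helper name `run_in_bash` is made up for this illustration, and the sketch assumes a `bash` executable is available on your PATH.

```python
import shutil
import subprocess


def run_in_bash(command):
    """Run a command string in Bash and return what it printed.

    Hypothetical helper for illustration; assumes `bash` is on the PATH.
    """
    bash = shutil.which("bash")
    if bash is None:
        raise RuntimeError("bash was not found on this machine")
    result = subprocess.run(
        [bash, "-c", command],  # -c tells Bash to run the given string
        capture_output=True,
        text=True,
        check=True,             # raise an error if the command fails
    )
    return result.stdout


# Only try the example where Bash is actually installed.
if shutil.which("bash"):
    print(run_in_bash("echo hello from bash"))
```

In day-to-day work you would type `echo hello from bash` directly into the terminal; the sketch only shows that the shell is an ordinary program that receives your command and runs it.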
Anaconda environment
Anaconda is another open source project, and it is widely used in the data science world. We will chiefly be using Conda and Jupyter Notebooks.
Conda provides package, dependency and environment management for any language: Python, R, Ruby, Lua, Scala, Java, JavaScript, C/C++, FORTRAN.
Conda is an open source package management system and environment management system that runs on Windows, macOS and Linux.
Conda quickly installs, runs and updates packages and their dependencies. Conda easily creates, saves, loads and switches between environments on your local computer. It was created for Python programs, but it can package and distribute software for any language.
While creating a project, you will need various libraries and dependencies. Some projects require a set of libraries that work only with particular versions of other libraries, yet at the same time you might want to work on other projects with different requirements.
To help with this, Conda creates separate ‘environments’. An environment X with its set of libraries is independent of, and unaffected by, another environment Y. Thus you can work on your projects without worrying about ‘breaking’ the requirements every time you install something. When you do install something, Conda keeps the environment consistent by adjusting the versions of other libraries where needed.
Another point worth mentioning: open source projects share their source code, which often has to be compiled before you can use it. Compiling a huge library can be tedious and time-consuming. Conda provides precompiled packages, so whenever you need something new you just download it and can dive right in!
A detailed tutorial on using Conda and Jupyter notebooks will be shared in this series.
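One way to see that environments are separate is to ask Python where it is running from. The following is a small sketch, not an official Conda tool: the function name `describe_environment` is made up for this post, and it relies on the `CONDA_DEFAULT_ENV` variable that Conda sets when an environment is activated (it will be `None` outside Conda).

```python
import os
import sys


def describe_environment():
    """Return basic facts about the environment running this script.

    Hypothetical helper for illustration only.
    """
    return {
        "interpreter": sys.executable,  # path to the python binary in use
        "prefix": sys.prefix,           # root folder of the environment
        # Set by Conda on activation; None if no Conda env is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

Running the same script from two different environments should print two different interpreter paths, which is why installing a library in one does not touch the other.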
Python
Python is used heavily by everyone from deep learning practitioners to cutting-edge researchers.
It’s an open source language that has gained tremendous popularity in recent years.
This post serves as a basic overview, hence many technical details will be skipped here.
The reasons why we want to use this are:
- Huge community
- Interface with Low Level Languages
- Large number of Libraries available.
This series will feature an introduction to Python programming and to using its two most important libraries for Data Science:
- NumPy
- Pandas
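As a small taste of what these two libraries do, here is a minimal sketch. It assumes both are installed (they ship with Anaconda), and the scores and student labels are made-up values used only for illustration.

```python
import numpy as np
import pandas as pd

# NumPy: an n-dimensional array with fast element-wise arithmetic.
scores = np.array([72.0, 85.0, 90.0, 64.0])  # made-up example values
scaled = scores / 100.0                       # applied to every element, no loop
mean_score = scores.mean()

# Pandas: labelled, tabular data built on top of NumPy arrays.
df = pd.DataFrame({
    "student": ["a", "b", "c", "d"],  # hypothetical labels
    "score": scores,
})
passed = df[df["score"] >= 70]        # keep only rows matching a condition

print(scaled)
print(mean_score)
print(passed)
```

NumPy handles the raw numbers, while Pandas adds column names and convenient filtering on top. Both will be covered properly later in the series.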
Leave us a comment below if there is anything you want to discuss.
Subscribe to my weekly-ish [Newsletter](https://tinyletter.com/sanyambhutani/) for a set of curated Deep Learning reads