Kaggle


Course: How to win a data science competition: learn from top Kagglers

Instructors

Dmitry Ulyanov, Alexander Guschin, Mikhail Trofimov, Dmitry Altukhov, Marios Michailidis* National Research University Higher School of Economics

Course Link
My Notes


Kaggle perfect scores

The public leaderboard is computed on a fraction of the test set. The private leaderboard, which decides the final ranking, is computed on the rest of the test set. If you overfit to the public portion (for example, by tuning against leaderboard feedback), you can score very well on the public leaderboard and very badly on the private one.
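To make the gap concrete, here is a small simulation sketch. Everything in it is an assumption made for illustration: the 30% public fraction, the random binary labels, and the helper names `split_test_set` and `accuracy` are not taken from Kaggle. It builds a submission that is perfect on the public rows and only guesses on the rest.

```python
import random


def split_test_set(n_rows, public_fraction=0.3, seed=0):
    """Randomly split row indices into a public part and a private part."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_public = round(n_rows * public_fraction)
    return indices[:n_public], indices[n_public:]


def accuracy(y_true, y_pred, rows):
    """Fraction of the given rows that were predicted correctly."""
    return sum(y_true[i] == y_pred[i] for i in rows) / len(rows)


rng = random.Random(42)
n_rows = 1000
y_true = [rng.randint(0, 1) for _ in range(n_rows)]  # hypothetical test labels
public_rows, private_rows = split_test_set(n_rows)

# A submission that has "memorized" the public rows (for example by probing
# the leaderboard many times) but only guesses everywhere else.
public_set = set(public_rows)
y_pred = [y_true[i] if i in public_set else rng.randint(0, 1) for i in range(n_rows)]

public_score = accuracy(y_true, y_pred, public_rows)
private_score = accuracy(y_true, y_pred, private_rows)
print(f"public:  {public_score:.3f}")   # perfect by construction
print(f"private: {private_score:.3f}")  # expected to be near chance level
```

The public score is perfect because the predictions were built from the public labels, while the private score only reflects how the submission does on rows it never got feedback on.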


How to download data using Kaggle API?

# 1) Install kaggle package
pip install --user kaggle
# Note: a plain "pip install kaggle" (without --user) can cause problems

# 2) Download and save your kaggle API token
 # Go to https://www.kaggle.com/username/account ==> click 'Create API Token'.
 # This triggers the download of kaggle.json, a file containing your API credentials.

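 # Place this file in the location ~/.kaggle/kaggle.json

 # Go to the directory where kaggle.json was saved and copy it.
 # Do not use sudo here: a root-owned copy cannot be read by your own user.
  mkdir -p ~/.kaggle
  cp kaggle.json ~/.kaggle/kaggle.json

 # Optional: also copy it next to the installed package. This is normally not
 # needed, since the client reads ~/.kaggle/kaggle.json.
  cp kaggle.json /Users/j/.local/lib/python3.7/site-packages/kaggle/kaggle.json

 # Ensure other users don't have access to the key
chmod 600 ~/.kaggle/kaggle.json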

# 3) Download competition data in your project

 # List files in a competition
kaggle competitions files quora-insincere-questions-classification

 # Download all files in a competition
kaggle competitions download -c quora-insincere-questions-classification

 # Download your notebook mynotebook.ipynb from the kaggle kernel
kaggle kernels pull username/mynotebook


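# 4) Add kaggle to your path
 # Open .zshrc
 vi ~/.zshrc

 # Add the alias line below, then save and quit.
 # The alias should point at the kaggle executable (usually <user base>/bin/kaggle;
 # "python -m site --user-base" prints the user base), not at the
 # site-packages/kaggle package directory.
 # alias kaggle="/Users/j/.local/bin/kaggle"

 # Reload the configuration (or start a new shell with zsh)
 source ~/.zshrc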
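If the `kaggle` command complains about credentials after these steps, a quick check of the token file can help narrow it down. The snippet below is only a sketch: `check_kaggle_token` is a hypothetical helper, not part of the kaggle package, and it assumes the token is a JSON object with `username` and `key` fields stored at `~/.kaggle/kaggle.json`.

```python
import json
import stat
from pathlib import Path


def check_kaggle_token(path=None):
    """Return a list of problems found with the token file (empty if none)."""
    path = Path(path) if path else Path.home() / ".kaggle" / "kaggle.json"
    if not path.is_file():
        return [f"{path} does not exist"]

    problems = []
    # Any permission bit for group/others means the key is not private.
    mode = stat.S_IMODE(path.stat().st_mode)
    if mode & 0o077:
        problems.append(f"{path} has mode {oct(mode)}; run chmod 600 on it")

    try:
        token = json.loads(path.read_text())
    except (OSError, json.JSONDecodeError) as exc:
        # OSError covers e.g. a root-owned file created with sudo cp.
        return problems + [f"cannot read {path}: {exc}"]

    if not isinstance(token, dict):
        return problems + [f"{path} does not contain a JSON object"]
    for field in ("username", "key"):
        if not token.get(field):
            problems.append(f"{path} is missing the '{field}' field")
    return problems


if __name__ == "__main__":
    for problem in check_kaggle_token():
        print(problem)
```

An empty result means the file exists, is private to your user, and has the two expected fields. It does not verify that the key is still valid on the Kaggle side.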

How to upload functions in Kaggle kernel?

# 1. Upload your awesome_function.py file as a Dataset in your Notebook/Script kernel

In your running Notebook/Script kernel, click 'Add Dataset' => Select 'Dataset' => Click 'Upload a Dataset' => Upload your awesome_function.py file from local/GitHub => Enter a title for your dataset, e.g. 'kutil' (this will in fact create a subfolder 'kutil') => Click 'Create'


# 2. Your Notebook/Script kernel will refresh.
awesome_function.py is now available in '../input/kutil/'


# 3. Copy the awesome_function.py to kaggle working directory '/kaggle/working'
from shutil import copyfile

copyfile(src="../input/kutil/awesome_function.py", dst="/kaggle/working/awesome_function.py")


# 4. Import functions from awesome_function.py to your Notebook/Script kernel
from awesome_function import *
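An alternative to copying the file is to make the dataset folder importable directly. The sketch below assumes the same '../input/kutil' folder as above; `import_from_directory` is a hypothetical helper name, and the approach is a common Python pattern rather than anything Kaggle-specific.

```python
import importlib
import sys
from pathlib import Path


def import_from_directory(module_name, directory):
    """Add a directory to sys.path (once) and import a module from it."""
    directory = str(Path(directory).resolve())
    if directory not in sys.path:
        sys.path.append(directory)
    # Make sure the import system notices files added after startup.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


dataset_dir = Path("../input/kutil")  # dataset folder from the steps above
if (dataset_dir / "awesome_function.py").exists():
    awesome_function = import_from_directory("awesome_function", dataset_dir)
```

This keeps a single copy of the file and gives you a module object (`awesome_function.some_name`) instead of a star import, which makes it easier to see where each function comes from.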

References