Building a python package: Reflections from an R user

In this post I note down some of my experiences with making my first python package, specifically highlighting some of the similarities and differences between R and python when it comes to package building. My hope is that R users looking to expand their pythonic horizons might find something useful!

Harry Fisher https://hfshr.netlify.app/
08-02-2020

In an attempt to improve my python skills, I thought it would be fun to try and create a simple python package. To ease myself into the world of python packages, I set out to implement a very simple version of my bitmexr R package - which I imaginatively called bitmexpy. For context, bitmexr is a simple API wrapper for www.bitmex.com that supports obtaining historical trade data as well as placing trades on the exchange. If interested, you can see bitmexr in action here. For bitmexpy, I’m just going to stick to implementing the historical trade element, and leave the other functions for another day!

Package code

Central to any package is the code that facilitates what the package does. In R this is always included in the R subdirectory of the package directory. For example bitmexr has the following files:


dir_tree("~/Downloads/ProjectR/bitmexr/R/")

~/Downloads/ProjectR/bitmexr/R/
├── bitmexr.R
├── bucket_trades.R
├── delete_orders.R
├── edit_order.R
├── execute_order.R
├── general.R
├── map_bucket_trades.R
├── map_trades.R
├── trades.R
└── utils.R

This is fairly similar in python, but there are a few differences.

For example, here is the directory for the package code for bitmexpy:


dir_tree("~/Downloads/Python/bitmexpy/bitmexpy/")

~/Downloads/Python/bitmexpy/bitmexpy/
├── __init__.py
├── bucket.py
├── symbols.py
└── trades.py

Each of these files is a module within the package and can be imported separately if desired. For example, after installing the package (I’ll cover that later in the post) I could use:


from bitmexpy import bucket

This loads only the bucket module from bitmexpy. To use one of its functions, I could then run:


bucket.bucket_trades().iloc[0:5, 0:6]

                  timestamp  symbol     open     high      low    close
0 2020-08-02 00:00:00+00:00  XBTUSD  11361.0  11908.5  11246.0  11827.5
1 2020-08-01 00:00:00+00:00  XBTUSD  11127.5  11470.0  10970.5  11361.0
2 2020-07-31 00:00:00+00:00  XBTUSD  11108.5  11198.0  10845.5  11127.5
3 2020-07-30 00:00:00+00:00  XBTUSD  10941.5  11362.5  10865.0  11108.5
4 2020-07-29 00:00:00+00:00  XBTUSD  11063.5  11270.0  10603.0  10941.5

These modules can also import each other. For example, both bucket.py and trades.py have the line:

from bitmexpy import symbols

at the start of the file.
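To see this import behaviour without installing anything, here is a minimal sketch that writes a throwaway package to a temporary directory and imports one module from it. The package name demo_pkg and the function first_symbol are hypothetical and only mirror the layout of bitmexpy.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package called demo_pkg (hypothetical name) in a temp dir.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")  # marks the directory as a package
(pkg / "symbols.py").write_text('AS = ["XBTUSD", "ETHUSD"]\n')
(pkg / "bucket.py").write_text(
    "from demo_pkg import symbols\n"  # one module importing another
    "\n"
    "def first_symbol():\n"
    "    return symbols.AS[0]\n"
)

sys.path.insert(0, str(root))  # make demo_pkg importable
importlib.invalidate_caches()

from demo_pkg import bucket  # import a single module from the package

print(bucket.first_symbol())  # XBTUSD
```

Importing bucket also runs symbols.py, because bucket.py imports it on its first line.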

Here is the code for symbols.py as an example:


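import pandas as pd
import requests


def available_symbols():
    """Return a list of the symbols available on BitMEX."""
    response = requests.get("https://www.bitmex.com/api/bitcoincharts")
    response.raise_for_status()  # fail early on an HTTP error
    # The first row of the parsed response holds the symbol names
    result = pd.read_json(response.content, orient="index").iloc[0]
    return result.tolist()


# Fetched once, when the module is first imported
AS = available_symbols()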

AS is a list of all available symbols on the BitMEX exchange. The other functions in the package then refer to symbols.AS as a simple error handling technique, so if you specify the wrong symbol like:


from bitmexpy import bucket

bucket.bucket_trades(symbol="TESLA")

Symbol must be one of:
XBTUSD
ETHUSD
XRPUSD
XBTU20
XBTZ20
ETHU20
LTCU20
XRPU20
BCHU20
ADAU20
EOSU20
TRXU20
ETHUSDU20
BCHUSD
LTCUSD

A list of all valid symbols is given. If interested, you can check out the rest of the code in the package git repo for bitmexpy: https://github.com/hfshr/bitmexpy
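The symbol check described above might look something like the following minimal sketch. It is not the code from bitmexpy: check_symbol is a hypothetical helper, and VALID_SYMBOLS is a hard-coded stand-in for symbols.AS so the example runs without a network request.

```python
# Hard-coded stand-in for symbols.AS (an assumption, to avoid a network call).
VALID_SYMBOLS = ["XBTUSD", "ETHUSD", "XRPUSD"]


def check_symbol(symbol, valid=VALID_SYMBOLS):
    """Return `symbol` unchanged, or raise a ValueError listing the valid ones."""
    if symbol not in valid:
        raise ValueError("Symbol must be one of:\n" + "\n".join(valid))
    return symbol


try:
    check_symbol("TESLA")
except ValueError as err:
    print(err)
```

Raising an exception rather than only printing a message is a design choice in this sketch; it stops the function before any request is made with a bad symbol.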

Package metadata

For the package metadata, I found there were quite a few similarities with R packages. One of the files you can expect to find in a python package is setup.py, shown here for bitmexpy:


from setuptools import setup, find_packages

with open("README.rst", "r") as fh:
    long_description = fh.read()

setup(
    name="bitmexpy", 
    version="0.0.3",
    author="Harry Fisher",
    author_email="harryfisher21@gmail.com",
    description="python client for BitMEX's API",
    long_description=long_description,
    long_description_content_type="text/x-rst",  # matches the README.rst read above
    url="https://github.com/hfshr/bitmexpy",
    packages=find_packages(),
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
    ],
    install_requires=[
        'pandas',
        'requests',
        'numpy',
        # datetime ships with python itself, so it is not listed here
    ],
    python_requires=">=3.6",
)

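As you can see from the example above, a setup.py file lets you specify the package name, version, author details, description, dependencies (install_requires) and the python versions the package supports.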

setup.py is also important for building/installing your package as we’ll see in the next section.

Distribution

Aside from creating a git repository for your package, the methods of distributing packages in R and python are somewhat different.

With a setup.py file, you can install the package locally by running from the package directory:
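A minimal sketch of one common way to do this, assuming pip is available and the script is run from the directory that contains setup.py. The helper name install_local is hypothetical; the equivalent terminal command would be `pip install .`.

```python
import subprocess
import sys

# Equivalent to running `python -m pip install .` in a terminal.
# Assumption: the current working directory contains setup.py.
INSTALL_CMD = [sys.executable, "-m", "pip", "install", "."]


def install_local():
    """Install the package in the current directory into the active environment."""
    subprocess.run(INSTALL_CMD, check=True)
```

Adding the `-e` flag after `install` gives an editable install, so changes to the source are picked up without reinstalling.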

While you can now use the package locally, this isn’t much help for sharing your package with others.

In order to share your package with others, the python package index (pypi) is a good choice. Conveniently, pypi has a test instance of the main index, allowing you to upload your small test packages without cluttering the main index.

To prepare for uploading your package to pypi, you first need an account - which is easy enough to set up here: https://test.pypi.org/

Next, in the package directory run:
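One way to build the distribution files, sketched under the assumption that the PyPA `build` package is installed (`pip install build`). The older `python setup.py sdist bdist_wheel` invocation produces the same two kinds of file. The helper name build_dists is hypothetical.

```python
import subprocess
import sys

# Equivalent to running `python -m build` in the package directory.
# It writes a source archive (.tar.gz) and a wheel (.whl) into dist/.
BUILD_CMD = [sys.executable, "-m", "build"]


def build_dists():
    """Build the source archive and wheel for the package in the current directory."""
    subprocess.run(BUILD_CMD, check=True)
```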

This will create a dist/ directory with files like:


fs::dir_tree("~/Downloads/Python/bitmexpy/dist/")

~/Downloads/Python/bitmexpy/dist/
├── bitmexpy-0.0.5-py3-none-any.whl
└── bitmexpy-0.0.5.tar.gz

Now you can upload these files to pypi using twine.
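A sketch of the upload step, assuming twine is installed and that the target is the test index. `--repository testpypi` is twine's built-in alias for the test instance; the helper name upload_cmd is hypothetical.

```python
import glob
import sys


def upload_cmd(dist_dir="dist"):
    """Return the twine command that uploads every file in dist_dir to the test index."""
    files = sorted(glob.glob(f"{dist_dir}/*"))  # the .whl and .tar.gz files
    return [sys.executable, "-m", "twine", "upload",
            "--repository", "testpypi", *files]


# To actually upload: import subprocess; subprocess.run(upload_cmd(), check=True)
```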

You will be prompted for your username and password for your testpypi account, and if all is successful you get a link to your uploaded package.

For example, you can find bitmexpy here: https://test.pypi.org/project/bitmexpy/0.0.5/

To install the package from pypi, you can run:
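A sketch of the install step, assuming the package was uploaded to the test index as above. pip's `--index-url` option points it at the test instance; `--extra-index-url` is added here as an assumption, since dependencies such as pandas may not be available on the test index. The helper name install_from_testpypi_cmd is hypothetical.

```python
import sys


def install_from_testpypi_cmd(package="bitmexpy"):
    """Return the pip command that installs `package` from the test index."""
    return [
        sys.executable, "-m", "pip", "install",
        "--index-url", "https://test.pypi.org/simple/",
        # Fall back to the main index for dependencies (an assumption).
        "--extra-index-url", "https://pypi.org/simple/",
        package,
    ]


# Show the command; pass the list to subprocess.run(..., check=True) to execute it.
print(" ".join(install_from_testpypi_cmd()[1:]))
```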

Note, however, that not all packages on pypi are maintained like packages on CRAN, and therefore the quality of packages may vary considerably…!

Summary

So that was a quick tour of an R user's attempt to understand the basics of python package building. There are certainly things I’ve missed here, and bitmexpy needs a lot more work to get it fully functional. However, python packages feel a lot less mysterious now and I have a better appreciation for how they work in general.

For further information, I found these resources useful when working on the package:

So definitely worth checking out if you’re interested.

Thanks for reading!

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/hfshr/distill_blog, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Fisher (2020, Aug. 2). Data, Code & Coffee: Building a python package: Reflections from an R user. Retrieved from https://hfshr.xyz/posts/2020-08-02-python-package/

BibTeX citation

@misc{fisher2020building,
  author = {Fisher, Harry},
  title = {Data, Code & Coffee: Building a python package: Reflections from an R user},
  url = {https://hfshr.xyz/posts/2020-08-02-python-package/},
  year = {2020}
}