Start to a Series
Welcome to the first article of my Python Deployment series! I'll be writing a series of articles on creating a robust deployment pipeline suited for a production Python application. By learning how to set up the infrastructure, which tools to use, and how to use them, you'll be able to commit to an existing project with more confidence that you won't break anything. While the series is focused on Python, the ideas behind it should also be a good starting point for setting this up in any other language. For a quick look ahead, here's the current plan for how the articles will be laid out:
- What is continuous integration and continuous deployment?
- Creating a FREE automated pipeline with GitHub Actions
- Adding unit testing to your Python deployment
- Adding linting to your Python deployment
- How to automatically format Python
- How to type-hint your Python project
- Continuous deployment using Google Cloud Platform (GCP)
Introduction
While manual deployments and locally run tests can help you ship quickly at the beginning of a project, they typically become cumbersome and risky as the codebase grows in complexity. With issues like untested code, non-isolated local environments, and code smells, a codebase that delivers updates manually can quickly become a headache. This is where continuous integration and continuous deployment come into play! Although they're separate concepts, they are usually talked about together, as it's rare to see one implemented without the other.
What is continuous integration and continuous deployment?
Typically referred to together as CI/CD, both are concepts about how new versions of code should be delivered to end users. Continuous integration covers the processes that verify new code meets the standards of the codebase. This can include unit or integration testing, code linting, formatting, and much more. Continuous deployment is the automatic or semi-automatic release of new code, which can include new features, bug fixes, etc. Continuous deployment typically runs only after all continuous integration checks have passed.
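To make the ordering concrete, here's a minimal sketch of "deploy only after every check passes". The names `run_unit_tests`, `run_linter`, and `run_pipeline` are hypothetical stand-ins I made up for illustration, not part of any real CI tool.

```python
from typing import Callable, List

# Hypothetical checks: each returns True when it passes.
def run_unit_tests() -> bool:
    return True  # stand-in for actually running a test suite

def run_linter() -> bool:
    return True  # stand-in for actually running a linter

def run_pipeline(checks: List[Callable[[], bool]], deploy: Callable[[], None]) -> bool:
    """Run every CI check, then deploy only if all of them passed."""
    # Run all checks (rather than stopping early) so every failure gets reported.
    results = [check() for check in checks]
    if all(results):
        deploy()
        return True
    return False

deployed = []  # records what was "released" in this toy example
run_pipeline([run_unit_tests, run_linter], lambda: deployed.append("v1"))
```

In a real setup, a hosted CI service does the orchestration for you, but the gating logic is the same idea: a failed check means the deploy step never runs.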
Continuous Integration
I like to think of continuous integration (CI) as:
The practice of merging new code into your project in a way that reduces the chance of breaking old functionality, creating bugs, or introducing code smell.
For my example, I'll be using unit testing to show how having a CI process can help. You might be wondering what that looks like in practice, so let's look through examples with and without continuous integration.
Without CI
In my early days of development, I would either commit directly to master (now main) or open an unchecked PR and hope for the best. While this allowed for quick development in the short term, it led to a world of headaches and even slower development speed as the code and/or organization grew in complexity.
Let's say I initially create a simple exponential function, which I later call elsewhere in the code without passing in power because I'm okay with the default value of 2, like so:
def exp(base: float, power: float = 2) -> float:
    return base**power

exp(4)  # 16
Now let's say I go back and think "Hm, I don't like that power has a default value, so I'm going to make it required" BUT I don't update all the calls of exp. We'll get a TypeError.
exp(4) # TypeError: exp() missing 1 required positional argument: 'power'
If this function were properly tested, a test would fail as soon as the function changed, which would tell us we're making a backwards-incompatible change. While this is an extremely simple example, you can see how much easier it'd be to cause these issues in a project with 10,000+ lines of code. Additionally, some code doesn't have the benefit of being fully encapsulated, so you have to be very careful when introducing backwards-incompatibility. Imagine if pandas started working differently in its next release without notifying the community. That wouldn't be good!
With CI
Now let's add a unit test and then break it! How fun! Before we start, a unit test is code that tests (duh) a unit (duh) of the code. What counts as a unit is subjective, but it's typically a function, class method, etc. So for this we'll make a test of exp:
def test_exp():
    assert exp(2, 2) == 4
    assert exp(2) == 4
Initially no issue! But once we remove the default argument, the second assertion raises a TypeError, the test fails, and we're alerted by our CI system.
Continuous Deployment
I like to think of continuous deployment (CD) as:
Delivering new versions of software through a fully automated process.
While this definition is more generic, CD is just as useful when you want to ship new versions of your program efficiently without all the hassle. It's less about defensive programming than CI, in the sense of stopping bugs, but it saves the headaches and monotony of manual deployments. A good example of this is deploying new versions of your Python package to PyPI.
Although typically not done for this reason, CD can have the benefit of lessening the chance of deploying the wrong version / branch of your code. With manual deployment, you may pull the wrong code or be on the wrong branch while an automated process will always grab from the same production source.
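As a rough sketch of that "always the same production source" idea, an automated deploy step can refuse to run from anywhere else. The `PRODUCTION_BRANCH` constant and `should_deploy` function are hypothetical names for illustration, and I'm assuming releases are cut from a branch called main.

```python
PRODUCTION_BRANCH = "main"  # assumption: releases always come from this branch

def should_deploy(branch: str, checks_passed: bool) -> bool:
    """Deploy only from the production branch, and only after CI has passed."""
    return branch == PRODUCTION_BRANCH and checks_passed

print(should_deploy("main", True))             # True
print(should_deploy("feature/new-exp", True))  # False: wrong branch
print(should_deploy("main", False))            # False: CI checks failed
```

A person deploying by hand has to remember both conditions every time, while the automated version checks them on every run.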
Summary
Thanks for reading! In the later articles, I'll dive into a specific subject or tool within CI/CD, explain the concepts, and show how to implement them! Stay tuned for some exciting stuff. In the meantime, check out my other recent articles about scalable and maintainable software.