blog post

Introduction

In Stackoverflows May 2023 survey, Rust was the most admired programming language. Also, in last Novembers results of Octoverse, the state of open source and ai, we see a tremendous growth of Rust. Over the last few years we have also noticed that Rust in Python is becoming more and more popular. That is because libraries have released new versions which are (partly) rewritten in Rust. You can think of Pydantic, where it’s core validation logic in Pydantic v2 was rewritten in Rust. For other libraries new tools were built as a faser alternative, such as Polars as an alternative for Pandas. Another example is Ruff developed by Astral. In this blog I’m going to talk about Ruff.

What is Ruff?

According to the documentation, Ruff is an extremely fast Python linter and code formatter, written in Rust.

In the benchmark you can see that it is indeed much faster then for example Flake8. Of course, this depends on the size of your codebase. The CPython repository is huge, and for more regular sized projects (~10k lines) the difference is much smaller, and sometimes I don’t even notice it. On their website, they also say:

Ruff can be used to replace Flake8 (plus dozens of plugins), Black , isort, pydocstyle , pyupgrade , autoflake and more, all while executing tens or hundreds of times faster than any individual tool.

This is what convinced me to experiment with Ruff. Although we only used Black and Flake8 in our projects, I do prefer to use only one tool for my linting and formatting tasks. This means less dependencies and you only have to configure one tool.

In this short blog I’m not going to demonstrate how Ruff works. Their documentation is quite extensive, with really good examples. So, I would only be copy-pasting their examples. Instead, I’ll will briefly talk about my first experiences with Ruff.

My experience

I started by reading their own tutorial and this is in fact everything you need to read to get started. It’s very brief, but it covers all the relevant topics. From installation to the basic configuration options, a couple of examples and an instruction to use Ruff in a pre-commit hook. Within less then 15 minutes I had my first setup with Ruff in one of my projects.

On the overview page of their website they claim to have a Drop-in parity with Flake8, isort and Black. In my first runs of ruff format, which is comparable to black and ruff check (in my case this this should replace flake8), I did notice some differences with running black and flake8. Nothing to worry about, this only means that Ruff ’s default setup was not completely equal to my setup of flake8 and black. All I had to do was configure my formatting and linting preferences in my pyproject.toml file (or a ruff.toml file if you prefer to have a separate file). Configuring Ruff turned out to be very easy. They use clear, distinctive sections in a toml file for the different purposes like general settings as line length, formatting, isort , linting and docstring styling.

For all my preferences I was able to find examples on their website, but in some cases that took some time. Examples and explanation of settings are a bit spread out over different chapters on their website The Ruff Linter, The Ruff Formatter, Configuring Ruff and Settings . For me, it was not immediately clear on which page to look for my solution. For example, in our team we agree to sort imports independently from their import style.

So, we prefer:

Instead of:

This is not the default in Ruff and it took a couple of minutes to find the correct setting in their documentation. (Which, by the way, is this setting ). Of course, maybe it’s just me, but I find the different sections I mentioned above a little bit confusing.

Besides this, I’m very happy about how everything works. Especially, the –fix option for the linter comes in very handy, since I used flake8 and there I had to make the adjustments manually. I’ve implemented it in all our projects with a pre-commit hook. The first couple of commits, I first ran ruff format . and ruff check . manually just to get some confidence in the changes it made related to formatting, and the linting issues it found. But, soon I was confident the formatter, and linter did exactly what I wanted it to do, and I simply rely on the pre-commit hook, where I run ruff check . –fix , so it also automatically fixes linting issues.

Conclusion

After working with Ruff for a couple of weeks, the only little negative feedback I have so far is about the documentation. Needless to say that I would definitely recommend using Ruff over a combination of other linters, and/or formatters. Personally, I don’t really care about it’s speed, but yes, it is faster. The projects I’m working on are not that big that it makes a real difference. I do care about the ease of use, and the fact that you only have one dependency which also means one tool to configure. Especially, in combination with a pre-commit hook, it makes your life much easier. Make sure you agree on all formatting and styling guidelines in your team, and apply the same configuration in all projects. Now, you don’t have to worry about the styling, and formatting of your projects anymore. I’m really looking forward to uv, a new tool for packaging developed by the same team, Astral.

Thoughtworks Technology Radar

Ruff appeared on the ****Thoughtworks Technology Radar ****for the first time in April ‘23 with the status Trial. Later, in September ‘23, it gained the status Adopt, meaning that they feel strongly that the industry should adopt this item. See Ruff on Technology Radar. I completely agree on this status development, Trial -> Adopt. Ruff is mature enough and has proven to be a very useful and reliable linter and formatter. I would definitely recommend using Ruff over a combination of flake8, isort and black. In the case your team only uses a linter like flake8 and the codebase is relatively small, I think I would stick with flake8 and not bother switching. Simple, because for small codebases there is almost no speed gain and you are just trading one dependency for another.

Related Articles

blog-post

CD4ML: Ability to reproduce

Introduction Barry is a data scientist working for a small data science and engineer consulting company based in …