What's happening at Kantai? A collection of news and random topics related to our work.

Happy New Year 2022 with two new releases

Saturday, January 1, 2022

Bandsaw 0.3 and Multimeter 0.1

Happy new year, everyone! We celebrate the occasion by announcing two new releases. Over the past month we worked on functionality that gives better insight into workflows by collecting metrics, logging, and tracking individual runs. All this work culminated in two new releases:

Measure resource usage of your Python code using “Multimeter”

In December we published a new library, “Multimeter”, which makes it easy to collect run-time metrics such as CPU usage, memory consumption, or I/O operations of Python code. The collected data can be pushed automatically to time series databases like InfluxDB and visualized with Grafana.

Visualization of metrics

A more thorough description of the library can be found at our documentation hub.
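
To give a rough idea of what "run-time metrics" means in practice, here is a minimal sketch that measures wall time, CPU time and peak memory of a function call using only the Python standard library. It does not use or imitate the Multimeter API; the helper `measure` and the metric names are made up for this illustration.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func and return (result, metrics). Hypothetical helper, stdlib only."""
    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        result = func(*args, **kwargs)
    finally:
        # Collect the measurements even if func raises.
        cpu_seconds = time.process_time() - cpu_start
        wall_seconds = time.perf_counter() - wall_start
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    metrics = {
        "wall_seconds": wall_seconds,
        "cpu_seconds": cpu_seconds,
        "peak_memory_bytes": peak_bytes,
    }
    return result, metrics


def build_squares(n):
    """Example workload that allocates a list of n integers."""
    return [i * i for i in range(n)]


result, metrics = measure(build_squares, 100_000)
print(metrics)
```

A library like Multimeter goes further than this one-off measurement, for example by storing the collected values in a time series database as described above.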

Track your workflows and their resources

Building on “Multimeter”, version 0.3 of our library “Bandsaw” brings new features that allow you to track runs of your workflows and the computation resources consumed by their individual tasks. This makes it possible to compare individual runs and identify the tasks that are bottlenecks in a workflow. The collected metrics and the log messages for specific executions are stored for later inspection. Here is a list of the major new additions:

  • New TrackerExtension to keep track of workflow runs, executions of tasks, their results and attached files.
  • New MetricsAdvice that gathers metrics when executing tasks
  • LoggingAdvice adds per-session log file as attachment
  • JsonFormatter for storing log items in a structured format
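
To illustrate the idea of storing log items in a structured format, here is a minimal sketch of a JSON formatter built on Python's standard `logging` module. It is not bandsaw's JsonFormatter; the class name `SketchJsonFormatter` and the chosen fields are assumptions made for this example.

```python
import json
import logging


class SketchJsonFormatter(logging.Formatter):
    """Hypothetical formatter that renders each log record as one JSON object."""

    def format(self, record):
        item = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(item)


# Build a record by hand so the example runs without configuring handlers.
record = logging.LogRecord(
    "workflow.task", logging.INFO, "example.py", 1, "finished %s", ("task-1",), None
)
line = SketchJsonFormatter().format(record)
print(line)
```

One JSON object per line keeps the log file easy to filter by field later, which is the main advantage over free-form text messages.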

As always, the new version has been published on PyPI, and running pip install bandsaw will give you the latest version. The documentation has been updated as well.

Our next plans

Now in January we will focus on the last big missing piece of our fundamental libraries, the data management library.

  • 0.4: Support for tracking data.
  • 0.5: Run tasks asynchronously using a scheduler.

If you like what we are doing or if you have questions, please join us on Discord.






Bandsaw now with SSH support

Friday, November 12, 2021

Shortly after our initial release, we are proud to announce the next milestone for our workflow library Bandsaw. With today’s release of version 0.2, we now provide the first advice that is capable of transferring tasks to remote machines and running them there. This enables our users to use bandsaw in real-world processes that depend on platforms other than the developers’ own workstations.

Bandsaw allows defining multiple remote interpreters in its configuration and provides an easy way of distributing tasks to these remote machines. The tasks and their data, including the code necessary to run them, are distributed using the SSH command line tools underneath, so bandsaw doesn’t require new Python dependencies. This allows bandsaw to automate parts of the process that previously had to be done manually, such as:

  • Packaging and transferring code to the remote machine
  • Logging in to the remote system and running the task
  • Copying input and output data of the task between the developer machine and the computation platform
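
For readers who have never done these steps by hand, the following sketch builds the kind of `scp` and `ssh` command lines a developer would otherwise type manually. It only constructs and prints the commands; nothing is executed. It is not a description of how bandsaw works internally, and the host name, file paths and helper functions are all hypothetical.

```python
import shlex


def copy_command(local_path, host, remote_path):
    """Return the scp argv that copies a local file to the remote host."""
    return ["scp", local_path, f"{host}:{remote_path}"]


def run_command(host, interpreter, script_path):
    """Return the ssh argv that runs a script with the given remote interpreter."""
    remote = f"{shlex.quote(interpreter)} {shlex.quote(script_path)}"
    return ["ssh", host, remote]


HOST = "user@compute.example"  # hypothetical remote machine

steps = [
    # 1. Transfer the packaged code to the remote machine.
    copy_command("task.py", HOST, "/tmp/task.py"),
    # 2. Log in and run the task with the remote interpreter.
    run_command(HOST, "/usr/bin/python3", "/tmp/task.py"),
    # 3. Copy the output data back to the developer machine.
    ["scp", f"{HOST}:/tmp/result.pickle", "result.pickle"],
]

for step in steps:
    print(shlex.join(step))
```

To actually run such a step, each list could be passed to `subprocess.run(step, check=True)` on a machine with the OpenSSH client tools installed.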

The new version has been published on PyPI, so running pip install bandsaw should give you the latest version. We updated the documentation and included instructions on how to use the new SSH feature.

With SSH support ready, we now continue with our next two milestones:

  • 0.3: Track computation resource metrics.
  • 0.5: Run tasks asynchronously using a scheduler.

Besides, we are currently working on an additional product that we plan to deliver by the end of the year. Stay tuned! As always, feel free to reach out by opening a feature request, sending us an email, or dropping by for support or a chat on Discord.




First release of new library "Bandsaw"

Friday, October 29, 2021

Finally, we got the first version of one of our products out the door. Yesterday we successfully built and published version 0.1 of our new library “Bandsaw”.

Bandsaw will be the foundation of our products. It is a Python library for splitting up a Python workflow into individual tasks that can be run separately and on different platforms. The idea behind it is that breaking a complete workflow into smaller pieces makes it possible to optimize each task independently of the others.

Our initial release is not feature complete yet, but it should give some idea of the direction we are heading. With 0.1 the library already supports workflows that run on a single machine. Here is a short list of features that are already available:

  • Splitting workflow into smaller tasks
  • Adding additional logging to each task
  • Caching individual task results in the file system to speed up repeated runs
  • Running tasks in subprocesses for better isolation
  • Using different Python interpreters per task, which allows combining different frameworks with conflicting dependencies
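
To make the caching feature in the list above more concrete, here is a minimal sketch of caching task results in the file system. It is not bandsaw's implementation or API; the decorator `cached_task`, the cache location and the key scheme are assumptions chosen to keep the example short.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path

# Hypothetical cache location; a real setup would use a fixed directory
# so that results survive between separate runs of the workflow.
CACHE_DIR = Path(tempfile.mkdtemp(prefix="task-cache-"))


def cached_task(func):
    """Store the pickled result of func on disk, keyed by name and arguments."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key_data = pickle.dumps((func.__qualname__, args, sorted(kwargs.items())))
        cache_file = CACHE_DIR / (hashlib.sha256(key_data).hexdigest() + ".pickle")
        if cache_file.exists():
            # Cache hit: skip the computation and load the stored result.
            return pickle.loads(cache_file.read_bytes())
        result = func(*args, **kwargs)
        cache_file.write_bytes(pickle.dumps(result))
        return result

    return wrapper


calls = []  # records how often the task body actually runs


@cached_task
def expensive_square(x):
    calls.append(x)
    return x * x


first = expensive_square(12)
second = expensive_square(12)  # served from the cache file
print(first, second, len(calls))
```

The second call returns the stored value without running the task body again, which is what speeds up repeated runs of an unchanged workflow.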

The package has been published on PyPI, so installing it is as easy as running pip install bandsaw. Additionally, we published documentation including a detailed user guide that explains the inner workings of bandsaw. For those of you who want to “use the source”, the code can be found in our GitLab repository. Enjoy! The code is released under the MIT license, so there shouldn’t be any problems with trying it out.

Now, what’s next for bandsaw? There is still some way to go until we are feature complete and can release our version 1.0, but we think that it won’t take us too long with the foundations being in place. Our preliminary roadmap looks like this:

  • 0.2: Support for remote execution of tasks using SSH.
  • 0.3: Track computation resource metrics.
  • 0.5: Run tasks asynchronously using a scheduler.

So much for now and our plans. Have fun, and don’t hesitate to reach out with questions, ideas, or any kind of feedback.