Blog

What's happening at Kantai? A collection of news and random topics related to our work.

Our first "real" (paid) product: Glossa.cc

Tuesday, March 15, 2022

Embedded comments for everyone

Hi everyone, and sorry for the silence over the last few weeks.

What started as a simple feature we needed ourselves grew into its own product. When looking for the easiest way to let users comment on the content of our reports, we didn’t find any good solutions that were easy to integrate. So we started to prototype a solution of our own, and here we are, after 6 weeks of work, announcing our first SaaS product:

Glossa.cc - An easy way to integrate user comments on arbitrary content

Today, we are happy to announce the launch of glossa.cc, our new service that lets anyone add content commenting to any web page, just by including 3 lines of HTML code.

We are really excited about this product because it contains a LOT of firsts that were necessary to bring it to life.

It’s the first product we actually sell, so we had to set up a whole new backend infrastructure for billing and automatic subscription management. Technology-wise, it’s our first project that relies solely on serverless services, resulting in a 100% serverless architecture. It is also the first that contains a web-based user frontend, and the first that is not focused on machine learning at all but can be used by people and businesses for many different purposes.

Glossa.cc comment dialog

If all of this sounds interesting to you, here are some things you can do:

  1. Head over to glossa.cc and learn more about Glossa.
  2. Take a look at our demo.glossa.cc page, which allows you to try out how it integrates into different types of web pages.
  3. Create a trial account at admin.glossa.cc and integrate it into your own web pages to get a feeling for whether it is a good fit for your needs.
  4. And most importantly: Reach out to us! Be it via LinkedIn, Twitter, email or our Discord. We LOVE to get your feedback on this!

What is coming next?

Glossa and its capabilities sparked a lot of new ideas and opened the door to more services and potential products that we believe could help people. Therefore, we will try a couple of things over the next weeks and see which ideas might be worthwhile and which are just pipe dreams.

In parallel, we will take the feedback we get for Glossa, improve and polish a few things that we left on the table, and then think about which new features we should integrate. High on our own priority list is a real-time update for new comments, but we’ll see if this is something that people would love, too.

Thanks for all your interest and support. We would love to hear your opinion.

Cheers,

Christoph

Keep your data in order by using Boxs

Thursday, February 3, 2022

Boxs 0.1 joins the ranks of Kantai's Open-Source libraries

Good news everyone!

With the first release of our data management library “Boxs”, we are happy to complete our set of foundational libraries for creating machine learning processes.

Boxs helps you organize the data and artifacts that are created in your workflow. No more constantly changing file paths and S3 keys, and no more sprinkling your code with functions that write values to files or upload them to the cloud. Boxs takes care of this with its simple API. Define your own set of boxes and put related values together in the same box. All artifacts are tracked with their lineage and across multiple runs of the same script. A command line interface lets you inspect your data easily and even compare the same data item across different runs.

Organization of data across multiple boxes and runs
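To give a rough feeling for the idea of boxes and runs, here is a minimal sketch in plain Python. It does not use the Boxs API: the class `Box` and its methods `store`, `load` and `runs` are hypothetical names invented for this illustration, and lineage tracking is left out entirely.

```python
import json
import tempfile
from pathlib import Path


class Box:
    """Hypothetical stand-in for a box: a named directory holding values per run."""

    def __init__(self, root, name):
        self.path = Path(root) / name

    def store(self, run_id, key, value):
        # One sub-directory per run keeps values from different runs apart.
        target = self.path / run_id
        target.mkdir(parents=True, exist_ok=True)
        (target / f"{key}.json").write_text(json.dumps(value))

    def load(self, run_id, key):
        return json.loads((self.path / run_id / f"{key}.json").read_text())

    def runs(self, key):
        # All runs that stored a value under this key, handy for comparisons.
        return sorted(p.parent.name for p in self.path.glob(f"*/{key}.json"))


# Store the same item in two runs of the same (imaginary) script.
root = tempfile.mkdtemp()
metrics = Box(root, "metrics")
metrics.store("run-1", "accuracy", 0.91)
metrics.store("run-2", "accuracy", 0.94)
print(metrics.runs("accuracy"))
```

The point of the sketch is only that the calling code never builds file paths itself; it names a box, a run and a key, and the storage layout stays an implementation detail.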

The new library can easily be integrated into Bandsaw, our tool for breaking up a process into individual parts.

A more thorough description of the library can be found at our documentation hub.

What is coming next?

Boxs is currently limited to storing data in the file system, so it requires a distributed file system when used by a distributed process. To remove this limitation, the next version of Boxs will include storage implementations that use cloud storage services instead, so that workflows running in different regions or across different cloud platforms are supported, too.

Additionally, work has begun on our first commercial product, a service that lets you monitor your processes and discover optimization potential. Stay tuned!

Cheers,

Christoph

References

Boxs

Happy New Year 2022 with two new releases

Saturday, January 1, 2022

Bandsaw 0.3 and Multimeter 0.1

Happy new year, everyone! We celebrate the occasion by announcing 2 new releases. Over the past month we worked on functionality that improves insight into workflows by collecting metrics, logging and tracking individual runs. All this work culminated in two new releases:

Measure resource usage of your Python code using “Multimeter”

In December we published a new library, “Multimeter”, which makes it easy to collect run-time metrics like CPU usage, memory consumption or I/O operations of Python code. The collected data can automatically be pushed to time series databases like InfluxDB and visualized with Grafana.

Visualization of metrics
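To illustrate what collecting run-time metrics around a piece of Python code can look like, here is a small sketch that uses only the standard library. It is not the Multimeter API: the helper `measure` is a hypothetical name, it only covers CPU time, wall time and memory allocated by Python itself, and it does not push anything to a database.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func and return (result, metrics) for that single call."""
    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        result = func(*args, **kwargs)
    finally:
        cpu_seconds = time.process_time() - cpu_start
        wall_seconds = time.perf_counter() - wall_start
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    metrics = {
        "cpu_seconds": cpu_seconds,
        "wall_seconds": wall_seconds,
        "peak_memory_bytes": peak_bytes,
    }
    return result, metrics


result, metrics = measure(sum, range(1000))
print(result, sorted(metrics))
```

A real collector would sample such values continuously while the code runs instead of taking one measurement per call, which is what makes a time series visualization possible.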

A more thorough description of the library can be found at our documentation hub.

Track your workflows and their resources

Building upon “Multimeter”, version 0.3 of our library “Bandsaw” brings new features that let you track runs of your workflows and the computation resources consumed by their individual tasks. This makes it possible to compare individual runs and identify tasks that are bottlenecks in the workflow. The collected metrics and the log messages for specific executions are stored for later inspection. Here is a list of the major new additions:

  • New TrackerExtension to keep track of workflow runs, executions of tasks, their results and attached files.
  • New MetricsAdvice that gathers metrics when executing tasks
  • LoggingAdvice adds per-session log file as attachment
  • JsonFormatter for storing log items in a structured format
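As an illustration of the last item, storing log items in a structured format can be done with a custom formatter for Python's standard logging module. The sketch below is not Bandsaw's JsonFormatter; the class name `SimpleJsonFormatter` and the chosen fields are assumptions made for this example.

```python
import json
import logging


class SimpleJsonFormatter(logging.Formatter):
    """Hypothetical formatter that renders each log record as one JSON object."""

    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


# Build a record by hand so the example does not depend on handler setup.
record = logging.LogRecord(
    "demo", logging.INFO, "demo.py", 1, "task %s finished", ("train",), None
)
line = SimpleJsonFormatter().format(record)
print(line)
```

Because every line is a complete JSON object, a log file written this way can be parsed and filtered per field later on instead of being searched as free text.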

As always, the new version has been published on PyPI, and running pip install bandsaw will give you the latest version. The documentation has been updated as well.

Our next plans

Now in January we will focus on the last big missing piece of our fundamental libraries, the data management library.

  • 0.4: Support for tracking data.
  • 0.5: Run tasks asynchronously using a scheduler.

If you like what we are doing or if you have questions, please join us on Discord.

Cheers,

Christoph

References

Bandsaw

Multimeter

Bandsaw now with SSH support

Friday, November 12, 2021

Shortly after our initial release, we are proud to announce the next milestone for our workflow library Bandsaw. With today’s release of version 0.2, we now provide the first advice that is capable of transferring tasks to remote machines and running them there. This enables our users to use Bandsaw in real-world processes that depend on platforms other than the developers’ workstations.

Bandsaw allows defining multiple remote interpreters in its configuration and provides an easy way of distributing tasks to these remote machines. The tasks and their data, including the code necessary to run them, are distributed using the SSH command line tools underneath, so Bandsaw doesn’t require new Python dependencies. This allows Bandsaw to automate parts of the process that would previously have been done manually, such as:

  • Packaging and transferring code to the remote machine
  • Logging in to the remote system and running the task
  • Copying input and output data of the task between the developer machine and the computation platform
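The manual steps listed above roughly correspond to three calls to the SSH command line tools. The following sketch only builds those commands as argument lists and does not run them. It is not how Bandsaw implements the feature: the function name, the host, all paths and the result file name `result.pkl` are hypothetical values chosen for illustration.

```python
import shlex


def build_remote_commands(host, archive, remote_dir, interpreter, entry_point):
    """Return the upload, run and download commands as argument lists."""
    # 1. Copy the packaged code and input data to the remote machine.
    upload = ["scp", archive, f"{host}:{remote_dir}/"]
    # 2. Run the task remotely; quoting protects paths that contain spaces.
    remote_command = (
        f"cd {shlex.quote(remote_dir)} && "
        f"{shlex.quote(interpreter)} {shlex.quote(entry_point)}"
    )
    run = ["ssh", host, remote_command]
    # 3. Copy the (assumed) result file back to the developer machine.
    download = ["scp", f"{host}:{remote_dir}/result.pkl", "."]
    return upload, run, download


upload, run, download = build_remote_commands(
    "worker@example.org", "task.tar.gz", "/tmp/job", "python3", "task.py"
)
for command in (upload, run, download):
    print(" ".join(command))
```

Each list could be passed to subprocess.run(command, check=True) to actually execute it, which is why no additional Python dependencies are needed for this approach.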

The new version has been published on PyPI, so running pip install bandsaw should give you the latest version. We updated the documentation and included instructions on how to use the new SSH feature.

With SSH support ready, we now continue with our next two milestones:

  • 0.3: Track computation resource metrics.
  • 0.5: Run tasks asynchronously using a scheduler.

In addition, we are currently working on another product that we plan to deliver by the end of the year. Stay tuned! As always, feel free to reach out by opening a feature request, sending us an email or dropping by for support or a chat on Discord.

Cheers,

Christoph

First release of new library "Bandsaw"

Friday, October 29, 2021

Finally, we got our first version of one of our products out of the door. Yesterday we successfully built and published version 0.1 of our new library “Bandsaw”.

Bandsaw will be the foundation of our products. It is a Python library for splitting up a Python workflow into individual tasks that can be run separately and on different platforms. The idea behind it is that breaking a complete workflow into smaller pieces makes it possible to optimize each task independently of the others.

Our initial release is not feature complete yet, but it should give some idea of the direction we are heading. With 0.1, the library already supports workflows that run on a single machine. Here is a short list of features that are already available:

  • Splitting workflow into smaller tasks
  • Adding additional logging to each task
  • Caching individual task results in the file system to speed up repeated runs
  • Running tasks in sub processes for better isolation
  • Using different Python interpreters per task, which allows combining different frameworks with conflicting dependencies
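To make the caching feature from the list above more tangible, here is a minimal sketch of how task results can be cached in the file system. It is written in plain Python and is not Bandsaw's API: the decorator `cached_task`, the cache directory and the way the cache key is built are all assumptions for this example.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path

# Hypothetical cache location; a real setup would use a persistent directory.
CACHE_DIR = Path(tempfile.mkdtemp())


def cached_task(func):
    """Store the result of func on disk, keyed by its name and arguments."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key_source = pickle.dumps((func.__name__, args, sorted(kwargs.items())))
        cache_file = CACHE_DIR / (hashlib.sha256(key_source).hexdigest() + ".pkl")
        if cache_file.exists():
            # Cache hit: skip the computation and load the stored result.
            return pickle.loads(cache_file.read_bytes())
        result = func(*args, **kwargs)
        cache_file.write_bytes(pickle.dumps(result))
        return result

    return wrapper


calls = []


@cached_task
def square(x):
    calls.append(x)  # Records every real execution of the task.
    return x * x


print(square(4), square(4), len(calls))
```

The second call with the same argument is answered from the cache file, so the task body runs only once. This simple key works for arguments that pickle deterministically; a robust implementation would also have to account for changes to the task's code.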

The package has been published on PyPI, so installing it is as easy as running pip install bandsaw. Additionally, we published documentation including a detailed user guide that explains the inner workings of Bandsaw. For all of you who want to “use the source”, the code can be found in our GitLab repository. Enjoy! The code is released under the MIT license, so there shouldn’t be any problems with trying it out.

Now, what’s next for bandsaw? There is still some way to go until we are feature complete and can release our version 1.0, but we think that it won’t take us too long with the foundations being in place. Our preliminary roadmap looks like this:

  • 0.2: Support for remote execution of tasks using SSH.
  • 0.3: Track computation resource metrics.
  • 0.5: Run tasks asynchronously using a scheduler.

So much for now and for our plans. Have fun, and don’t hesitate to reach out with questions, ideas or any kind of feedback.

Cheers,

Christoph
