“…Kedro-Viz is a way for us to have a meeting of minds and be able to problem solve with each other. It helps me have conversations about what’s in the code and how it’s organised...”
“Kedro-Viz is incredibly valuable for demos even if only used briefly. Without Viz, people would be spending hours preparing bespoke presentations to walk through a pipeline”
These are recurring examples of the value proposition of Kedro-Viz. It enables users to understand data pipelines and connected datasets, fosters collaboration during data modelling between technical and non-technical team stakeholders, and helps the team lead present updates on a project.
Until now, without a technical or “translator” team member, there was no straightforward way to make Kedro-Viz accessible to non-technical users on a project team. The only options were to screenshot a PNG version of a Kedro-Viz pipeline, make a GIF or create a hosted Kedro-Viz. The first two options meant that users lost the interactivity of Kedro-Viz, and the final option required infrastructure setup and coding.
Recent research by our team indicated that almost half of Kedro users wanted to share a version of their pipeline visualisation for others to explore. Crucial to the use case is the ability to share with non-technical stakeholders who cannot make a copy of the Kedro project, then install and run Kedro-Viz to see the pipeline.
We recently found a way to address this pain point and launched a way to publish and share a visualisation in Kedro-Viz 6.6.0. The new feature enables users to share their pipeline visualisation with other stakeholders by hosting a Kedro-Viz project on Amazon S3. Team members and senior stakeholders can now view, explore and interrogate updates of the project visualisation to provide feedback to the team.
How to use publish and share
You can host your Kedro-Viz project on Amazon S3. You must first create an S3 bucket and credentials, and then enable static website hosting.
Update and install the dependencies
Kedro-Viz requires specific minimum versions of fsspec[s3] and kedro to publish your project. You can ensure you have these correct versions by updating the requirements.txt file in the src folder of the Kedro project to the following:
fsspec[s3]>=2023.9.0
kedro>=0.18.2
Install the dependencies from the project root directory by typing the following in your terminal:
pip install -r src/requirements.txt
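If you want to confirm that the install picked up the minimum versions listed above, a small check along these lines may help. It is only a sketch: the helper names (as_tuple, meets_minimum, check_requirements) are made up for this example, the version comparison is deliberately simple, and it does not verify that the s3 extra of fsspec was installed.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile

# Minimum versions quoted in this post.
MINIMUMS = {"fsspec": "2023.9.0", "kedro": "0.18.2"}


def as_tuple(version_string):
    """Turn '2023.9.0' into (2023, 9, 0), keeping only leading digits of each part."""
    parts = []
    for piece in version_string.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break  # stop at the first part with no leading number
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return as_tuple(installed) >= as_tuple(minimum)


def check_requirements(minimums=MINIMUMS):
    """Return {package: problem} for every requirement that is not met."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not meets_minimum(installed, minimum):
            problems[name] = f"{installed} is older than {minimum}"
    return problems


if __name__ == "__main__":
    print(check_requirements() or "All minimum versions are satisfied.")
```

Run it with the same Python environment you use for the Kedro project, since it inspects whatever packages are installed there.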
Configure your AWS S3 bucket and set credentials
First, create an S3 bucket and enable static website hosting. To do so, follow the AWS tutorial to configure a static website on Amazon S3. Once the S3 bucket is created, you'll need to create an Identity and Access Management (IAM) user account and user group, and generate the corresponding access keys:
Sign in to the AWS Management Console and create an IAM user account. For more information, see the official AWS documentation about IAM Identities.
Create a user group from the IAM dashboard, ensuring the user group has a policy that grants full access to Amazon S3. For more information, see the official AWS documentation about IAM user groups.
Add the IAM user to the user group (this is only possible once the group has been created).
Select the user, then select Create access key. Follow the steps and create your keys.
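The bucket creation and static website hosting steps can also be scripted rather than clicked through in the console. The sketch below uses boto3, which this post does not otherwise mention, so treat it as one possible approach and not the documented route. It assumes boto3 is installed, that credentials with permission to create buckets are already available to it, and that index.html is the right index document for your site; the function names are invented for this example.

```python
def website_configuration(index_document="index.html"):
    """Build the payload that turns on S3 static website hosting.

    The index document name is an assumption; adjust it if your site differs.
    """
    return {"IndexDocument": {"Suffix": index_document}}


def create_static_site_bucket(bucket_name, region):
    """Create a bucket in the given region and enable static website hosting."""
    import boto3  # imported here so the helper above works without boto3

    s3 = boto3.client("s3", region_name=region)
    if region == "us-east-1":
        # us-east-1 is the default region and must not be sent as a constraint.
        s3.create_bucket(Bucket=bucket_name)
    else:
        s3.create_bucket(
            Bucket=bucket_name,
            CreateBucketConfiguration={"LocationConstraint": region},
        )
    s3.put_bucket_website(
        Bucket=bucket_name,
        WebsiteConfiguration=website_configuration(),
    )
```

Calling create_static_site_bucket("my-kedro-viz-bucket", "eu-west-2") would attempt both steps; the bucket name and region here are placeholders. The IAM user, group and access key steps are left to the console as described above.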
Once that's completed, you'll need to set your AWS credentials as environment variables in your terminal window, as shown below:
export AWS_ACCESS_KEY_ID="your_access_key_id"
export AWS_SECRET_ACCESS_KEY="your_secret_access_key"
For more information, see the official AWS documentation about how to work with credentials.
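Environment variables set with export only exist in the terminal session where you ran the commands, so it can be worth checking that they are visible before you start Kedro-Viz. A minimal sketch, with a made-up helper name, might look like this:

```python
import os

# The two variables this post asks you to export.
REQUIRED_VARIABLES = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(environ=None):
    """Return the names of required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARIABLES if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Both AWS credential variables are set.")
```

The check only confirms the variables are present. It does not contact AWS, so it cannot tell you whether the keys are valid.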
Publish and share the project
You're now ready to publish and share your Kedro-Viz project. Start Kedro-Viz by running the following command in your terminal:
kedro viz
Click the Publish and share icon in the lower-left of the application. You will see a modal dialog to select your relevant AWS Bucket Region and enter your Bucket Name. Once those two details are complete, click Publish. A hosted, shareable URL will be returned to you after the process completes. Here's an example of the flow:
Permissions and access control
All permissions and access control are managed through AWS. It's up to you, the user, whether you allow anyone to see your project or limit access to certain IP addresses, users, or groups. You can control who can view your visualisation using bucket and user policies or access control lists. See the official AWS documentation for more information.
You pay for storing objects in your S3 buckets. The amount you pay depends on the size of your objects, how long you store them during the month, and the storage class. See the official AWS documentation for more information.
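As an illustration of limiting access to certain IP addresses, the sketch below builds a bucket policy that allows read access only from one address range. It is an assumption-laden example and not a recommended policy: the bucket name and CIDR range are placeholders, the function name is invented, and your bucket's other settings (for example, its public access settings) may also affect whether the policy takes effect.

```python
import json


def ip_restricted_read_policy(bucket_name, allowed_cidr):
    """Build a bucket policy allowing object reads only from one IP range."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadFromTrustedRange",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                # Applies to every object in the bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                # Only requests from this range match the Allow statement.
                "Condition": {"IpAddress": {"aws:SourceIp": allowed_cidr}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values; replace with your bucket and network range.
    policy = ip_restricted_read_policy("my-kedro-viz-bucket", "203.0.113.0/24")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the bucket policy editor in the AWS console. Review it against the AWS documentation before applying it to a real bucket.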
Hosting responsibilities
As Kedro-Viz is an open-source application, the team behind the development of this feature needed to make some technical tradeoffs. For example, from the outset, the answer to the question, “Who’s in charge of hosting the application, the Kedro team or the user?”, was the latter. As we are limited in scope, the Kedro-Viz team could not take on the financial costs, maintenance overhead, or other responsibilities of hosting.
The next tradeoff was how to enable a user to host Kedro-Viz. We wanted users to be able to easily publish and share their instance of the app from the Kedro-Viz UI, which led us to explore software development kits from the major cloud computing players (Amazon’s AWS, Google’s GCP, and Microsoft’s Azure). We knew from our own user research that a large share of our users already used AWS elsewhere in their projects and, further, we as a team had previously built the collaborative experiment tracking feature using AWS infrastructure.
So, with this knowledge, we chose AWS as the hosting provider for our users to publish and share Kedro-Viz. The feature enables hosting via Simple Storage Service (a.k.a. S3), and users maintain full control over the setup, access and configuration of their published Kedro-Viz project.
Summary
Publish and Share Kedro-Viz enables users to easily share their pipeline visualisation with other stakeholders by hosting their Kedro-Viz project on Amazon S3. It facilitates collaboration and project debugging amongst all stakeholders, and the onboarding of new team members.
This article described this feature, its setup, and development. You can learn more from our documentation. The next step for us is to extend the feature to enable sharing via the command line, and offer the option to deploy onto GitHub Pages and other platforms (beyond AWS).
Find out more about Kedro-Viz
Kedro-Viz is an interactive development tool for building and visualising data science pipelines with Kedro. It enables you to monitor the status of your ML project, present it to stakeholders, and easily bring new team members up to speed. You can try it out using our hosted demo.
If you are new to Kedro-Viz you can learn more about the product with this video tutorial series.