Ever wondered what it's like to become a wizard in the realm of data labeling? Allow me to demystify how I created the most-viewed page on the Label Studio website. This story involves less battling dark lords (unless you count pesky bugs) and more revealing the magic of machine learning to an entirely new audience.

Welcome to the world of data labeling and one of its most popular open-source tools.

Picture this: you’re standing at the base of a mountain called Mount Label Studio. At the top is a solid understanding of data labeling and why it matters to the larger Machine Learning Mountain Range. But you have no idea how to even begin the climb. This was the intimidating experience many users (and potential users) were having when they visited the Label Studio website.

With the surge of interest in machine learning, there’s a rapidly expanding need for introductory content. My directive was to offer new users a friendly helping hand, with the goal of further democratizing this intimidating process.

Like a cool teacher who makes learning fun, I aimed to throw boring textbooks out the window and break down the complex jargon into an entertaining syllabus that even the most code-averse or intimidated folks would enjoy. And judging by the crowd visiting our "Getting Started with Label Studio" page, it's safe to say that goal was accomplished. This guide became the most-viewed page on our site by a long shot and held that spot months after launch.

In this step-by-step tutorial, I covered everything: what Label Studio is, its use cases, how to install it, and how to work with your first dataset. I even walked through labeling images and exporting your annotations.

My biggest win? You don’t need to be a seasoned developer or data scientist to understand it. A personal highlight for me was chatting with a community member who was learning about machine learning through ChatGPT, found Label Studio and this blog, and got it running without ever having written a single line of code.

So, whether the reader was a complete newbie just getting started or a pro developer looking to expand their horizons, this post fulfilled a crucial business need. By monitoring site metrics, open-source registrations, and conversations with people who had followed the tutorial, we could point directly to an increase in product adoption after publication. An added bonus? My teammates in customer success and support were now able to deflect more questions thanks to this resource.


Project Recap

Primary Objective: Reduce friction in the open-source usage process by creating beginner-friendly content for those new to data labeling and machine learning.

Secondary Objective: Understand the Label Studio project and identify current product friction points for entry-level users.

Results:

  • Most-viewed page on the Label Studio website
  • Identified a few bugs and “quick wins” along the way and reported them to engineering and product for further investigation and development.
  • Became an essential tutorial shared with beginners and launched the development of further education initiatives, including workshops and demonstrations.
  • Reduced support needs: team members now had an additional resource to point to when new users in primary target categories got stuck.

Skills used:

  • Technical Documentation
  • Product Feedback
  • User Experience
  • Technical Content Creation
  • Tutorial Development
  • Developer Education + Experience
  • Python
  • Docker
  • Data Cleanup

Blog Link (Archived August 2023)